hunch_net hunch_net-2007 hunch_net-2007-231 knowledge-graph by maker-knowledge-mining

231 hunch net-2007-02-10-Best Practices for Collaboration


meta info for this blog

Source: html

Introduction: Many people, especially students, haven’t had an opportunity to collaborate with other researchers. Collaboration, especially with remote people, can be tricky. Here are some observations of what has worked for me on collaborations involving a few people. Travel and Discuss. Almost all collaborations start with in-person discussion. This implies that travel is often necessary. We can hope that in the future we’ll have better systems for starting collaborations remotely (such as blogs), but we aren’t quite there yet. Enable your collaborator. A collaboration can fall apart because one collaborator disables another. This sounds stupid (and it is), but it’s far easier than you might think. Avoid Duplication. Discovering that you and a collaborator have been editing the same thing and now need to waste time reconciling changes is annoying. The best way to avoid this is to be explicit about who has write permission to what. Most of the time, a write lock is held for the entire document, just to be sure. …


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Collaboration, especially with remote people, can be tricky. [sent-2, score-0.192]

2 We can hope that in the future we’ll have better systems for starting collaborations remotely (such as blogs), but we aren’t quite there yet. [sent-6, score-0.235]

3 A collaboration can fall apart because one collaborator disables another. [sent-8, score-0.515]

4 Discovering that you and a collaborator have been editing the same thing and now need to waste time reconciling changes is annoying. [sent-11, score-0.47]

5 The best way to avoid this is to be explicit about who has write permission to what. [sent-12, score-0.255]

6 Most of the time, a write lock is held for the entire document, just to be sure. [sent-13, score-0.496]

7 Some people are perfectionists, so they have a real problem giving up the write lock on a draft until it is perfect. [sent-15, score-0.698]

8 Releasing the write lock (at least) when you sleep is a good idea. [sent-17, score-0.496]

9 Forcing your collaborator to deal with the missing subdocument problem is disabling. [sent-21, score-0.403]

10 Space and bandwidth are cheap while your collaborators’ time is precious. [sent-22, score-0.335]

11 This doesn’t mean “use version control software”, although that’s fine. [sent-25, score-0.213]

12 Instead, it means: have a version number for drafts passed back and forth. [sent-26, score-0.266]

13 This means you can talk about “draft 3” rather than “the draft that was passed last Tuesday”. [sent-27, score-0.251]

14 When deciding who should have a chance to be a coauthor, the rule should be “anyone who has helped produce a result conditioned on previous work”. [sent-32, score-0.22]

15 “Helped produce” is often interpreted too narrowly—a theoretician should be generous about crediting experimental results and vice-versa. [sent-33, score-0.325]

16 Control over who is a coauthor is best (and most naturally) exercised by the choice of who you talk to. [sent-35, score-0.227]

17 A good default for presentations at a conference is “student presents” (or suitable generalizations). [sent-40, score-0.195]

18 Senior collaborators already have plentiful alternative methods to present research at workshops or invited talks. [sent-42, score-0.289]

19 Communicate by default. Not cc’ing a collaborator is a bad idea. [sent-43, score-0.531]

20 Even if you have a very specific question for one collaborator and not another, it’s a good idea to cc everyone. [sent-44, score-0.554]
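
The scoring pipeline behind maker-knowledge-mining isn’t published, so the following is only a minimal sketch, assuming sentences are ranked by the average tfidf weight of their terms; the `score_sentences` helper and its parameters are illustrative stand-ins, not the actual implementation.

```python
# Hypothetical sketch of tfidf sentence scoring; not the actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer

def score_sentences(sentences):
    """Rank sentences by the mean tfidf weight of their nonzero terms."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(sentences)  # one row per sentence
    nonzero = (matrix != 0).sum(axis=1).A1.clip(min=1)
    scores = matrix.sum(axis=1).A1 / nonzero
    return sorted(enumerate(scores), key=lambda pair: -pair[1])

sentences = [
    "Collaboration, especially with remote people, can be tricky.",
    "Space and bandwidth are cheap while your collaborators' time is precious.",
]
for idx, score in score_sentences(sentences):
    print(f"[sent-{idx}, score-{score:.3f}]")
```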


similar blogs computed by the tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('collaborator', 0.403), ('lock', 0.303), ('coauthor', 0.227), ('collaborators', 0.227), ('write', 0.193), ('collaborations', 0.168), ('cc', 0.151), ('presents', 0.151), ('ordering', 0.147), ('draft', 0.139), ('generous', 0.134), ('default', 0.128), ('control', 0.126), ('prevents', 0.118), ('senior', 0.118), ('materials', 0.112), ('passed', 0.112), ('collaboration', 0.112), ('bandwidth', 0.108), ('travel', 0.095), ('send', 0.087), ('version', 0.087), ('helped', 0.082), ('produce', 0.075), ('stupid', 0.067), ('remotely', 0.067), ('tuesday', 0.067), ('annoyance', 0.067), ('decline', 0.067), ('drafts', 0.067), ('duplication', 0.067), ('editing', 0.067), ('practices', 0.067), ('standing', 0.067), ('strive', 0.067), ('suitable', 0.067), ('theoretician', 0.067), ('especially', 0.067), ('naturally', 0.067), ('people', 0.063), ('chance', 0.063), ('narrowly', 0.062), ('crediting', 0.062), ('permission', 0.062), ('interpreted', 0.062), ('backup', 0.062), ('files', 0.062), ('generalizations', 0.062), ('plentiful', 0.062), ('remote', 0.062)]
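
As a rough illustration of where both the word weights above and the similarity ranking below could come from, here is a minimal sketch assuming plain tfidf vectors compared by cosine similarity; the `posts` dictionary and all parameter choices are placeholders, not the real corpus or configuration.

```python
# Illustrative sketch: tfidf top words and cosine similarity between posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {  # placeholder texts standing in for the full blog corpus
    "231-best-practices-for-collaboration": "collaborator write lock draft coauthor",
    "206-np-hard-in-quadratic-time": "tournament ordering players wins arcset",
}
ids = list(posts)
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(posts.values())

# (word, weight) pairs for the first post, as in the topN-words list above.
weights = tfidf[0].toarray().ravel()
terms = vectorizer.get_feature_names_out()
top_words = sorted(zip(terms, weights), key=lambda p: -p[1])[:10]

# Cosine similarity of the first post against every post yields the
# ranked "similar blogs list" below.
sims = cosine_similarity(tfidf[0], tfidf).ravel()
ranking = sorted(zip(ids, sims), key=lambda p: -p[1])
```

Note that the same-blog entry below scores 1.0000002 rather than exactly 1.0, which is consistent with floating-point rounding in a cosine computation.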

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 231 hunch net-2007-02-10-Best Practices for Collaboration


2 0.092634365 206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time

Introduction: This title is a lie, but it is a special lie which has a bit of truth. If n players each play each other, you have a tournament. How do you order the players from weakest to strongest? The standard first attempt is “find the ordering which agrees with the tournament on as many player pairs as possible”. This is called the “minimum feedback arcset” problem in the CS theory literature and it is a well-known NP-hard problem. A basic guarantee holds for the solution to this problem: if there is some “true” intrinsic ordering, and the outcome of the tournament disagrees k times (due to noise for instance), then the output ordering will disagree with the original ordering on at most 2k edges (and no solution can be better). One standard approach to tractably solving an NP-hard problem is to find another algorithm with an approximation guarantee. For example, Don Coppersmith, Lisa Fleischer, and Atri Rudra proved that ordering players according to the number of wins is

3 0.077064633 461 hunch net-2012-04-09-ICML author feedback is open

Introduction: as of last night, late. When the reviewing deadline passed Wednesday night, 15% of reviews were still missing, much higher than I expected. Between late reviews coming in, ACs working overtime through the weekend, and people willing to help in the pinch, another ~390 reviews came in, reducing the missing mass to 0.2%. Nailing that last bit and a similar quantity of papers with uniformly low confidence reviews is what remains to be done in terms of basic reviews. We are trying to make all of those happen this week so authors have some chance to respond. I was surprised by the quantity of late reviews, and I think that’s an area where ICML needs to improve in future years. Good reviews are not done in a rush—they are done by setting aside time (like an afternoon), and carefully reading the paper while thinking about implications. Many reviewers do this well but a significant minority aren’t good at scheduling their personal time. In this situation there are several ways to fail:

4 0.076707445 414 hunch net-2010-10-17-Partha Niyogi has died

Introduction: from brain cancer. I asked Misha, who worked with him, to write about it. Partha Niyogi, Louis Block Professor in Computer Science and Statistics at the University of Chicago, passed away on October 1, 2010, aged 43. I first met Partha Niyogi almost exactly ten years ago when I was a graduate student in math and he had just started as a faculty member in Computer Science and Statistics at the University of Chicago. Strangely, we first talked at length due to a somewhat convoluted mathematical argument in a paper on pattern recognition. I asked him some questions about the paper, and, even though the topic was new to him, he had put serious thought into it and we started regular meetings. We made significant progress and developed a line of research stemming initially just from trying to understand that one paper and to simplify one derivation. I think this was typical of Partha, showing both his intellectual curiosity and his intuition for the serendipitous; having a sense and focus fo

5 0.076597877 225 hunch net-2007-01-02-Retrospective

Introduction: It’s been almost two years since this blog began. In that time, I’ve learned enough to shift my expectations in several ways. Initially, the idea was for a general-purpose ML blog where different people could contribute posts. What has actually happened is most posts come from me, with a few guest posts that I greatly value. There are a few reasons I see for this. Overload. A couple years ago, I had not fully appreciated just how busy life gets for a researcher. Making a post is not simply a matter of getting to it, but rather of prioritizing between {writing a grant, finishing an overdue review, writing a paper, teaching a class, writing a program, etc…}. This is a substantial transition away from what life as a graduate student is like. At some point the question is not “when will I get to it?” but rather “will I get to it?” and the answer starts to become “no” most of the time. Feedback failure. This blog currently receives about 3K unique visitors per day from

6 0.075864568 296 hunch net-2008-04-21-The Science 2.0 article

7 0.074629404 318 hunch net-2008-09-26-The SODA Program Committee

8 0.072286516 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy

9 0.068401739 437 hunch net-2011-07-10-ICML 2011 and the future

10 0.067568399 454 hunch net-2012-01-30-ICML Posters and Scope

11 0.066225618 295 hunch net-2008-04-12-It Doesn’t Stop

12 0.065945454 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

13 0.065187037 22 hunch net-2005-02-18-What it means to do research.

14 0.064639978 343 hunch net-2009-02-18-Decision by Vetocracy

15 0.064431772 292 hunch net-2008-03-15-COLT Open Problems

16 0.063464597 134 hunch net-2005-12-01-The Webscience Future

17 0.063230105 208 hunch net-2006-09-18-What is missing for online collaborative research?

18 0.062376238 30 hunch net-2005-02-25-Why Papers?

19 0.061175376 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

20 0.060661238 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models


similar blogs computed by the lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.151), (1, -0.047), (2, -0.017), (3, 0.048), (4, -0.037), (5, 0.031), (6, 0.014), (7, -0.009), (8, -0.0), (9, 0.022), (10, -0.016), (11, 0.003), (12, 0.023), (13, 0.036), (14, 0.026), (15, -0.012), (16, -0.009), (17, 0.042), (18, 0.006), (19, 0.054), (20, 0.006), (21, 0.06), (22, -0.032), (23, -0.001), (24, 0.04), (25, -0.023), (26, -0.016), (27, -0.011), (28, 0.007), (29, -0.009), (30, -0.03), (31, 0.0), (32, 0.068), (33, 0.023), (34, 0.015), (35, -0.005), (36, 0.037), (37, -0.037), (38, -0.015), (39, 0.068), (40, -0.01), (41, 0.1), (42, -0.001), (43, -0.009), (44, 0.036), (45, 0.073), (46, 0.003), (47, 0.041), (48, 0.036), (49, -0.02)]
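
The 50 topic weights above are consistent with latent semantic indexing, which is classically a truncated SVD applied to the tfidf matrix. The following is a hedged sketch only: the real component count, preprocessing, and library are unknown, and the tiny `corpus` is a placeholder.

```python
# Hedged LSI sketch: truncated SVD over tfidf, then cosine similarity
# in the reduced topic space. Dimensions here are toy-sized.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "collaborator write lock draft coauthor",   # stand-in for post 231
    "metric lines of code paper count",         # stand-in for post 98
    "presentation slides prioritize audience",  # stand-in for post 249
]
tfidf = TfidfVectorizer().fit_transform(corpus)
lsi = TruncatedSVD(n_components=2, random_state=0)  # the listing implies ~50
topic_weights = lsi.fit_transform(tfidf)  # one topic-weight vector per post

# Similar posts are nearest neighbours in the LSI space.
sims = cosine_similarity(topic_weights[:1], topic_weights).ravel()
```

The same-blog score of 0.95138532 below suggests the pipeline’s similarity isn’t a plain self-cosine (which would be exactly 1.0); this sketch uses ordinary cosine similarity as a stand-in.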

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95138532 231 hunch net-2007-02-10-Best Practices for Collaboration


2 0.63619471 98 hunch net-2005-07-27-Not goal metrics

Introduction: One of the confusing things about research is that progress is very hard to measure. One of the consequences of being in a hard-to-measure environment is that the wrong things are often measured. Lines of Code. The classical example of this phenomenon is the old lines-of-code-produced metric for programming. It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. Paper count. In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. Complexity. Another metric is “complexity” (in the eye of a reviewer) of a paper. There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. The failure mode here is unclean thinking. Simple effective m

3 0.61172754 249 hunch net-2007-06-21-Presentation Preparation

Introduction: A big part of doing research is presenting it at a conference. Since many people start out shy of public presentations, this can be a substantial challenge. Here are a few notes which might be helpful when thinking about preparing a presentation on research. Motivate. Talks which don’t start by describing the problem to solve cause many people to zone out. Prioritize. It is typical that you have more things to say than time to say them, and many presenters fall into the failure mode of trying to say too much. This is an easy-to-understand failure mode as it’s very natural to want to include everything. A basic fact is: you can’t. Examples of this are: Your slides are so densely full of equations and words that you can’t cover them. Your talk runs over and a moderator prioritizes for you by cutting you off. You motor-mouth through the presentation, and the information absorption rate of the audience prioritizes in some uncontrolled fashion. The rate of flow of c

4 0.59356773 73 hunch net-2005-05-17-A Short Guide to PhD Graduate Study

Introduction: Graduate study is a mysterious and uncertain process. The easiest way to see this is by noting that a very old advisor/student mechanism is preferred. There is no known successful mechanism for “mass producing” PhDs as is done (in some sense) for undergraduate and masters study. Here are a few hints that might be useful to prospective or current students based on my own experience. Masters or PhD. (a) You want a PhD if you want to do research. (b) You want a masters if you want to make money. People wanting (b) will be manifestly unhappy with (a) because it typically means years of low pay. People wanting (a) should try to avoid (b) because it prolongs an already long process. Attitude. Many students struggle for awhile with the wrong attitude towards research. Most students come into graduate school with 16-19 years of schooling where the principal means of success is proving that you know something via assignments, tests, etc… Research does not work this way. Re

5 0.58628154 29 hunch net-2005-02-25-Solution: Reinforcement Learning with Classification

Introduction: I realized that the tools needed to solve the problem just posted were just created. I tried to sketch out the solution here (also in .lyx and .tex). It is still quite sketchy (and probably only the few people who understand reductions well can follow). One of the reasons why I started this weblog was to experiment with “research in the open”, and this is an opportunity to do so. Over the next few days, I’ll be filling in details and trying to get things to make sense. If you have additions or ideas, please propose them.

6 0.58624059 288 hunch net-2008-02-10-Complexity Illness

7 0.5826512 42 hunch net-2005-03-17-Going all the Way, Sometimes

8 0.57221442 134 hunch net-2005-12-01-The Webscience Future

9 0.57204932 296 hunch net-2008-04-21-The Science 2.0 article

10 0.57200509 358 hunch net-2009-06-01-Multitask Poisoning

11 0.5673427 91 hunch net-2005-07-10-Thinking the Unthought

12 0.56025314 30 hunch net-2005-02-25-Why Papers?

13 0.55778652 22 hunch net-2005-02-18-What it means to do research.

14 0.55737811 146 hunch net-2006-01-06-MLTV

15 0.55410564 208 hunch net-2006-09-18-What is missing for online collaborative research?

16 0.55225509 233 hunch net-2007-02-16-The Forgetting

17 0.54223168 449 hunch net-2011-11-26-Giving Thanks

18 0.52949321 1 hunch net-2005-01-19-Why I decided to run a weblog.

19 0.52587855 307 hunch net-2008-07-04-More Presentation Preparation

20 0.52139246 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer


similar blogs computed by the lda model

lda for this blog:

topicId topicWeight

[(3, 0.011), (27, 0.183), (38, 0.048), (53, 0.045), (55, 0.086), (64, 0.017), (67, 0.01), (84, 0.01), (94, 0.035), (95, 0.066), (98, 0.405)]
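
The sparse (topicId, topicWeight) pairs above look like an LDA document-topic posterior with near-zero topics omitted. Purely as a sketch under assumptions (topic count, priors, and library are guesses; the `corpus` is a placeholder):

```python
# Hedged LDA sketch: document-topic weights with small topics dropped.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "collaborator write lock draft coauthor",  # stand-in for post 231
    "gradient optimization local updates",     # stand-in for post 167
    "netflix prize recommender dataset",       # stand-in for post 211
]
counts = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=5, random_state=0)  # listing implies ~100 topics
doc_topics = lda.fit_transform(counts)  # rows are topic distributions summing to 1

# Keep only topics with noticeable mass, matching the sparse listing above.
post0 = [(t, round(w, 3)) for t, w in enumerate(doc_topics[0]) if w > 0.01]
```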

similar blogs list:

simIndex simValue blogId blogTitle

1 0.92090237 167 hunch net-2006-03-27-Gradients everywhere

Introduction: One of the basic observations from the atomic learning workshop is that gradient-based optimization is pervasive. For example, at least 7 (of 12) speakers used the word ‘gradient’ in their talk and several others may be approximating a gradient. The essential useful quality of a gradient is that it decouples local updates from global optimization. Restated: Given a gradient, we can determine how to change individual parameters of the system so as to improve overall performance. It’s easy to feel depressed about this and think “nothing has happened”, but that appears untrue. Many of the talks were about clever techniques for computing gradients where your calculus textbook breaks down. Sometimes there are clever approximations of the gradient. (Simon Osindero) Sometimes we can compute constrained gradients via iterated gradient/project steps. (Ben Taskar) Sometimes we can compute gradients anyways over mildly nondifferentiable functions. (Drew Bagnell) Even give

2 0.90211761 211 hunch net-2006-10-02-$1M Netflix prediction contest

Introduction: Netflix is running a contest to improve recommender prediction systems. A 10% improvement over their current system yields a $1M prize. Failing that, the best smaller improvement yields a smaller $50K prize. This contest looks quite real, and the $50K prize money is almost certainly achievable with a bit of thought. The contest also comes with a dataset which is apparently 2 orders of magnitude larger than any other public recommendation system datasets.

3 0.90051067 322 hunch net-2008-10-20-New York’s ML Day

Introduction: I’m not as naturally exuberant as Muthu 2 or David about CS/Econ day, but I believe it and ML day were certainly successful. At the CS/Econ day, I particularly enjoyed Tuomas Sandholm’s talk which showed a commanding depth of understanding and application in automated auctions. For the machine learning day, I enjoyed several talks and posters (I better, I helped pick them.). What stood out to me was the number of people attending: 158 registered, a level qualifying as “scramble to find seats”. My rule of thumb for workshops/conferences is that the number of attendees is often something like the number of submissions. That isn’t the case here, where there were just 4 invited speakers and 30-or-so posters. Presumably, the difference is due to a critical mass of Machine Learning interested people in the area and the ease of their attendance. Are there other areas where a local Machine Learning day would fly? It’s easy to imagine something working out in the San Franci

same-blog 4 0.85185117 231 hunch net-2007-02-10-Best Practices for Collaboration


5 0.80775356 111 hunch net-2005-09-12-Fast Gradient Descent

Introduction: Nic Schraudolph has been developing a fast gradient descent algorithm called Stochastic Meta-Descent (SMD). Gradient descent is currently untrendy in the machine learning community, but there remains a large number of people using gradient descent on neural networks or other architectures from when it was trendy in the early 1990s. There are three problems with gradient descent. Gradient descent does not necessarily produce easily reproduced results. Typical algorithms start with “set the initial parameters to small random values”. The design of the representation that gradient descent is applied to is often nontrivial. In particular, knowing exactly how to build a large neural network so that it will perform well requires knowledge which has not been made easily applicable. Gradient descent can be slow. Obviously, taking infinitesimal steps in the direction of the gradient would take forever, so some finite step size must be used. What exactly this step size should be

6 0.55216807 379 hunch net-2009-11-23-ICML 2009 Workshops (and Tutorials)

7 0.47620112 343 hunch net-2009-02-18-Decision by Vetocracy

8 0.47598296 194 hunch net-2006-07-11-New Models

9 0.47247094 225 hunch net-2007-01-02-Retrospective

10 0.47195202 360 hunch net-2009-06-15-In Active Learning, the question changes

11 0.47077468 466 hunch net-2012-06-05-ICML acceptance statistics

12 0.47059307 406 hunch net-2010-08-22-KDD 2010

13 0.46903569 36 hunch net-2005-03-05-Funding Research

14 0.4689813 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

15 0.4681882 464 hunch net-2012-05-03-Microsoft Research, New York City

16 0.46745908 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

17 0.46657664 230 hunch net-2007-02-02-Thoughts regarding “Is machine learning different from statistics?”

18 0.46602559 12 hunch net-2005-02-03-Learning Theory, by assumption

19 0.46545023 220 hunch net-2006-11-27-Continuizing Solutions

20 0.46496251 370 hunch net-2009-09-18-Necessary and Sufficient Research