hunch_net hunch_net-2006 hunch_net-2006-221 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: This is a very difficult post to write, because it is about a perennially touchy subject. Nevertheless, it is an important one which needs to be thought about carefully. There are a few things which should be understood: The system is changing and responsive. We-the-authors are we-the-reviewers, we-the-PC, and even we-the-NIPS-board. NIPS has implemented ‘secondary program chairs’, ‘author response’, and ‘double blind reviewing’ in the last few years to help with the decision process, and more changes may happen in the future. Agreement creates a perception of correctness. When any PC meets and makes a group decision about a paper, there is a strong tendency for the reinforcement inherent in a group decision to create the perception of correctness. For the many people who have been on the NIPS PC, it’s reasonable to entertain a healthy skepticism in the face of this reinforcing certainty. This post is about structural problems. What problems arise because of the structure…
sentIndex sentText
1 When any PC meets and makes a group decision about a paper, there is a strong tendency for the reinforcement inherent in a group decision to create the perception of correctness.
2 PC committee members pick reviewers for their areas.
3 Reviewers are assigned papers based on bid plus coverage (a toy sketch of such an assignment appears after this list).
4 PC members present all papers that they believe are worth considering to the other PC members, and a decision is made.
5 The number of papers assigned to individual PC members is large—perhaps 40 to 80, plus a similar number assigned as a secondary (see the attention-budget arithmetic after this list).
6 No one on the PC has seen the paper except for the primary and the secondary (if you are lucky) PC members, so decisions are made quickly based upon relatively little information.
7 NIPS is a single-track conference with 3 levels of acceptance: “Accept for an oral presentation”, “Accept for a poster with a spotlight”, and “Accept as a poster only”.
8 It’s fairly difficult to justify a paper as “of broad interest”, which is ideal for an oral presentation.
9 It’s substantially easier to justify a paper as “possibly of interest to a number of people”, which is about right for a poster spotlight.
10 Since the number of spotlights and the number of orals is similar, two effects occur: papers which are about right for spotlights become orals, and many reasonable spotlights aren’t spotlights because they don’t fit.
11 This is true even when attention is explicitly paid by the PC chair to avoiding the veto problem.
12 The fundamental problem here is that papers aren’t getting the attention that they deserve by the final decision maker.
13 Much of this has to do with inexperience—many authors are first-time paper writers.
14 Not making a decision at the PC meeting could be a real option for a small number of troublesome papers.
15 There is perhaps a week-long time gap between the PC meeting and the release of the decisions during which decisions could be double-checked.
16 At the PC meeting itself, it would be helpful to have all of the papers available to all of the members.
17 Even working within the single-track format, it’s not clear that the ratio between orals and spotlights is right.
18 Spotlights take about 1/10th the time that an oral presentation takes, and yet only 1/10th or so of the overall time is allocated to spotlight presentations.
19 Losing one oral presentation (out of about 20) would yield a significant increase in the number of spotlights, and it’s easy to imagine this would be beneficial to attendees while easing decision making (the arithmetic is sketched after this list).
20 There are two ways I can imagine for dealing with the veto effect: (1) allowing author feedback, and (2) devolving power from the PC to the reviewers.
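Sentence 3 mentions assignment based on "bid plus coverage". As a rough illustration only—this is not the actual NIPS procedure, and the reviewer names, the per-reviewer load cap, and the reviews-per-paper target below are all assumed—a greedy version might look like this:

    # Toy "bid plus coverage" assignment (illustrative, not the real NIPS
    # procedure): give each paper to its highest bidders, capping any one
    # reviewer's load so coverage is spread across the pool.
    from collections import defaultdict

    def assign(papers, reviewers, bids, reviews_per_paper=3, max_load=8):
        """bids[(reviewer, paper)] is a bid strength; missing means 0."""
        load = defaultdict(int)
        assignment = {p: [] for p in papers}
        for paper in papers:
            # Coverage: rank reviewers by how strongly they bid on this paper.
            ranked = sorted(reviewers,
                            key=lambda r: bids.get((r, paper), 0),
                            reverse=True)
            for r in ranked:
                if len(assignment[paper]) == reviews_per_paper:
                    break
                if load[r] < max_load:
                    assignment[paper].append(r)
                    load[r] += 1
        return assignment

    papers = ["p1", "p2"]
    reviewers = ["r1", "r2", "r3"]
    bids = {("r1", "p1"): 3, ("r2", "p1"): 2, ("r3", "p2"): 3}
    print(assign(papers, reviewers, bids, reviews_per_paper=2, max_load=2))
    # -> {'p1': ['r1', 'r2'], 'p2': ['r3', 'r1']}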
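Sentences 5 and 6 imply a severe attention deficit at the PC meeting. A back-of-the-envelope version follows, where the committee size and meeting length are assumed numbers; only the 40-to-80 papers-per-member figure comes from the post:

    # Rough attention budget at a PC meeting. The PC size and meeting
    # length are assumptions; the per-member load is from the post.
    pc_members = 10                 # assumed committee size
    primaries_per_member = 60       # post says 40 to 80 per member
    total_papers = pc_members * primaries_per_member
    meeting_minutes = 2 * 8 * 60    # assume a two-day, 8-hour-a-day meeting
    print(meeting_minutes / total_papers, "minutes of group attention per paper")
    # -> 1.6 minutes per paper, which is why decisions are made quickly
    #    based upon relatively little information.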
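Sentences 18 and 19 rest on a small piece of arithmetic, made explicit below. The 20-minute oral slot is an assumption; the 1/10 ratio and the count of about 20 orals come from the sentences themselves:

    # Oral vs. spotlight time trade. The slot length is assumed; the 1/10
    # ratio and the ~20 orals are quoted in the post.
    oral_minutes = 20.0
    spotlight_minutes = oral_minutes / 10   # "about 1/10th the time"
    orals = 20                              # "out of about 20"
    spotlights = 20                         # counts are similar (sentence 10)
    share = (spotlights * spotlight_minutes /
             (orals * oral_minutes + spotlights * spotlight_minutes))
    print(f"spotlight share of talk time: {share:.0%}")   # -> 9%
    print("spotlights gained by dropping one oral:",
          int(oral_minutes / spotlight_minutes))          # -> 10, a 50% increase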
simIndex simValue blogId blogTitle
same-blog 1 0.99999905 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
2 0.33344388 318 hunch net-2008-09-26-The SODA Program Committee
Introduction: Claire asked me to be on the SODA program committee this year, which was quite a bit of work. I had a relatively light load—merely 49 theory papers. Many of these papers were not on subjects that I was expert about, so (as is common for theory conferences) I found various reviewers that I trusted to help review the papers. I ended up reviewing about 1/3 personally. There were a couple of instances where I ended up overruling a subreviewer whose logic seemed off, but otherwise I generally let their reviews stand. There are some differences in standards for paper reviews between the machine learning and theory communities. In machine learning it is expected that a review be detailed, while in the theory community this is often not the case. Every paper given to me ended up with a review varying between somewhat and very detailed. I’m sure not every author was happy with the outcome. While we did our best to make good decisions, they were difficult decisions to make. For example…
3 0.22981119 437 hunch net-2011-07-10-ICML 2011 and the future
Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we’ll hear from him soon. In the meantime, Paul and Hal have posted some lists. The future: Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility that we hope to accomplish well. A part of this (perhaps the most fun part) is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI…
4 0.1991069 40 hunch net-2005-03-13-Avoiding Bad Reviewing
Introduction: If we accept that bad reviewing often occurs and want to fix it, the question is “how”? Reviewing is done by paper writers just like yourself, so a good proxy for this question is asking “How can I be a better reviewer?” Here are a few things I’ve learned by trial (and error), as a paper writer, and as a reviewer. The secret ingredient is careful thought. There is no good substitute for a deep and careful understanding. Avoid reviewing papers that you feel competitive about. You almost certainly will be asked to review papers that feel competitive if you work on subjects of common interest. But, the feeling of competition can easily lead to bad judgement. If you feel biased for some other reason, then you should avoid reviewing. For example… Feeling angry or threatened by a paper is a form of bias. See above. Double blind yourself (avoid looking at the name even in a single-blind situation). The significant effect of a name you recognize is making you pay close attention…
5 0.19490629 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept
Introduction: Last year about this time, we received a conditional accept for the searn paper, which asked us to reference a paper that was not reasonable to cite because there was strictly more relevant work by the same authors that we already cited. We wrote a response explaining this, and didn’t cite it in the final draft, giving the SPC an excuse to reject the paper, leading to unhappiness for all. Later, Sanjoy Dasgupta suggested that an alternative was to talk to the PC chair instead, as soon as you see that a conditional accept is unreasonable. William Cohen and I spoke about this by email, the relevant bit of which is: If an SPC asks for a revision that is inappropriate, the correct action is to contact the chairs as soon as the decision is made, clearly explaining what the problem is, so we can decide whether or not to over-rule the SPC. As you say, this is extra work for us chairs, but that’s part of the job, and we’re willing to do that sort of work to improve the ov…
6 0.19450437 463 hunch net-2012-05-02-ICML: Behind the Scenes
7 0.18582772 116 hunch net-2005-09-30-Research in conferences
8 0.18466169 453 hunch net-2012-01-28-Why COLT?
9 0.18437251 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
10 0.17828035 343 hunch net-2009-02-18-Decision by Vetocracy
11 0.17762142 468 hunch net-2012-06-29-ICML survey and comments
12 0.17637828 304 hunch net-2008-06-27-Reviewing Horror Stories
13 0.17277594 484 hunch net-2013-06-16-Representative Reviewing
14 0.15999943 315 hunch net-2008-09-03-Bidding Problems
15 0.15721995 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
16 0.14010425 454 hunch net-2012-01-30-ICML Posters and Scope
17 0.1265458 207 hunch net-2006-09-12-Incentive Compatible Reviewing
18 0.12513703 461 hunch net-2012-04-09-ICML author feedback is open
19 0.12100601 458 hunch net-2012-03-06-COLT-ICML Open Questions and ICML Instructions
20 0.11784915 447 hunch net-2011-10-10-ML Symposium and ICML details
simIndex simValue blogId blogTitle
same-blog 1 0.97390646 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
2 0.90305525 318 hunch net-2008-09-26-The SODA Program Committee
3 0.87813187 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
Introduction: Although I’m greatly interested in machine learning, I think it must be admitted that there is a large amount of low-quality logic being used in reviews. The problem is bad enough that sometimes I wonder if the Byzantine generals limit has been exceeded. For example, I’ve seen recent reviews where the given reasons for rejecting are: [NIPS] Theorem A is uninteresting because Theorem B is uninteresting. [UAI] When you learn by memorization, the problem addressed is trivial. [NIPS] The proof is in the appendix. [NIPS] This has been done before. (… but not giving any relevant citations) Just for the record I want to point out what’s wrong with these reviews. A future world in which such reasons never come up again would be great, but I’m sure these errors will be committed many times more in the future. This is nonsense. A theorem should be evaluated based on its merits, rather than the merits of another theorem. Learning by memorization requires an exponential…
4 0.86028075 463 hunch net-2012-05-02-ICML: Behind the Scenes
Introduction: This is a rather long post, detailing the ICML 2012 review process. The goal is to make the process more transparent, help authors understand how we came to a decision, and discuss the strengths and weaknesses of this process for future conference organizers. Microsoft’s Conference Management Toolkit (CMT): We chose to use CMT over other conference management software mainly because of its rich toolkit. The interface is sub-optimal (to say the least!) but it has extensive capabilities (to handle bids, author response, resubmissions, etc.), good import/export mechanisms (to process the data elsewhere), excellent technical support (to answer late night emails, add new functionalities). Overall, it was the right choice, although we hope a designer will look at that interface sometime soon! Toronto Matching System (TMS): TMS is now being used by many major conferences in our field (including NIPS and UAI). It is an automated system (developed by Laurent Charlin and Rich Ze…
5 0.84513158 304 hunch net-2008-06-27-Reviewing Horror Stories
Introduction: Essentially everyone who writes research papers suffers rejections. They always sting immediately, but upon further reflection many of these rejections come to seem reasonable. Maybe the equations had too many typos or maybe the topic just isn’t as important as was originally thought. A few rejections do not come to seem acceptable, and these form the basis of reviewing horror stories, great material for conversations. I’ve decided to share three of mine, now all safely a bit distant in the past. Prediction Theory for Classification Tutorial. This is a tutorial about tight sample complexity bounds for classification that I submitted to JMLR. The first decision I heard was a reject which appeared quite unjust to me—for example, one of the reviewers appeared to claim that all the content was in standard statistics books. Upon further inquiry, several citations were given, none of which actually covered the content. Later, I was shocked to hear the paper was accepted. App…
6 0.84026712 484 hunch net-2013-06-16-Representative Reviewing
7 0.83302021 315 hunch net-2008-09-03-Bidding Problems
8 0.81413954 437 hunch net-2011-07-10-ICML 2011 and the future
9 0.79995197 461 hunch net-2012-04-09-ICML author feedback is open
10 0.78693342 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept
11 0.75401425 38 hunch net-2005-03-09-Bad Reviewing
12 0.72485656 343 hunch net-2009-02-18-Decision by Vetocracy
13 0.72469997 207 hunch net-2006-09-12-Incentive Compatible Reviewing
14 0.71080637 40 hunch net-2005-03-13-Avoiding Bad Reviewing
15 0.69510102 52 hunch net-2005-04-04-Grounds for Rejection
16 0.67888814 453 hunch net-2012-01-28-Why COLT?
17 0.66822368 468 hunch net-2012-06-29-ICML survey and comments
18 0.66687298 116 hunch net-2005-09-30-Research in conferences
19 0.66674948 180 hunch net-2006-05-21-NIPS paper evaluation criteria
20 0.66632956 98 hunch net-2005-07-27-Not goal metrics
simIndex simValue blogId blogTitle
1 0.98809361 81 hunch net-2005-06-13-Wikis for Summer Schools and Workshops
Introduction: Chicago ’05 ended a couple of weeks ago. This was the sixth Machine Learning Summer School, and the second one that used a wiki. (The first was Berder ’04, thanks to Gunnar Raetsch.) Wikis are relatively easy to set up, greatly aid social interaction, and should be used a lot more at summer schools and workshops. They can even be used as the meeting’s webpage, as a permanent record of its participants’ collaborations — see for example the wiki/website for last year’s NVO Summer School. A basic wiki is a collection of editable webpages, maintained by software called a wiki engine. The engine used at both Berder and Chicago was TikiWiki — it is well documented and gets you something running fast. It uses PHP and MySQL, but doesn’t require you to know either. Tikiwiki has far more features than most wikis, as it is really a full Content Management System. (My thanks to Sebastian Stark for pointing this out.) Here are the features we found most useful: Bulletin boards…
2 0.97745872 115 hunch net-2005-09-26-Prediction Bounds as the Mathematics of Science
Introduction: “Science” has many meanings, but one common meaning is “the scientific method”, which is a principled method for investigating the world using the following steps: Form a hypothesis about the world. Use the hypothesis to make predictions. Run experiments to confirm or disprove the predictions. The ordering of these steps is very important to the scientific method. In particular, predictions must be made before experiments are run. Given that we all believe in the scientific method of investigation, it may be surprising to learn that cheating is very common. This happens for many reasons, some innocent and some not. Drug studies. Pharmaceutical companies make predictions about the effects of their drugs and then conduct blind clinical studies to determine their effect. Unfortunately, they have also been caught using some of the more advanced techniques for cheating here: including “reprobleming”, “data set selection”, and probably “overfitting by review”…
3 0.96403807 346 hunch net-2009-03-18-Parallel ML primitives
Introduction: Previously, we discussed parallel machine learning a bit. As parallel ML is rather difficult, I’d like to describe my thinking at the moment, and ask for advice from the rest of the world. This is particularly relevant right now, as I’m attending a workshop tomorrow on parallel ML. Parallelizing slow algorithms seems uncompelling. Parallelizing many algorithms also seems uncompelling, because the effort required to parallelize is substantial. This leaves the question: Which one fast algorithm is the best to parallelize? What is a substantially different second? One compellingly fast simple algorithm is online gradient descent on a linear representation (a minimal sketch of this primitive appears at the end of this page). This is the core of Leon’s sgd code and Vowpal Wabbit. Antoine Bordes showed a variant was competitive in the large scale learning challenge. It’s also a decades-old primitive which has been reused in many algorithms, and continues to be reused. It also applies to online learning rather than just online optimization…
4 0.96305478 120 hunch net-2005-10-10-Predictive Search is Coming
Introduction: “Search” is the other branch of AI research which has been successful. Concrete examples include Deep Blue, which beat the world chess champion, and Chinook, the champion checkers program. A set of core search techniques exist including A*, alpha-beta pruning, and others that can be applied to any of many different search problems. Given this, it may be surprising to learn that there has been relatively little successful work on combining prediction and search. Given also that humans typically solve search problems using a number of predictive heuristics to narrow in on a solution, we might be surprised again. However, the big successful search-based systems have typically not used “smart” search algorithms. Instead they have optimized for very fast search. This is not for lack of trying… many people have tried to synthesize search and prediction to various degrees of success. For example, Knightcap achieves good-but-not-stellar chess playing performance, and TD-gammon…
5 0.95690405 42 hunch net-2005-03-17-Going all the Way, Sometimes
Introduction: At many points in research, you face a choice: should I keep on improving some old piece of technology or should I do something new? For example: Should I refine bounds to make them tighter? Should I take some learning theory and turn it into a learning algorithm? Should I implement the learning algorithm? Should I test the learning algorithm widely? Should I release the algorithm as source code? Should I go see what problems people actually need to solve? The universal temptation of people attracted to research is doing something new. That is sometimes the right decision, but is also often not. I’d like to discuss some reasons why not. Expertise. Once expertise is developed on some subject, you are the right person to refine it. What is the real problem? Continually improving a piece of technology is a mechanism forcing you to confront this question. In many cases, this confrontation is uncomfortable because you discover that your method has fundamental…
same-blog 6 0.94901192 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
7 0.94732177 35 hunch net-2005-03-04-The Big O and Constants in Learning
8 0.94682741 276 hunch net-2007-12-10-Learning Track of International Planning Competition
9 0.87747335 136 hunch net-2005-12-07-Is the Google way the way for machine learning?
10 0.87198538 229 hunch net-2007-01-26-Parallel Machine Learning Problems
11 0.84100085 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
12 0.81952488 423 hunch net-2011-02-02-User preferences for search engines
13 0.81728697 73 hunch net-2005-05-17-A Short Guide to PhD Graduate Study
14 0.81433612 75 hunch net-2005-05-28-Running A Machine Learning Summer School
15 0.81407857 146 hunch net-2006-01-06-MLTV
16 0.81377846 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning
17 0.81176996 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy
18 0.81086522 306 hunch net-2008-07-02-Proprietary Data in Academic Research?
19 0.80345345 253 hunch net-2007-07-06-Idempotent-capable Predictors
20 0.80076045 178 hunch net-2006-05-08-Big machine learning
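Entry 3 in the list above (“Parallel ML primitives”) singles out online gradient descent on a linear representation as the primitive worth parallelizing. For concreteness, here is a minimal single-machine sketch of that primitive under squared loss. The learning rate and toy data are assumptions, and this is not the sgd or Vowpal Wabbit code:

    # Minimal online gradient descent on a linear representation with
    # squared loss. Illustrative sketch only -- not Leon's sgd and not
    # Vowpal Wabbit; the learning rate and data are arbitrary choices.
    def sgd_linear(examples, dim, lr=0.1):
        w = [0.0] * dim
        for x, y in examples:               # one example at a time, one pass
            pred = sum(wi * xi for wi, xi in zip(w, x))
            grad = pred - y                 # gradient of 0.5 * (pred - y)**2
            for i in range(dim):
                w[i] -= lr * grad * x[i]
        return w

    # Toy stream generated by y = 2*x0 - 1*x1; weights approach [2.0, -1.0].
    data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)] * 50
    print(sgd_linear(data, dim=2))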