hunch_net hunch_net-2006 hunch_net-2006-207 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Reviewing is a fairly formal process which is integral to the way academia is run. Given this integral nature, the quality of reviewing is often frustrating. I’ve seen plenty of examples of false statements, misbeliefs, reading what isn’t written, etc…, and I’m sure many other people have as well. Recently, mechanisms like double blind review and author feedback have been introduced to try to make the process more fair and accurate in many machine learning (and related) conferences. My personal experience is that these mechanisms help, especially the author feedback. Nevertheless, some problems remain. The game theory take on reviewing is that the incentive for truthful reviewing isn’t there. Since reviewers are also authors, there are sometimes perverse incentives created and acted upon. (Incidentally, these incentives can be both positive and negative.) Setting up a truthful reviewing system is tricky because there is no final reference truth available in any acceptable (say: sub-year) timespan.
sentIndex sentText sentNum sentScore
1 Given this integral nature, the quality of reviewing is often frustrating. [sent-2, score-0.385]
2 Recently, mechanisms like double blind review and author feedback have been introduced to try to make the process more fair and accurate in many machine learning (and related) conferences. [sent-4, score-0.418]
3 My personal experience is that these mechanisms help, especially the author feedback. [sent-5, score-0.342]
4 The game theory take on reviewing is that the incentive for truthful reviewing isn’t there. [sent-7, score-0.931]
5 Since reviewers are also authors, there are sometimes perverse incentives created and acted upon. [sent-8, score-0.358]
6 ) Setting up a truthful reviewing system is tricky because there is no final reference truth available in any acceptable (say: sub-year) timespan. [sent-10, score-0.824]
7 We could try to engineer new mechanisms for finding a reference truth into a conference and then use a ‘proper scoring rule’ which is incentive compatible. [sent-12, score-1.34]
8 Consequently, the understanding of the paper at the conference is not nearly as deep as, say, after reading through it carefully in a reading group. [sent-15, score-0.362]
9 This is inherently useless for judging reviews of rejected papers and it is highly biased for judging reviews of papers presented in two different formats (say, a poster versus an oral presentation). [sent-16, score-0.794]
10 We could ignore the time issue and try to measure reviewer performance based upon (say) long term citation count. [sent-17, score-0.641]
11 Who the reviewers are and how an individual reviewer reviews may change drastically in just a 5 year timespan. [sent-19, score-0.761]
12 A system which can provide track records for only a small subset of current reviewers isn’t very capable. [sent-20, score-0.344]
13 We could try to manufacture an incentive compatible system even when the truth is never known. [sent-21, score-0.928]
14 Essentially, the scheme works by rewarding reviewer i according to a proper scoring rule applied to P(reviewer j’s score | reviewer i’s score). [sent-23, score-1.233]
15 (A simple example of a proper scoring rule is log[P(outcome)].) [sent-24, score-0.512]
16 The significant problem I see is that this mechanism may reward joint agreement instead of a good contribution towards good joint decision making. [sent-26, score-0.391]
17 None of these mechanisms are perfect, but they may each yield a little bit of information about what was or was not a good decision over time. [sent-27, score-0.364]
18 Combining these sources of information to create some reviewer judgement system may yield another small improvement in the reviewing process. [sent-28, score-0.807]
19 Are we interested in tracking our reviewing performance over time in order to make better judgements? [sent-30, score-0.453]
20 Such tracking often happens on an anecdotal or personal basis, but shifting to an automated incentive compatible system would be a big change in scope. [sent-31, score-0.787]
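To make the peer-prediction scheme in sentences 14 and 15 above concrete, here is a minimal Python sketch. It is only an illustration under assumptions of my own: a hypothetical three-level score scale and a made-up conditional model P(reviewer j's score | reviewer i's score); none of the score names, probabilities, or function names come from the post.

```python
import math

# Minimal sketch of the peer-prediction idea from sentences 14-15 above.
# Everything concrete here (the score scale, the conditional probabilities,
# the function names) is a made-up illustration, not the mechanism from the
# post or from any particular paper.

# Hypothetical model P(reviewer j's score | reviewer i's score), e.g. one
# estimated from historical pairs of reviews of the same papers.
P_J_GIVEN_I = {
    "reject":      {"reject": 0.60, "weak accept": 0.30, "accept": 0.10},
    "weak accept": {"reject": 0.25, "weak accept": 0.50, "accept": 0.25},
    "accept":      {"reject": 0.10, "weak accept": 0.30, "accept": 0.60},
}

def log_score(predicted, outcome):
    """Logarithmic (proper) scoring rule: the reward is the log of the
    probability the prediction assigned to the outcome that occurred."""
    return math.log(predicted[outcome])

def reviewer_reward(report_i, report_j):
    """Reward reviewer i by scoring the induced prediction about reviewer j's
    report against what reviewer j actually reported."""
    return log_score(P_J_GIVEN_I[report_i], report_j)

if __name__ == "__main__":
    # Reviewer i says "accept"; reviewer j disagrees with "weak accept".
    print(reviewer_reward("accept", "weak accept"))  # log(0.30) ~ -1.20
    # Reviewer i says "accept"; reviewer j agrees.
    print(reviewer_reward("accept", "accept"))       # log(0.60) ~ -0.51
```

Note that with this toy model the reward is highest when the two reviewers agree, which is exactly the concern in sentence 16: such a mechanism can pay for joint agreement rather than for an independently useful contribution to the decision.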
wordName wordTfidf (topN-words)
[('reviewing', 0.258), ('reviewer', 0.241), ('incentive', 0.237), ('scoring', 0.198), ('mechanisms', 0.184), ('truthful', 0.178), ('proper', 0.178), ('try', 0.163), ('truth', 0.16), ('reviewers', 0.147), ('say', 0.142), ('rule', 0.136), ('judging', 0.132), ('incentives', 0.132), ('tracking', 0.132), ('reviews', 0.13), ('reading', 0.128), ('system', 0.128), ('integral', 0.127), ('compatible', 0.127), ('could', 0.113), ('joint', 0.112), ('conference', 0.106), ('score', 0.104), ('reference', 0.1), ('may', 0.094), ('personal', 0.087), ('yield', 0.086), ('isn', 0.084), ('engineer', 0.079), ('oral', 0.079), ('perverse', 0.079), ('change', 0.076), ('judgements', 0.073), ('announcements', 0.073), ('contribution', 0.073), ('drastically', 0.073), ('author', 0.071), ('problems', 0.069), ('records', 0.069), ('rewarding', 0.069), ('combining', 0.069), ('versus', 0.069), ('feasibility', 0.069), ('richard', 0.069), ('discusses', 0.066), ('scheme', 0.066), ('performance', 0.063), ('papers', 0.061), ('citation', 0.061)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999976 207 hunch net-2006-09-12-Incentive Compatible Reviewing
2 0.30475056 484 hunch net-2013-06-16-Representative Reviewing
Introduction: When thinking about how best to review papers, it seems helpful to have some conception of what good reviewing is. As far as I can tell, this is almost always only discussed in the specific context of a paper (i.e. your rejected paper), or at most an area (i.e. what a “good paper” looks like for that area) rather than general principles. Neither individual papers nor areas are sufficiently general for a large conference—every paper differs in the details, and what if you want to build a new area and/or cross areas? An unavoidable reason for reviewing is that the community of research is too large. In particular, it is not possible for a researcher to read every paper which someone thinks might be of interest. This reason for reviewing exists independent of constraints on rooms or scheduling formats of individual conferences. Indeed, history suggests that physical constraints are relatively meaningless over the long term — growing conferences simply use more rooms and/or change fo
3 0.24836966 395 hunch net-2010-04-26-Compassionate Reviewing
Introduction: Most long conversations between academics seem to converge on the topic of reviewing, where almost no one is happy. A basic question is: Should most people be happy? The case against is straightforward. Anyone who watches the flow of papers realizes that most papers amount to little in the longer term. By its nature research is brutal, where the second-best method is worthless, and the second person to discover things typically gets no credit. If you think about this for a moment, it’s very different from most other human endeavors. The second-best migrant laborer, construction worker, manager, conductor, quarterback, etc… all can manage quite well. If a reviewer has even a vaguely predictive sense of what’s important in the longer term, then most people submitting papers will be unhappy. But this argument unravels, in my experience. Perhaps half of reviews are thoughtless or simply wrong, with a small part being simply malicious. And yet, I’m sure that most reviewers genuine
4 0.23066622 437 hunch net-2011-07-10-ICML 2011 and the future
Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we’ll hear from him soon. In the meantime, Paul and Hal have posted some lists. The future: Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility that we hope to accomplish well. A part of this (perhaps the most fun part) is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI
5 0.22786742 40 hunch net-2005-03-13-Avoiding Bad Reviewing
Introduction: If we accept that bad reviewing often occurs and want to fix it, the question is “how”? Reviewing is done by paper writers just like yourself, so a good proxy for this question is asking “How can I be a better reviewer?” Here are a few things I’ve learned by trial (and error), as a paper writer, and as a reviewer. The secret ingredient is careful thought. There is no good substitute for a deep and careful understanding. Avoid reviewing papers that you feel competitive about. You almost certainly will be asked to review papers that feel competitive if you work on subjects of common interest. But, the feeling of competition can easily lead to bad judgement. If you feel biased for some other reason, then you should avoid reviewing. For example… Feeling angry or threatened by a paper is a form of bias. See above. Double blind yourself (avoid looking at the name even in a single-blind situation). The significant effect of a name you recognize is making you pay close a
6 0.22720419 461 hunch net-2012-04-09-ICML author feedback is open
7 0.22062926 343 hunch net-2009-02-18-Decision by Vetocracy
8 0.21301629 315 hunch net-2008-09-03-Bidding Problems
9 0.21151029 38 hunch net-2005-03-09-Bad Reviewing
10 0.20198883 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
11 0.1851207 116 hunch net-2005-09-30-Research in conferences
12 0.16981611 304 hunch net-2008-06-27-Reviewing Horror Stories
13 0.16757071 318 hunch net-2008-09-26-The SODA Program Committee
14 0.15821511 65 hunch net-2005-05-02-Reviewing techniques for conferences
15 0.15391102 454 hunch net-2012-01-30-ICML Posters and Scope
16 0.15257895 468 hunch net-2012-06-29-ICML survey and comments
17 0.14990641 452 hunch net-2012-01-04-Why ICML? and the summer conferences
18 0.14681339 51 hunch net-2005-04-01-The Producer-Consumer Model of Research
19 0.14467442 463 hunch net-2012-05-02-ICML: Behind the Scenes
20 0.13983849 98 hunch net-2005-07-27-Not goal metrics
topicId topicWeight
[(0, 0.284), (1, -0.173), (2, 0.252), (3, 0.126), (4, 0.01), (5, 0.056), (6, -0.024), (7, 0.036), (8, 0.014), (9, 0.009), (10, -0.047), (11, 0.01), (12, 0.092), (13, -0.032), (14, -0.084), (15, 0.016), (16, 0.013), (17, 0.019), (18, 0.025), (19, 0.007), (20, 0.007), (21, -0.098), (22, 0.053), (23, -0.012), (24, -0.025), (25, 0.026), (26, -0.012), (27, -0.014), (28, -0.003), (29, 0.017), (30, 0.018), (31, 0.047), (32, 0.038), (33, 0.022), (34, 0.003), (35, -0.05), (36, -0.012), (37, -0.001), (38, -0.053), (39, -0.05), (40, -0.063), (41, -0.059), (42, -0.018), (43, -0.037), (44, 0.007), (45, -0.004), (46, -0.077), (47, 0.014), (48, 0.022), (49, -0.103)]
simIndex simValue blogId blogTitle
same-blog 1 0.9807173 207 hunch net-2006-09-12-Incentive Compatible Reviewing
2 0.86706787 315 hunch net-2008-09-03-Bidding Problems
Introduction: One way that many conferences in machine learning assign reviewers to papers is via bidding, which has steps something like: Invite people to review Accept papers Reviewers look at title and abstract and state the papers they are interested in reviewing. Some massaging happens, but reviewers often get approximately the papers they bid for. At the ICML business meeting, Andrew McCallum suggested getting rid of bidding for papers. A couple reasons were given: Privacy The title and abstract of the entire set of papers is visible to every participating reviewer. Some authors might be uncomfortable about this for submitted papers. I’m not sympathetic to this reason: the point of submitting a paper to review is to publish it, so the value (if any) of not publishing a part of it a little bit earlier seems limited. Cliques A bidding system is gameable. If you have 3 buddies and you inform each other of your submissions, you can each bid for your friend’s papers a
3 0.86556137 484 hunch net-2013-06-16-Representative Reviewing
4 0.85658151 461 hunch net-2012-04-09-ICML author feedback is open
Introduction: as of last night, late. When the reviewing deadline passed Wednesday night 15% of reviews were still missing, much higher than I expected. Between late reviews coming in, ACs working overtime through the weekend, and people willing to help in the pinch another ~390 reviews came in, reducing the missing mass to 0.2%. Nailing that last bit and a similar quantity of papers with uniformly low confidence reviews is what remains to be done in terms of basic reviews. We are trying to make all of those happen this week so authors have some chance to respond. I was surprised by the quantity of late reviews, and I think that’s an area where ICML needs to improve in future years. Good reviews are not done in a rush—they are done by setting aside time (like an afternoon), and carefully reading the paper while thinking about implications. Many reviewers do this well but a significant minority aren’t good at scheduling their personal time. In this situation there are several ways to fail:
5 0.85003144 40 hunch net-2005-03-13-Avoiding Bad Reviewing
6 0.80521029 463 hunch net-2012-05-02-ICML: Behind the Scenes
7 0.80204219 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
8 0.78211719 395 hunch net-2010-04-26-Compassionate Reviewing
9 0.77468675 38 hunch net-2005-03-09-Bad Reviewing
10 0.76857758 116 hunch net-2005-09-30-Research in conferences
11 0.75393474 343 hunch net-2009-02-18-Decision by Vetocracy
12 0.75118744 485 hunch net-2013-06-29-The Benefits of Double-Blind Review
13 0.71383512 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
14 0.7035349 318 hunch net-2008-09-26-The SODA Program Committee
15 0.6944899 437 hunch net-2011-07-10-ICML 2011 and the future
16 0.69229019 52 hunch net-2005-04-04-Grounds for Rejection
17 0.68925852 98 hunch net-2005-07-27-Not goal metrics
18 0.67405915 363 hunch net-2009-07-09-The Machine Learning Forum
19 0.6516487 304 hunch net-2008-06-27-Reviewing Horror Stories
20 0.63762796 468 hunch net-2012-06-29-ICML survey and comments
topicId topicWeight
[(2, 0.225), (10, 0.065), (27, 0.202), (38, 0.028), (53, 0.113), (55, 0.123), (92, 0.017), (94, 0.089), (95, 0.048)]
simIndex simValue blogId blogTitle
1 0.93650079 47 hunch net-2005-03-28-Open Problems for Colt
Introduction: Adam Klivans and Rocco Servedio are looking for open (learning theory) problems for COLT. This is a good idea in the same way that the KDDcup challenge is a good idea: crisp problem definitions that anyone can attack yield solutions that advance science.
same-blog 2 0.8911497 207 hunch net-2006-09-12-Incentive Compatible Reviewing
3 0.76713997 437 hunch net-2011-07-10-ICML 2011 and the future
4 0.76293063 454 hunch net-2012-01-30-ICML Posters and Scope
Introduction: Normally, I don’t indulge in posters for ICML, but this year is naturally an exception for me. If you want one, there are a small number left here, if you sign up before February. It also seems worthwhile to give some sense of the scope and reviewing criteria for ICML for authors considering submitting papers. At ICML, the (very large) program committee does the reviewing which informs final decisions by area chairs on most papers. Program chairs set up the process, deal with exceptions or disagreements, and provide advice for the reviewing process. Providing advice is tricky (and easily misleading) because a conference is a community, and in the end the aggregate interests of the community determine the conference. Nevertheless, as a program chair this year it seems worthwhile to state the overall philosophy I have and what I plan to encourage (and occasionally discourage). At the highest level, I believe ICML exists to further research into machine learning, which I gene
5 0.75704157 244 hunch net-2007-05-09-The Missing Bound
Introduction: Sham Kakade points out that we are missing a bound. Suppose we have m samples x drawn IID from some distribution D. Through the magic of the exponential moment method we know that: If the range of x is bounded by an interval of size I, a Chernoff/Hoeffding style bound gives us a bound on the deviations like O(I/m^0.5) (at least in crude form). A proof is on page 9 here. If the range of x is bounded, and the variance (or a bound on the variance) is known, then Bennett’s bound can give tighter results (*). This can be a huge improvement when the true variance is small. What’s missing here is a bound that depends on the observed variance rather than a bound on the variance. This means that many people attempt to use Bennett’s bound (incorrectly) by plugging the observed variance in as the true variance, invalidating the bound application. Most of the time, they get away with it, but this is a dangerous move when doing machine learning. In machine learning,
6 0.75019264 297 hunch net-2008-04-22-Taking the next step
7 0.7454102 141 hunch net-2005-12-17-Workshops as Franchise Conferences
8 0.74521369 343 hunch net-2009-02-18-Decision by Vetocracy
9 0.74507403 131 hunch net-2005-11-16-The Everything Ensemble Edge
10 0.74425256 225 hunch net-2007-01-02-Retrospective
11 0.74231994 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer
12 0.74153572 95 hunch net-2005-07-14-What Learning Theory might do
13 0.74017638 134 hunch net-2005-12-01-The Webscience Future
14 0.73856211 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
15 0.73774272 132 hunch net-2005-11-26-The Design of an Optimal Research Environment
16 0.73680258 22 hunch net-2005-02-18-What it means to do research.
17 0.73573422 333 hunch net-2008-12-27-Adversarial Academia
18 0.73538929 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
19 0.73532808 332 hunch net-2008-12-23-Use of Learning Theory
20 0.73433137 382 hunch net-2009-12-09-Future Publication Models @ NIPS