hunch_net hunch_net-2012 hunch_net-2012-463 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: This is a rather long post, detailing the ICML 2012 review process. The goal is to make the process more transparent, help authors understand how we came to a decision, and discuss the strengths and weaknesses of this process for future conference organizers. Microsoft’s Conference Management Toolkit (CMT): We chose to use CMT over other conference management software mainly because of its rich toolkit. The interface is sub-optimal (to say the least!), but it has extensive capabilities (to handle bids, author response, resubmissions, etc.), good import/export mechanisms (to process the data elsewhere), and excellent technical support (to answer late-night emails and add new functionality). Overall, it was the right choice, although we hope a designer will look at that interface sometime soon! Toronto Matching System (TMS): TMS is now being used by many major conferences in our field (including NIPS and UAI). It is an automated system (developed by Laurent Charlin and Rich Zemel)…
sentIndex sentText sentNum sentScore
1 Each of these area chairs was asked to nominate a list of potential reviewers. [sent-19, score-0.557]
2 Bidding: An unsupervised version of TMS was used to generate a list of candidate papers for each reviewer and area chair. [sent-29, score-0.645]
3 CMT did not have the functionality to show a good list of candidate papers to reviewers, so we crafted an interface to show this list and let reviewers use that in conjunction with CMT. [sent-31, score-0.746]
4 Another group received a list based on the matching between their subject area and that of the paper (referred to as the “relevance” score in CMT). [sent-37, score-0.754]
5 The third group received a list based on a mix of both TMS and relevance. [sent-38, score-0.458]
6 We can also see that those who were presented with a list based on TMS scores were more likely to find the list useful. [sent-64, score-0.475]
7 This is a real shortcoming when it comes to matching papers to reviewers, especially for those reviewers who did not bid often. [sent-68, score-0.580]
8 To mitigate this problem, we used the click information on the shortlist presented to the reviewers to find out which papers had been seen but ignored. [sent-69, score-0.446]
9 This is problematic because the tools used to do the actual reviewer-paper matching tend to assign the papers without any bids to the reviewers who did not bid, regardless of the match in expertise. [sent-72, score-0.667]
10 To deal with the wildly varying number of bids per person, we imputed zero bids, first from papers that were plausibly skipped over, and if necessary at random from papers not bid on, such that each person had the same expected number of bids in the dataset. [sent-76, score-0.872]
11 This trained predictor was used to predict bid values for the full paper-reviewer bid matrix (see the sketch after this list). [sent-80, score-0.45]
12 Automated Area Chair and First Reviewer Assignment: Once we had the imputed paper-reviewer bidding matrix, CMT was used to generate the actual match between papers and area chairs, and (separately) between papers and reviewers. [sent-81, score-0.652]
13 Each paper had two area chairs (sometimes called “meta-reviewers” in CMT) assigned to it, one primary, one secondary, by running two rounds of assignments (so that the primary was usually the “better” match). [sent-82, score-0.559]
14 CMT provides proper load balancing, so all area chairs and reviewers ended up with similar loads. [sent-84, score-0.512]
15 Manual Checks of the Automated Assignments: Before finalizing the automated assignment, we manually looked through the list of papers to fix any potential problems that were not handled by the automated process. [sent-85, score-0.517]
16 The two major cases were papers that did not go through the TMS system (authors did not agree to do so), and cases of poor primary-secondary meta-reviewer pairs (when the two area chairs were judged to be too close to offer independent assessment, e. [sent-86, score-0.527]
17 Second and Third Reviewer Assignment: Once the initial assignments were announced, we asked the two area chairs for a given paper to each manually assign another reviewer from the PC. [sent-89, score-0.659]
18 To help area chairs with this, we generated a shortlist of 10 recommended reviewers for each paper (using the estimated bid matrix and TMS score, with the CMT matching algorithm for load balancing of reviewer suggestions). [sent-90, score-1.116]
19 In a small number of regular submissions (fewer than 10), we received two very negative reviews and notified the third reviewer (who was usually late by this point!). [sent-97, score-0.447]
20 Final Decisions: To help us better decide on the quality of the papers, we asked the primary area chairs to provide a meta-review for each of their papers. [sent-107, score-0.409]
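The workflow described in the sentences above (impute missing bids, learn a predictor to fill the full paper-reviewer bid matrix, then run a load-balanced assignment) can be made concrete with a small example. The sketch below is illustrative only, not the ICML 2012 implementation: the matrix sizes, the 5% bid rate, the target of 20 recorded bids per reviewer, the least-squares blend of TMS scores standing in for the actual learned predictor, and the greedy loop standing in for CMT's matching algorithm are all assumptions made for the example.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- none of these numbers come from the post.
n_papers, n_reviewers = 200, 60
reviews_per_paper = 3
max_load = math.ceil(n_papers * reviews_per_paper / n_reviewers)

# A dense TMS-style affinity score for every (paper, reviewer) pair, plus a sparse
# bid matrix: np.nan = no bid recorded, 0 = "not interested", 1-3 = interest level.
tms = rng.random((n_papers, n_reviewers))
bids = np.full((n_papers, n_reviewers), np.nan)
bid_mask = rng.random((n_papers, n_reviewers)) < 0.05
bids[bid_mask] = rng.integers(0, 4, size=int(bid_mask.sum()))

# Step 1: impute zero bids for papers a reviewer plausibly skipped (here just a
# random subset of their un-bid papers), so that sparse bidders are not treated
# the same as reviewers who never looked at a paper.
target_bids = 20  # hypothetical target number of recorded bids per reviewer
for r in range(n_reviewers):
    observed_r = ~np.isnan(bids[:, r])
    unbid = np.flatnonzero(~observed_r)
    n_impute = max(0, target_bids - int(observed_r.sum()))
    if n_impute and unbid.size:
        skipped = rng.choice(unbid, size=min(n_impute, unbid.size), replace=False)
        bids[skipped, r] = 0.0

# Step 2: fit a predictor on the observed entries and fill in the full matrix.
# A least-squares blend of the TMS score stands in for whatever model was used.
observed = ~np.isnan(bids)
features = np.column_stack([tms[observed], np.ones(int(observed.sum()))])
coef, *_ = np.linalg.lstsq(features, bids[observed], rcond=None)
full_bids = np.where(observed, bids, coef[0] * tms + coef[1])

# Step 3: load-balanced paper-reviewer assignment. CMT solves this as a proper
# matching/optimization problem; the greedy loop only illustrates the load cap.
load = np.zeros(n_reviewers, dtype=int)
assignment = {}
for p in range(n_papers):
    ranked = np.argsort(-full_bids[p])                      # best matches first
    chosen = [int(r) for r in ranked if load[r] < max_load][:reviews_per_paper]
    for r in chosen:
        load[r] += 1
    assignment[p] = chosen

print("max reviewer load:", int(load.max()), "cap:", max_load)
```

The shortlist of 10 recommended reviewers per paper mentioned above is essentially the `ranked` line restricted to its first ten entries, with the same kind of load cap applied to the suggestions.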
wordName wordTfidf (topN-words)
[('tms', 0.509), ('cmt', 0.495), ('bid', 0.199), ('list', 0.196), ('reviewers', 0.17), ('bids', 0.17), ('area', 0.149), ('chairs', 0.148), ('papers', 0.128), ('reviewer', 0.12), ('group', 0.118), ('received', 0.09), ('pc', 0.085), ('matching', 0.083), ('bidding', 0.083), ('scores', 0.083), ('reviews', 0.083), ('laurent', 0.075), ('manually', 0.065), ('match', 0.064), ('authors', 0.064), ('automated', 0.064), ('asked', 0.064), ('paper', 0.061), ('sorting', 0.06), ('total', 0.059), ('score', 0.057), ('review', 0.057), ('interface', 0.056), ('assigned', 0.054), ('third', 0.054), ('committee', 0.054), ('submissions', 0.053), ('used', 0.052), ('assignments', 0.052), ('cases', 0.051), ('sent', 0.05), ('secondary', 0.05), ('balancing', 0.048), ('hectic', 0.048), ('imputed', 0.048), ('mitigate', 0.048), ('shortlist', 0.048), ('unanimous', 0.048), ('preferred', 0.048), ('assignment', 0.048), ('primary', 0.048), ('usually', 0.047), ('load', 0.045), ('matrix', 0.045)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 463 hunch net-2012-05-02-ICML: Behind the Scenes
2 0.28852075 447 hunch net-2011-10-10-ML Symposium and ICML details
Introduction: Everyone should have received notice for NY ML Symposium abstracts. Check carefully, as one was lost by our system. The event itself is October 21, next week. Leon Bottou , Stephen Boyd , and Yoav Freund are giving the invited talks this year, and there are many spotlights on local work spread throughout the day. Chris Wiggins has setup 6(!) ML-interested startups to follow the symposium, which should be of substantial interest to the employment interested. I also wanted to give an update on ICML 2012 . Unlike last year, our deadline is coordinated with AIStat (which is due this Friday). The paper deadline for ICML has been pushed back to February 24 which should allow significant time for finishing up papers after the winter break. Other details may interest people as well: We settled on using CMT after checking out the possibilities. I wasn’t looking for this, because I’ve often found CMT clunky in terms of easy access to the right information. Nevert
3 0.26844916 315 hunch net-2008-09-03-Bidding Problems
Introduction: One way that many conferences in machine learning assign reviewers to papers is via bidding, which has steps something like: Invite people to review Accept papers Reviewers look at title and abstract and state the papers they are interested in reviewing. Some massaging happens, but reviewers often get approximately the papers they bid for. At the ICML business meeting, Andrew McCallum suggested getting rid of bidding for papers. A couple reasons were given: Privacy The title and abstract of the entire set of papers is visible to every participating reviewer. Some authors might be uncomfortable about this for submitted papers. I’m not sympathetic to this reason: the point of submitting a paper to review is to publish it, so the value (if any) of not publishing a part of it a little bit earlier seems limited. Cliques A bidding system is gameable. If you have 3 buddies and you inform each other of your submissions, you can each bid for your friend’s papers a
4 0.25115946 484 hunch net-2013-06-16-Representative Reviewing
Introduction: When thinking about how best to review papers, it seems helpful to have some conception of what good reviewing is. As far as I can tell, this is almost always only discussed in the specific context of a paper (i.e. your rejected paper), or at most an area (i.e. what a “good paper” looks like for that area) rather than general principles. Neither individual papers or areas are sufficiently general for a large conference—every paper differs in the details, and what if you want to build a new area and/or cross areas? An unavoidable reason for reviewing is that the community of research is too large. In particular, it is not possible for a researcher to read every paper which someone thinks might be of interest. This reason for reviewing exists independent of constraints on rooms or scheduling formats of individual conferences. Indeed, history suggests that physical constraints are relatively meaningless over the long term — growing conferences simply use more rooms and/or change fo
5 0.19450437 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
Introduction: This is a very difficult post to write, because it is about a perenially touchy subject. Nevertheless, it is an important one which needs to be thought about carefully. There are a few things which should be understood: The system is changing and responsive. We-the-authors are we-the-reviewers, we-the-PC, and even we-the-NIPS-board. NIPS has implemented ‘secondary program chairs’, ‘author response’, and ‘double blind reviewing’ in the last few years to help with the decision process, and more changes may happen in the future. Agreement creates a perception of correctness. When any PC meets and makes a group decision about a paper, there is a strong tendency for the reinforcement inherent in a group decision to create the perception of correctness. For the many people who have been on the NIPS PC it’s reasonable to entertain a healthy skepticism in the face of this reinforcing certainty. This post is about structural problems. What problems arise because of the structure
6 0.19337027 461 hunch net-2012-04-09-ICML author feedback is open
7 0.18783528 343 hunch net-2009-02-18-Decision by Vetocracy
8 0.17430295 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
9 0.16872239 453 hunch net-2012-01-28-Why COLT?
10 0.16484225 437 hunch net-2011-07-10-ICML 2011 and the future
11 0.14467442 207 hunch net-2006-09-12-Incentive Compatible Reviewing
12 0.14409338 304 hunch net-2008-06-27-Reviewing Horror Stories
13 0.14111181 318 hunch net-2008-09-26-The SODA Program Committee
14 0.13016649 454 hunch net-2012-01-30-ICML Posters and Scope
15 0.12670287 40 hunch net-2005-03-13-Avoiding Bad Reviewing
16 0.12331507 38 hunch net-2005-03-09-Bad Reviewing
17 0.11850613 116 hunch net-2005-09-30-Research in conferences
18 0.11625533 468 hunch net-2012-06-29-ICML survey and comments
19 0.11216693 466 hunch net-2012-06-05-ICML acceptance statistics
20 0.10613771 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept
topicId topicWeight
[(0, 0.206), (1, -0.183), (2, 0.203), (3, 0.062), (4, 0.068), (5, 0.107), (6, -0.044), (7, -0.015), (8, -0.049), (9, -0.078), (10, 0.019), (11, -0.04), (12, 0.053), (13, -0.035), (14, -0.052), (15, -0.005), (16, -0.007), (17, 0.025), (18, 0.013), (19, -0.011), (20, 0.054), (21, 0.014), (22, 0.026), (23, -0.033), (24, -0.076), (25, 0.029), (26, -0.001), (27, 0.004), (28, 0.101), (29, 0.051), (30, 0.08), (31, 0.023), (32, 0.074), (33, 0.015), (34, -0.018), (35, 0.096), (36, -0.022), (37, 0.056), (38, 0.042), (39, -0.031), (40, -0.065), (41, -0.055), (42, -0.037), (43, -0.036), (44, -0.01), (45, -0.039), (46, -0.074), (47, 0.013), (48, 0.042), (49, -0.033)]
simIndex simValue blogId blogTitle
same-blog 1 0.96933103 463 hunch net-2012-05-02-ICML: Behind the Scenes
2 0.81559509 461 hunch net-2012-04-09-ICML author feedback is open
Introduction: as of last night, late. When the reviewing deadline passed Wednesday night 15% of reviews were still missing, much higher than I expected. Between late reviews coming in, ACs working overtime through the weekend, and people willing to help in the pinch another ~390 reviews came in, reducing the missing mass to 0.2%. Nailing that last bit and a similar quantity of papers with uniformly low confidence reviews is what remains to be done in terms of basic reviews. We are trying to make all of those happen this week so authors have some chance to respond. I was surprised by the quantity of late reviews, and I think that’s an area where ICML needs to improve in future years. Good reviews are not done in a rush—they are done by setting aside time (like an afternoon), and carefully reading the paper while thinking about implications. Many reviewers do this well but a significant minority aren’t good at scheduling their personal time. In this situation there are several ways to fail:
3 0.8036955 315 hunch net-2008-09-03-Bidding Problems
4 0.80123222 484 hunch net-2013-06-16-Representative Reviewing
5 0.79340577 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
Introduction: Although I’m greatly interested in machine learning, I think it must be admitted that there is a large amount of low quality logic being used in reviews. The problem is bad enough that sometimes I wonder if the Byzantine generals limit has been exceeded. For example, I’ve seen recent reviews where the given reasons for rejecting are: [ NIPS ] Theorem A is uninteresting because Theorem B is uninteresting. [ UAI ] When you learn by memorization, the problem addressed is trivial. [NIPS] The proof is in the appendix. [NIPS] This has been done before. (… but not giving any relevant citations) Just for the record I want to point out what’s wrong with these reviews. A future world in which such reasons never come up again would be great, but I’m sure these errors will be committed many times more in the future. This is nonsense. A theorem should be evaluated based on it’s merits, rather than the merits of another theorem. Learning by memorization requires an expon
6 0.76322836 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
7 0.74378836 304 hunch net-2008-06-27-Reviewing Horror Stories
8 0.73596376 38 hunch net-2005-03-09-Bad Reviewing
9 0.73031342 318 hunch net-2008-09-26-The SODA Program Committee
10 0.72749847 207 hunch net-2006-09-12-Incentive Compatible Reviewing
11 0.71018434 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept
12 0.67291284 363 hunch net-2009-07-09-The Machine Learning Forum
13 0.66229564 437 hunch net-2011-07-10-ICML 2011 and the future
14 0.64192772 343 hunch net-2009-02-18-Decision by Vetocracy
15 0.62176472 40 hunch net-2005-03-13-Avoiding Bad Reviewing
16 0.60146105 466 hunch net-2012-06-05-ICML acceptance statistics
17 0.60110307 468 hunch net-2012-06-29-ICML survey and comments
18 0.59343141 485 hunch net-2013-06-29-The Benefits of Double-Blind Review
19 0.58410573 453 hunch net-2012-01-28-Why COLT?
20 0.56236082 52 hunch net-2005-04-04-Grounds for Rejection
topicId topicWeight
[(3, 0.039), (9, 0.016), (10, 0.03), (27, 0.168), (36, 0.014), (38, 0.031), (48, 0.028), (51, 0.015), (53, 0.035), (54, 0.015), (55, 0.115), (60, 0.015), (67, 0.219), (90, 0.011), (92, 0.041), (94, 0.088), (95, 0.028)]
simIndex simValue blogId blogTitle
1 0.9195646 192 hunch net-2006-07-08-Some recent papers
Introduction: It was a fine time for learning in Pittsburgh. John and Sam mentioned some of my favorites. Here’s a few more worth checking out: Online Multitask Learning Ofer Dekel, Phil Long, Yoram Singer This is on my reading list. Definitely an area I’m interested in. Maximum Entropy Distribution Estimation with Generalized Regularization Miroslav Dudík, Robert E. Schapire Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path András Antos, Csaba Szepesvári, Rémi Munos Again, on the list to read. I saw Csaba and Remi talk about this and related work at an ICML Workshop on Kernel Reinforcement Learning. The big question in my head is how this compares/contrasts with existing work in reductions to reinforcement learning. Are there advantages/disadvantages? Higher Order Learning On Graphs by Sameer Agarwal, Kristin Branson, and Serge Belongie, looks to be interesting. They seem to poo-poo “tensorization
2 0.90721262 296 hunch net-2008-04-21-The Science 2.0 article
Introduction: I found the article about science using modern tools interesting , especially the part about ‘blogophobia’, which in my experience is often a substantial issue: many potential guest posters aren’t quite ready, because of the fear of a permanent public mistake, because it is particularly hard to write about the unknown (the essence of research), and because the system for public credit doesn’t yet really handle blog posts. So far, science has been relatively resistant to discussing research on blogs. Some things need to change to get there. Public tolerance of the occasional mistake is essential, as is a willingness to cite (and credit) blogs as freely as papers. I’ve often run into another reason for holding back myself: I don’t want to overtalk my own research. Nevertheless, I’m slowly changing to the opinion that I’m holding back too much: the real power of a blog in research is that it can be used to confer with many people, and that just makes research work better.
same-blog 3 0.87254363 463 hunch net-2012-05-02-ICML: Behind the Scenes
4 0.86935133 180 hunch net-2006-05-21-NIPS paper evaluation criteria
Introduction: John Platt , who is PC-chair for NIPS 2006 has organized a NIPS paper evaluation criteria document with input from the program committee and others. The document contains specific advice about what is appropriate for the various subareas within NIPS. It may be very helpful, because the standards of evaluation for papers varies significantly. This is a bit of an experiment: the hope is that by carefully thinking about and stating what is important, authors can better understand whether and where their work fits. Update: The general submission page and Author instruction including how to submit an appendix .
5 0.75151479 252 hunch net-2007-07-01-Watchword: Online Learning
Introduction: It turns out that many different people use the term “Online Learning”, and often they don’t have the same definition in mind. Here’s a list of the possibilities I know of. Online Information Setting Online learning refers to a problem in which unlabeled data comes, a prediction is made, and then feedback is acquired. Online Adversarial Setting Online learning refers to algorithms in the Online Information Setting which satisfy guarantees of the form: “For all possible sequences of observations, the algorithim has regret at most log ( number of strategies) with respect to the best strategy in a set.” This is sometimes called online learning with experts. Online Optimization Constraint Online learning refers to optimizing a predictor via a learning algorithm tunes parameters on a per-example basis. This may or may not be applied in the Online Information Setting, and the strategy may or may not satisfy Adversarial setting theory. Online Computational Constra
6 0.72486401 403 hunch net-2010-07-18-ICML & COLT 2010
7 0.71469414 203 hunch net-2006-08-18-Report of MLSS 2006 Taipei
8 0.71124464 437 hunch net-2011-07-10-ICML 2011 and the future
9 0.70329726 40 hunch net-2005-03-13-Avoiding Bad Reviewing
10 0.69788456 41 hunch net-2005-03-15-The State of Tight Bounds
11 0.69461429 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
12 0.69438607 315 hunch net-2008-09-03-Bidding Problems
13 0.6938594 484 hunch net-2013-06-16-Representative Reviewing
14 0.69335276 343 hunch net-2009-02-18-Decision by Vetocracy
15 0.69245166 461 hunch net-2012-04-09-ICML author feedback is open
16 0.6876089 423 hunch net-2011-02-02-User preferences for search engines
17 0.68751568 454 hunch net-2012-01-30-ICML Posters and Scope
18 0.68722206 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
19 0.68524081 379 hunch net-2009-11-23-ICML 2009 Workshops (and Tutorials)
20 0.68419552 51 hunch net-2005-04-01-The Producer-Consumer Model of Research