hunch_net hunch_net-2005 hunch_net-2005-98 knowledge-graph by maker-knowledge-mining

98 hunch net-2005-07-27-Not goal metrics


meta info for this blog

Source: html

Introduction: One of the confusing things about research is that progress is very hard to measure. One of the consequences of being in a hard-to-measure environment is that the wrong things are often measured. Lines of Code The classical example of this phenomenon is the old lines-of-code-produced metric for programming. It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. Paper count In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. Complexity Another metric is “complexity” (in the eye of a reviewer) of a paper. There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. The failure mode here is unclean thinking. Simple effective m


Summary: the most important sentences generated by the tfidf model (a sketch of this style of sentence scoring follows the list)

sentIndex sentText [sentNum, sentScore]

1 It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. [sent-4, score-0.317]

2 Paper count In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. [sent-5, score-0.31]

3 The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. [sent-6, score-0.395]

4 There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. [sent-8, score-0.482]

5 A low acceptance rate is often considered desirable for a conference. [sent-14, score-0.766]

6 But: It’s easy to skew an acceptance rate by adding (or inviting) many weak or bogus papers. [sent-15, score-0.639]

7 Consequently, a low acceptance rate can retard progress by simply raising the bar too high for what turns out to be a good idea when it is more fully developed. [sent-17, score-0.866]

8 With a low acceptance ratio, a strong objection by any one of several reviewers might torpedo a paper. [sent-19, score-0.744]

9 A low acceptance rate tends to spawn a multiplicity of conferences in one area. [sent-21, score-0.821]

10 (see also How to increase the acceptance ratios at top conferences? [sent-23, score-0.464]

11 ) Citation count Counting citations is somewhat better than counting papers because it is some evidence that an idea is actually useful. [sent-24, score-0.555]

12 A programmer who writes no lines of code isn’t very good. [sent-35, score-0.383]

13 Nevertheless, optimizing these metrics is not beneficial for a field of research. [sent-40, score-0.302]

14 In thinking about this, we must clearly differentiate 1) what is good for a field of research (solving important problems) and 2) what is good for individual researchers (getting jobs). [sent-41, score-0.373]

15 Any individual in academia cannot avoid being judged by these metrics. [sent-43, score-0.326]

16 Attempts by an individual or a small group of individuals to ignore these metrics are unlikely to change the system (and likely to result in the individual or small group being judged badly). [sent-44, score-0.686]

17 The best we can hope for is incremental progress which takes the form of the leadership in the academic community introducing new, saner metrics. [sent-46, score-0.417]

18 This is a difficult thing, particularly because any academic leader must have succeeded in the old system. [sent-47, score-0.321]

19 The “importance reviewer” is easier than the current standard: they must simply understand the problem being solved and rate how important this problem is. [sent-51, score-0.34]

20 The technical reviewer’s job is harder than the current standard: they must verify that all claims of solution to the problem are met. [sent-52, score-0.401]
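
The scoring pipeline behind this summary is not included in the dump. As referenced above, here is a minimal sketch of tfidf-based sentence scoring, assuming scikit-learn and a plain list of sentences; the variable names and placeholder sentences are illustrative, not the original maker-knowledge-mining code.

```python
# Minimal sketch of tfidf-based sentence scoring (assumed reconstruction,
# not the original pipeline).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical input: the post split into sentences.
sentences = [
    "It is easy to imagine systems for producing many lines of code with very little work that accomplish very little.",
    "A low acceptance rate is often considered desirable for a conference.",
    "A programmer who writes no lines of code isn't very good.",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(sentences)        # rows = sentences, columns = terms

# Score each sentence by its average tfidf weight over non-zero terms,
# so long sentences don't win purely by length.
row_sums = np.asarray(X.sum(axis=1)).ravel()
scores = row_sums / (X.getnnz(axis=1) + 1e-9)

# Rank and print in the "[sent-N, score-S]" style used in the list above.
for rank, (idx, score) in enumerate(sorted(enumerate(scores), key=lambda t: -t[1]), 1):
    print(f"{rank} {sentences[idx]} [sent-{idx}, score-{score:.3f}]")
```

Normalizing by the number of non-zero terms is one common design choice; the actual scores above may come from a different weighting.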


similar blogs computed by tfidf model

tfidf for this blog (a sketch of how such term weights and the similarity lists might be computed follows the word list):

wordName wordTfidf (topN-words)

[('acceptance', 0.398), ('lines', 0.188), ('rate', 0.175), ('reviewers', 0.155), ('citation', 0.154), ('metrics', 0.154), ('academic', 0.148), ('count', 0.141), ('individual', 0.14), ('counting', 0.132), ('judge', 0.132), ('complex', 0.129), ('code', 0.129), ('low', 0.125), ('multiplicity', 0.123), ('papers', 0.12), ('importance', 0.111), ('judged', 0.106), ('must', 0.103), ('incremental', 0.103), ('failure', 0.103), ('reviewer', 0.101), ('progress', 0.1), ('citations', 0.094), ('mode', 0.092), ('metric', 0.086), ('fix', 0.083), ('technical', 0.081), ('academia', 0.08), ('optimizing', 0.08), ('complexity', 0.077), ('group', 0.073), ('isn', 0.071), ('old', 0.07), ('often', 0.068), ('idea', 0.068), ('field', 0.068), ('filtration', 0.066), ('ratios', 0.066), ('analog', 0.066), ('objection', 0.066), ('saner', 0.066), ('skew', 0.066), ('societies', 0.066), ('worthy', 0.066), ('writes', 0.066), ('paper', 0.063), ('current', 0.062), ('differentiate', 0.062), ('mutual', 0.062)]
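
The (wordName, wordTfidf) pairs above look like per-term tfidf weights for this post, and the similarity lists below look like cosine similarities between tfidf vectors. A minimal sketch of both steps, assuming scikit-learn and a hypothetical dictionary of post texts (not the original pipeline):

```python
# Sketch: per-post tfidf term weights plus cosine similarity between posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical corpus keyed by post id; the real corpus is all hunch.net posts.
posts = {
    "98":  "One of the confusing things about research is that progress is very hard to measure. ...",
    "484": "When thinking about how best to review papers, it seems helpful ...",
    "343": "Few would mistake the process of academic paper review for a fair process ...",
}
ids, texts = list(posts.keys()), list(posts.values())

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(texts)

# Top-weighted terms for post 98 -- the (wordName, wordTfidf) list above.
row = X[ids.index("98")].toarray().ravel()
terms = vec.get_feature_names_out()
top_terms = sorted(zip(terms, row.round(3)), key=lambda t: -t[1])[:50]

# Similar-blogs list: cosine similarity of post 98 against the whole corpus.
sims = cosine_similarity(X[ids.index("98")], X).ravel()
similar = sorted(zip(ids, sims.round(8)), key=lambda t: -t[1])

print(top_terms[:10])
print(similar)
```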

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999964 98 hunch net-2005-07-27-Not goal metrics


2 0.20335461 484 hunch net-2013-06-16-Representative Reviewing

Introduction: When thinking about how best to review papers, it seems helpful to have some conception of what good reviewing is. As far as I can tell, this is almost always only discussed in the specific context of a paper (i.e. your rejected paper), or at most an area (i.e. what a “good paper” looks like for that area) rather than general principles. Neither individual papers nor areas are sufficiently general for a large conference—every paper differs in the details, and what if you want to build a new area and/or cross areas? An unavoidable reason for reviewing is that the community of research is too large. In particular, it is not possible for a researcher to read every paper which someone thinks might be of interest. This reason for reviewing exists independent of constraints on rooms or scheduling formats of individual conferences. Indeed, history suggests that physical constraints are relatively meaningless over the long term — growing conferences simply use more rooms and/or change fo

3 0.19045663 343 hunch net-2009-02-18-Decision by Vetocracy

Introduction: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. This is most easily seen by comparison: Paper Banditron Offset Tree Notes Problem Scope Multiclass problems where only the loss of one choice can be probed. Strictly greater: Cost sensitive multiclass problems where only the loss of one choice can be probed. Often generalizations don’t matter. That’s not the case here, since every plausible application I’ve thought of involves loss functions substantially different from 0/1. What’s new Analysis and Experiments Algorithm, Analysis, and Experiments As far as I know, the essence of the more general problem was first stated and analyzed with the EXP4 algorithm (page 16) (1998). It’s also the time horizon 1 simplification of the Reinforcement Learning setting for the random trajectory method (page 15) (2002). The Banditron algorithm itself is functionally identi

4 0.18503121 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

Introduction: Although I’m greatly interested in machine learning, I think it must be admitted that there is a large amount of low quality logic being used in reviews. The problem is bad enough that sometimes I wonder if the Byzantine generals limit has been exceeded. For example, I’ve seen recent reviews where the given reasons for rejecting are: [ NIPS ] Theorem A is uninteresting because Theorem B is uninteresting. [ UAI ] When you learn by memorization, the problem addressed is trivial. [NIPS] The proof is in the appendix. [NIPS] This has been done before. (… but not giving any relevant citations) Just for the record I want to point out what’s wrong with these reviews. A future world in which such reasons never come up again would be great, but I’m sure these errors will be committed many times more in the future. This is nonsense. A theorem should be evaluated based on its merits, rather than the merits of another theorem. Learning by memorization requires an expon

5 0.17112726 315 hunch net-2008-09-03-Bidding Problems

Introduction: One way that many conferences in machine learning assign reviewers to papers is via bidding, which has steps something like: Invite people to review Accept papers Reviewers look at title and abstract and state the papers they are interested in reviewing. Some massaging happens, but reviewers often get approximately the papers they bid for. At the ICML business meeting, Andrew McCallum suggested getting rid of bidding for papers. A couple reasons were given: Privacy The title and abstract of the entire set of papers is visible to every participating reviewer. Some authors might be uncomfortable about this for submitted papers. I’m not sympathetic to this reason: the point of submitting a paper to review is to publish it, so the value (if any) of not publishing a part of it a little bit earlier seems limited. Cliques A bidding system is gameable. If you have 3 buddies and you inform each other of your submissions, you can each bid for your friend’s papers a

6 0.15188591 30 hunch net-2005-02-25-Why Papers?

7 0.15123796 466 hunch net-2012-06-05-ICML acceptance statistics

8 0.14863123 454 hunch net-2012-01-30-ICML Posters and Scope

9 0.14761174 40 hunch net-2005-03-13-Avoiding Bad Reviewing

10 0.14665613 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

11 0.14107211 116 hunch net-2005-09-30-Research in conferences

12 0.13983849 207 hunch net-2006-09-12-Incentive Compatible Reviewing

13 0.13781886 233 hunch net-2007-02-16-The Forgetting

14 0.13555723 304 hunch net-2008-06-27-Reviewing Horror Stories

15 0.12619171 288 hunch net-2008-02-10-Complexity Illness

16 0.12602957 38 hunch net-2005-03-09-Bad Reviewing

17 0.12052813 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

18 0.11882981 461 hunch net-2012-04-09-ICML author feedback is open

19 0.11693535 134 hunch net-2005-12-01-The Webscience Future

20 0.11107497 318 hunch net-2008-09-26-The SODA Program Committee


similar blogs computed by lsi model

lsi for this blog (a sketch of this kind of topic decomposition follows the weight list):

topicId topicWeight

[(0, 0.283), (1, -0.089), (2, 0.139), (3, 0.145), (4, -0.035), (5, 0.031), (6, 0.045), (7, -0.006), (8, -0.004), (9, 0.04), (10, -0.029), (11, 0.009), (12, 0.049), (13, -0.054), (14, 0.004), (15, -0.034), (16, -0.022), (17, 0.084), (18, 0.032), (19, -0.016), (20, -0.029), (21, 0.042), (22, -0.006), (23, 0.019), (24, -0.04), (25, 0.019), (26, -0.041), (27, -0.012), (28, 0.011), (29, -0.135), (30, 0.012), (31, 0.028), (32, 0.053), (33, 0.008), (34, 0.039), (35, -0.051), (36, 0.041), (37, -0.044), (38, 0.012), (39, 0.063), (40, -0.005), (41, -0.02), (42, 0.056), (43, -0.052), (44, 0.006), (45, -0.001), (46, -0.071), (47, 0.048), (48, 0.071), (49, -0.018)]
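
The 50 (topicId, topicWeight) pairs above are consistent with a 50-dimensional latent semantic indexing (LSI) projection of the tfidf vectors. A minimal sketch, assuming scikit-learn's TruncatedSVD and a hypothetical placeholder corpus; the real decomposition may have used gensim or different preprocessing.

```python
# Sketch: LSI-style topic weights via truncated SVD of the tfidf matrix
# (assumed reconstruction; the corpus list is a placeholder).
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "One of the confusing things about research is that progress is very hard to measure. ...",
    "When thinking about how best to review papers, it seems helpful ...",
    "Few would mistake the process of academic paper review for a fair process ...",
]

X = TfidfVectorizer(stop_words="english").fit_transform(texts)

# 50 components to match the 50 topic weights listed above; capped so the
# sketch still runs on this tiny placeholder corpus.
lsi = TruncatedSVD(n_components=min(50, min(X.shape) - 1), random_state=0)
Z = lsi.fit_transform(X)               # rows = posts, columns = latent topics

topic_weights = list(enumerate(Z[0].round(3)))   # (topicId, topicWeight) pairs
print(topic_weights)
```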

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97889298 98 hunch net-2005-07-27-Not goal metrics


2 0.78321815 288 hunch net-2008-02-10-Complexity Illness

Introduction: One of the enduring stereotypes of academia is that people spend a great deal of intelligence, time, and effort finding complexity rather than simplicity. This is at least anecdotally true in my experience. Math++ Several people have found that adding useless math makes their paper more publishable as evidenced by a reject-add-accept sequence. 8 page minimum Who submitted a paper to ICML violating the 8 page minimum? Every author fears that the reviewers won’t take their work seriously unless the allowed length is fully used. The best minimum violation I know is Adam’s paper at SODA on generating random factored numbers, but this is deeply exceptional. It’s a fair bet that 90% of papers submitted are exactly at the page limit. We could imagine that this is because papers naturally take more space, but few people seem to be clamoring for more space. Journalong Has anyone been asked to review a 100 page journal paper? I have. Journal papers can be nice, becaus

3 0.76315194 233 hunch net-2007-02-16-The Forgetting

Introduction: How many papers do you remember from 2006? 2005? 2002? 1997? 1987? 1967? One way to judge this would be to look at the citations of the papers you write—how many came from which year? For myself, the answers on recent papers are: year: 2006, 2005, 2002, 1997, 1987, 1967; count: 4, 10, 5, 1, 0, 0. This spectrum is fairly typical of papers in general. There are many reasons that citations are focused on recent papers. The number of papers being published continues to grow. This is not a very significant effect, because the rate of publication has not grown nearly as fast. Dead men don’t reject your papers for not citing them. This reason seems lame, because it’s a distortion from the ideal of science. Nevertheless, it must be stated because the effect can be significant. In 1997, I started as a PhD student. Naturally, papers after 1997 are better remembered because they were absorbed in real time. A large fraction of people writing papers and a

4 0.75155526 207 hunch net-2006-09-12-Incentive Compatible Reviewing

Introduction: Reviewing is a fairly formal process which is integral to the way academia is run. Given this integral nature, the quality of reviewing is often frustrating. I’ve seen plenty of examples of false statements, misbeliefs, reading what isn’t written, etc…, and I’m sure many other people have as well. Recently, mechanisms like double blind review and author feedback have been introduced to try to make the process more fair and accurate in many machine learning (and related) conferences. My personal experience is that these mechanisms help, especially the author feedback. Nevertheless, some problems remain. The game theory take on reviewing is that the incentive for truthful reviewing isn’t there. Since reviewers are also authors, there are sometimes perverse incentives created and acted upon. (Incidentally, these incentives can be both positive and negative.) Setting up a truthful reviewing system is tricky because there is no final reference truth available in any acce

5 0.74691117 315 hunch net-2008-09-03-Bidding Problems

Introduction: One way that many conferences in machine learning assign reviewers to papers is via bidding, which has steps something like: Invite people to review Accept papers Reviewers look at title and abstract and state the papers they are interested in reviewing. Some massaging happens, but reviewers often get approximately the papers they bid for. At the ICML business meeting, Andrew McCallum suggested getting rid of bidding for papers. A couple reasons were given: Privacy The title and abstract of the entire set of papers is visible to every participating reviewer. Some authors might be uncomfortable about this for submitted papers. I’m not sympathetic to this reason: the point of submitting a paper to review is to publish it, so the value (if any) of not publishing a part of it a little bit earlier seems limited. Cliques A bidding system is gameable. If you have 3 buddies and you inform each other of your submissions, you can each bid for your friend’s papers a

6 0.7440505 52 hunch net-2005-04-04-Grounds for Rejection

7 0.7346887 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

8 0.72893727 343 hunch net-2009-02-18-Decision by Vetocracy

9 0.72170478 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

10 0.72146648 30 hunch net-2005-02-25-Why Papers?

11 0.6944375 484 hunch net-2013-06-16-Representative Reviewing

12 0.66663301 318 hunch net-2008-09-26-The SODA Program Committee

13 0.66579729 231 hunch net-2007-02-10-Best Practices for Collaboration

14 0.64909101 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

15 0.64750618 461 hunch net-2012-04-09-ICML author feedback is open

16 0.64713639 38 hunch net-2005-03-09-Bad Reviewing

17 0.63971716 333 hunch net-2008-12-27-Adversarial Academia

18 0.63964415 463 hunch net-2012-05-02-ICML: Behind the Scenes

19 0.63855994 40 hunch net-2005-03-13-Avoiding Bad Reviewing

20 0.63522774 304 hunch net-2008-06-27-Reviewing Horror Stories


similar blogs computed by lda model

lda for this blog (a sketch of this kind of topic inference follows the weight list):

topicId topicWeight

[(3, 0.044), (10, 0.026), (13, 0.012), (27, 0.217), (38, 0.069), (48, 0.011), (53, 0.067), (55, 0.098), (83, 0.237), (94, 0.099), (95, 0.035)]
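
The sparse (topicId, topicWeight) pairs above look like a per-document LDA topic mixture with only non-negligible topics reported; the topic ids run up to the mid-90s, so roughly 100 topics is assumed in the sketch below. A minimal sketch with scikit-learn's LatentDirichletAllocation on a hypothetical placeholder corpus (not the original model):

```python
# Sketch: per-post LDA topic mixture (assumed reconstruction; the corpus and
# the ~100-topic count are guesses based on the topic ids listed above).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

texts = [
    "One of the confusing things about research is that progress is very hard to measure. ...",
    "It's reviewing season right now, so I thought I would list the sorts of problems ...",
    "In everyday use a model is a system which explains the behavior of some system ...",
]

counts = CountVectorizer(stop_words="english").fit_transform(texts)
lda = LatentDirichletAllocation(n_components=100, random_state=0)
theta = lda.fit_transform(counts)      # rows = posts, columns = topic weights (sum to 1)

# Keep only topics with non-negligible weight, as in the sparse listing above.
doc_topics = [(t, round(w, 3)) for t, w in enumerate(theta[0]) if w > 0.01]
print(doc_topics)
```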

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95101136 52 hunch net-2005-04-04-Grounds for Rejection

Introduction: It’s reviewing season right now, so I thought I would list (at a high level) the sorts of problems which I see in papers. Hopefully, this will help us all write better papers. The following flaws are fatal to any paper: Incorrect theorem or lemma statements A typo might be “ok”, if it can be understood. Any theorem or lemma which indicates an incorrect understanding of reality must be rejected. Not doing so would severely harm the integrity of the conference. A paper rejected for this reason must be fixed. Lack of Understanding If a paper is understood by none of the (typically 3) reviewers then it must be rejected for the same reason. This is more controversial than it sounds because there are some people who maximize paper complexity in the hope of impressing the reviewer. The tactic sometimes succeeds with some reviewers (but not with me). As a reviewer, I sometimes get lost for stupid reasons. This is why an anonymized communication channel with the author can

2 0.91991752 135 hunch net-2005-12-04-Watchword: model

Introduction: In everyday use a model is a system which explains the behavior of some system, hopefully at the level where some alteration of the model predicts some alteration of the real-world system. In machine learning “model” has several variant definitions. Everyday. The common definition is sometimes used. Parameterized. Sometimes model is a short-hand for “parameterized model”. Here, it refers to a model with unspecified free parameters. In the Bayesian learning approach, you typically have a prior over (everyday) models. Predictive. Even further from everyday use is the predictive model. Examples of this are “my model is a decision tree” or “my model is a support vector machine”. Here, there is no real sense in which an SVM explains the underlying process. For example, an SVM tells us nothing in particular about how alterations to the real-world system would create a change. Which definition is being used at any particular time is important information. For examp

3 0.89937371 228 hunch net-2007-01-15-The Machine Learning Department

Introduction: Carnegie Mellon School of Computer Science has the first academic Machine Learning department. This department already existed as the Center for Automated Learning and Discovery, but recently changed its name. The reason for changing the name is obvious: very few people think of themselves as “Automated Learner and Discoverers”, but there are a number of people who think of themselves as “Machine Learners”. Machine learning is both more succinct and recognizable—good properties for a name. A more interesting question is “Should there be a Machine Learning Department?”. Tom Mitchell has a relevant whitepaper claiming that machine learning is answering a different question than other fields or departments. The fundamental debate here is “Is machine learning different from statistics?” At a cultural level, there is no real debate: they are different. Machine learning is characterized by several very active large peer reviewed conferences, operating in a computer

same-blog 4 0.88577205 98 hunch net-2005-07-27-Not goal metrics


5 0.8672452 321 hunch net-2008-10-19-NIPS 2008 workshop on Kernel Learning

Introduction: We’d like to invite hunch.net readers to participate in the NIPS 2008 workshop on kernel learning. While the main focus is on automatically learning kernels from data, we are also looking at the broader questions of feature selection, multi-task learning and multi-view learning. There are no restrictions on the learning problem being addressed (regression, classification, etc), and both theoretical and applied work will be considered. The deadline for submissions is October 24. More detail can be found here. Corinna Cortes, Arthur Gretton, Gert Lanckriet, Mehryar Mohri, Afshin Rostamizadeh

6 0.8076486 261 hunch net-2007-08-28-Live ML Class

7 0.74706715 95 hunch net-2005-07-14-What Learning Theory might do

8 0.74220496 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

9 0.7416569 343 hunch net-2009-02-18-Decision by Vetocracy

10 0.74064499 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

11 0.74001974 345 hunch net-2009-03-08-Prediction Science

12 0.73619026 225 hunch net-2007-01-02-Retrospective

13 0.73588258 351 hunch net-2009-05-02-Wielding a New Abstraction

14 0.73562533 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

15 0.73512912 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

16 0.73493356 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

17 0.73452562 437 hunch net-2011-07-10-ICML 2011 and the future

18 0.73369277 235 hunch net-2007-03-03-All Models of Learning have Flaws

19 0.73355937 194 hunch net-2006-07-11-New Models

20 0.73288637 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006