hunch_net hunch_net-2005 hunch_net-2005-30 knowledge-graph by maker-knowledge-mining

30 hunch net-2005-02-25-Why Papers?


Meta information for this blog

Source: html

Introduction: Makc asked a good question in comments: “Why bother to make a paper, at all?” There are several reasons for writing papers which may not be immediately obvious to people not in academia. The basic idea is that papers have considerably more utility than the obvious “present an idea”. Papers are a formalized unit of work. Academics (especially young ones) are often judged on the number of papers they produce. Papers have a formalized method of citing and crediting others: the bibliography. Academics (especially older ones) are often judged on the number of citations they receive. Papers enable a “more fair” anonymous review. Conferences receive many papers, from which a subset are selected. Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work. Papers are an excuse to meet your friends. Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. Sometimes yo


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Makc asked a good question in comments: “Why bother to make a paper, at all?” [sent-1, score-0.278]

2 There are several reasons for writing papers which may not be immediately obvious to people not in academia. [sent-2, score-0.947]

3 The basic idea is that papers have considerably more utility than the obvious “present an idea”. [sent-3, score-0.808]

4 Academics (especially young ones) are often judged on the number of papers they produce. [sent-5, score-0.785]

5 Papers have a formalized method of citing and crediting others: the bibliography. [sent-6, score-0.58]

6 Academics (especially older ones) are often judged on the number of citations they receive. [sent-7, score-0.444]

7 Conferences receive many papers, from which a subset are selected. [sent-9, score-0.196]

8 Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work. [sent-10, score-0.719]

9 Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. [sent-12, score-0.087]

10 Papers are an excuse to get a large number of smart people in the same room and think about the same topic. [sent-14, score-0.541]

11 In particular, they are much easier to read (and understand) than a long discussion thread. [sent-16, score-0.262]

12 (Writing good papers is hard) All of the above are reasons why writing papers is a good idea. [sent-18, score-1.354]

13 It’s also important to understand that academia is a large system and large systems have a lot of inertia. [sent-19, score-0.466]

14 This means switching from paper writing to some other method of doing research won’t happen unless the other method is significantly more effective, and even then there will be a lot of inertia. [sent-20, score-0.957]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('papers', 0.402), ('writing', 0.263), ('excuse', 0.222), ('forums', 0.222), ('anonymous', 0.209), ('formalized', 0.209), ('academics', 0.192), ('judged', 0.192), ('lot', 0.144), ('discussion', 0.143), ('method', 0.14), ('ones', 0.13), ('citing', 0.12), ('receive', 0.12), ('units', 0.12), ('read', 0.119), ('especially', 0.119), ('obvious', 0.114), ('utility', 0.111), ('bother', 0.111), ('crediting', 0.111), ('reputation', 0.105), ('sites', 0.105), ('young', 0.105), ('considerably', 0.1), ('switching', 0.1), ('enable', 0.096), ('good', 0.096), ('reasons', 0.095), ('link', 0.093), ('meet', 0.093), ('conferences', 0.092), ('even', 0.089), ('friends', 0.087), ('wants', 0.087), ('number', 0.086), ('large', 0.085), ('citations', 0.085), ('idea', 0.081), ('older', 0.081), ('unless', 0.081), ('understand', 0.08), ('smart', 0.079), ('subset', 0.076), ('immediately', 0.073), ('fair', 0.073), ('academia', 0.072), ('easy', 0.072), ('asked', 0.071), ('room', 0.069)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 30 hunch net-2005-02-25-Why Papers?

2 0.25182289 233 hunch net-2007-02-16-The Forgetting

Introduction: How many papers do you remember from 2006? 2005? 2002? 1997? 1987? 1967? One way to judge this would be to look at the citations of the papers you write: how many came from which year? For myself, the answers on recent papers are (year: count): 2006: 4, 2005: 10, 2002: 5, 1997: 1, 1987: 0, 1967: 0. This spectrum is fairly typical of papers in general. There are many reasons that citations are focused on recent papers. The number of papers being published continues to grow. This is not a very significant effect, because the rate of publication has not grown nearly as fast. Dead men don’t reject your papers for not citing them. This reason seems lame, because it’s a distortion from the ideal of science. Nevertheless, it must be stated because the effect can be significant. In 1997, I started as a PhD student. Naturally, papers after 1997 are better remembered because they were absorbed in real time. A large fraction of people writing papers and a

3 0.15188591 98 hunch net-2005-07-27-Not goal metrics

Introduction: One of the confusing things about research is that progress is very hard to measure. One of the consequences of being in a hard-to-measure environment is that the wrong things are often measured. Lines of Code: The classical example of this phenomenon is the old lines-of-code-produced metric for programming. It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. Paper count: In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. Complexity: Another metric is “complexity” (in the eye of a reviewer) of a paper. There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. The failure mode here is unclean thinking. Simple effective m

4 0.14082672 318 hunch net-2008-09-26-The SODA Program Committee

Introduction: Claire asked me to be on the SODA program committee this year, which was quite a bit of work. I had a relatively light load: merely 49 theory papers. Many of these papers were not on subjects that I was an expert in, so (as is common for theory conferences) I found various reviewers that I trusted to help review the papers. I ended up reviewing about 1/3 personally. There were a couple of instances where I ended up overruling a subreviewer whose logic seemed off, but otherwise I generally let their reviews stand. There are some differences in standards for paper reviews between the machine learning and theory communities. In machine learning it is expected that a review be detailed, while in the theory community this is often not the case. Every paper given to me ended up with a review varying between somewhat and very detailed. I’m sure not every author was happy with the outcome. While we did our best to make good decisions, they were difficult decisions to make. For exam

5 0.12959132 225 hunch net-2007-01-02-Retrospective

Introduction: It’s been almost two years since this blog began. In that time, I’ve learned enough to shift my expectations in several ways. Initially, the idea was for a general purpose ML blog where different people could contribute posts. What has actually happened is most posts come from me, with a few guest posts that I greatly value. There are a few reasons I see for this. Overload. A couple years ago, I had not fully appreciated just how busy life gets for a researcher. Making a post is not simply a matter of getting to it, but rather of prioritizing between {writing a grant, finishing an overdue review, writing a paper, teaching a class, writing a program, etc…}. This is a substantial transition away from what life as a graduate student is like. At some point the question is not “when will I get to it?” but rather “will I get to it?” and the answer starts to become “no” most of the time. Feedback failure. This blog currently receives about 3K unique visitors per day from

6 0.12873074 315 hunch net-2008-09-03-Bidding Problems

7 0.12283596 325 hunch net-2008-11-10-ICML Reviewing Criteria

8 0.12231367 395 hunch net-2010-04-26-Compassionate Reviewing

9 0.12022533 134 hunch net-2005-12-01-The Webscience Future

10 0.11395882 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

11 0.11373245 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

12 0.11227712 116 hunch net-2005-09-30-Research in conferences

13 0.11108631 454 hunch net-2012-01-30-ICML Posters and Scope

14 0.11034626 343 hunch net-2009-02-18-Decision by Vetocracy

15 0.10877401 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

16 0.10760997 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

17 0.10730885 208 hunch net-2006-09-18-What is missing for online collaborative research?

18 0.10681758 22 hunch net-2005-02-18-What it means to do research.

19 0.098803714 465 hunch net-2012-05-12-ICML accepted papers and early registration

20 0.097417019 297 hunch net-2008-04-22-Taking the next step
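
The page does not say how the simValue column above is computed. A minimal sketch, assuming it is a cosine similarity between sparse tf-idf word-weight vectors (a common choice, but an assumption here): the function name `cosine_similarity` and the weights in `other_post` are invented for illustration, and only the four `why_papers` weights come from the word list on this page.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse word-weight vectors."""
    # The dot product only involves words present in both vectors.
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an empty vector as similar to nothing
    return dot / (norm_a * norm_b)


# First four weights from the topN-words list for "Why Papers?".
why_papers = {"papers": 0.402, "writing": 0.263, "excuse": 0.222, "forums": 0.222}
# Hypothetical weights for some other post (not taken from this page).
other_post = {"papers": 0.35, "citations": 0.30, "writing": 0.10}

print(cosine_similarity(why_papers, why_papers))
print(cosine_similarity(why_papers, other_post))
```

Under this assumption a post compared with itself scores approximately 1, which would be consistent with the same-blog row in the tfidf list; the lsi and lda lists evidently use a different representation, since their same-blog values are below 1.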


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.208), (1, -0.122), (2, 0.094), (3, 0.113), (4, 0.011), (5, 0.023), (6, 0.009), (7, -0.026), (8, 0.026), (9, 0.032), (10, 0.028), (11, 0.01), (12, -0.089), (13, 0.005), (14, 0.066), (15, -0.011), (16, -0.06), (17, 0.067), (18, 0.024), (19, 0.004), (20, 0.001), (21, 0.045), (22, -0.065), (23, -0.081), (24, 0.038), (25, -0.025), (26, 0.005), (27, 0.051), (28, -0.092), (29, -0.158), (30, 0.027), (31, 0.076), (32, 0.075), (33, -0.055), (34, 0.074), (35, -0.038), (36, 0.049), (37, 0.04), (38, 0.022), (39, 0.087), (40, 0.075), (41, 0.07), (42, 0.014), (43, 0.005), (44, -0.012), (45, 0.018), (46, -0.012), (47, -0.002), (48, 0.042), (49, 0.01)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98802668 30 hunch net-2005-02-25-Why Papers?

2 0.90728867 233 hunch net-2007-02-16-The Forgetting

3 0.76989836 288 hunch net-2008-02-10-Complexity Illness

Introduction: One of the enduring stereotypes of academia is that people spend a great deal of intelligence, time, and effort finding complexity rather than simplicity. This is at least anecdotally true in my experience. Math++: Several people have found that adding useless math makes their paper more publishable as evidenced by a reject-add-accept sequence. 8 page minimum: Who submitted a paper to ICML violating the 8 page minimum? Every author fears that the reviewers won’t take their work seriously unless the allowed length is fully used. The best minimum violation I know is Adam’s paper at SODA on generating random factored numbers, but this is deeply exceptional. It’s a fair bet that 90% of papers submitted are exactly at the page limit. We could imagine that this is because papers naturally take more space, but few people seem to be clamoring for more space. Journalong: Has anyone been asked to review a 100 page journal paper? I have. Journal papers can be nice, becaus

4 0.73756856 98 hunch net-2005-07-27-Not goal metrics

5 0.70741564 134 hunch net-2005-12-01-The Webscience Future

Introduction: The internet has significantly affected the way we do research but its capabilities have not yet been fully realized. First, let’s acknowledge some known effects. Self-publishing: By default, all researchers in machine learning (and more generally computer science and physics) place their papers online for anyone to download. The exact mechanism differs: physicists tend to use a central repository (Arxiv) while computer scientists tend to place the papers on their webpage. Arxiv has been slowly growing in subject breadth so it is now sometimes used by computer scientists. Collaboration: Email has enabled working remotely with coauthors. This has allowed collaborations which would not otherwise have been possible and generally speeds research. Now, let’s look at attempts to go further. Blogs (like this one) allow public discussion about topics which are not easily categorized as “a new idea in machine learning” (like this topic). Organization of some subfield

6 0.66653341 1 hunch net-2005-01-19-Why I decided to run a weblog.

7 0.66565013 208 hunch net-2006-09-18-What is missing for online collaborative research?

8 0.65494168 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

9 0.64878708 52 hunch net-2005-04-04-Grounds for Rejection

10 0.61495602 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

11 0.61298537 318 hunch net-2008-09-26-The SODA Program Committee

12 0.60297 325 hunch net-2008-11-10-ICML Reviewing Criteria

13 0.59151161 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three

14 0.58717942 231 hunch net-2007-02-10-Best Practices for Collaboration

15 0.58717942 315 hunch net-2008-09-03-Bidding Problems

16 0.58566737 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

17 0.56716806 363 hunch net-2009-07-09-The Machine Learning Forum

18 0.55261964 188 hunch net-2006-06-30-ICML papers

19 0.54919708 454 hunch net-2012-01-30-ICML Posters and Scope

20 0.54745793 51 hunch net-2005-04-01-The Producer-Consumer Model of Research


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.135), (38, 0.016), (53, 0.057), (55, 0.088), (95, 0.589)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99314058 479 hunch net-2013-01-31-Remote large scale learning class participation

Introduction: Yann and I have arranged so that people who are interested in our large scale machine learning class and not able to attend in person can follow along via two methods. Videos will be posted with about a 1 day delay on techtalks. This is a side-by-side capture of video+slides from Weyond. We are experimenting with Piazza as a discussion forum. Anyone is welcome to subscribe to Piazza and ask questions there, where I will be monitoring things. update2: Sign up here. The first lecture is up now, including the revised version of the slides which fixes a few typos and rounds out references.

2 0.9758876 390 hunch net-2010-03-12-Netflix Challenge 2 Canceled

Introduction: The second Netflix prize is canceled due to privacy problems. I continue to believe my original assessment of this paper, that the privacy break was somewhat overstated. I still haven’t seen any serious privacy failures on the scale of the AOL search log release. I expect privacy concerns to continue to be a big issue when dealing with data releases by companies or governments. The theory of maintaining privacy while using data is improving, but it is not yet in a state where the limits of what’s possible are clear, let alone how to achieve these limits in a manner friendly to a prediction competition.

3 0.97275066 319 hunch net-2008-10-01-NIPS 2008 workshop on ‘Learning over Empirical Hypothesis Spaces’

Introduction: This workshop asks for insights into how far we may/can push the theoretical boundary of using data in the design of learning machines. Can we express our classification rule in terms of the sample, or do we have to stick to a core assumption of classical statistical learning theory, namely that the hypothesis space is to be defined independently of the sample? This workshop is particularly interested in – but not restricted to – the ‘luckiness framework’ and the recently introduced notion of ‘compatibility functions’ in a semi-supervised learning context (more information can be found at http://www.kuleuven.be/wehys).

same-blog 4 0.94485271 30 hunch net-2005-02-25-Why Papers?

5 0.92339861 389 hunch net-2010-02-26-Yahoo! ML events

Introduction: Yahoo! is sponsoring two machine learning events that might interest people. The Key Scientific Challenges program (due March 5) for Machine Learning and Statistics offers $5K (plus bonuses) for graduate students working on a core problem of interest to Y! If you are already working on one of these problems, there is no reason not to submit, and if you aren’t you might want to think about it for next year, as I am confident they all press the boundary of the possible in Machine Learning. There are 7 days left. The Learning to Rank challenge (due May 31) offers an $8K first prize for the best ranking algorithm on a real (and really used) dataset for search ranking, with presentations at an ICML workshop. Unlike the Netflix competition, there are prizes for 2nd, 3rd, and 4th place, perhaps avoiding the heartbreak the ensemble encountered. If you think you know how to rank, you should give it a try, and we might all learn something. There are 3 months left.

6 0.88729656 456 hunch net-2012-02-24-ICML+50%

7 0.8597315 127 hunch net-2005-11-02-Progress in Active Learning

8 0.80457515 344 hunch net-2009-02-22-Effective Research Funding

9 0.77493864 373 hunch net-2009-10-03-Static vs. Dynamic multiclass prediction

10 0.75574267 462 hunch net-2012-04-20-Both new: STOC workshops and NEML

11 0.70893562 234 hunch net-2007-02-22-Create Your Own ICML Workshop

12 0.68574631 105 hunch net-2005-08-23-(Dis)similarities between academia and open source programmers

13 0.55938345 7 hunch net-2005-01-31-Watchword: Assumption

14 0.53054667 464 hunch net-2012-05-03-Microsoft Research, New York City

15 0.52504575 455 hunch net-2012-02-20-Berkeley Streaming Data Workshop

16 0.52416396 275 hunch net-2007-11-29-The Netflix Crack

17 0.52116621 466 hunch net-2012-06-05-ICML acceptance statistics

18 0.5199163 445 hunch net-2011-09-28-Somebody’s Eating Your Lunch

19 0.51104146 290 hunch net-2008-02-27-The Stats Handicap

20 0.50839514 36 hunch net-2005-03-05-Funding Research