hunch_net-2005-42: knowledge graph by maker-knowledge-mining

42 hunch net-2005-03-17-Going all the Way, Sometimes


meta info for this blog

Source: html

Introduction: At many points in research, you face a choice: should I keep on improving some old piece of technology or should I do something new? For example: Should I refine bounds to make them tighter? Should I take some learning theory and turn it into a learning algorithm? Should I implement the learning algorithm? Should I test the learning algorithm widely? Should I release the algorithm as source code? Should I go see what problems people actually need to solve? The universal temptation of people attracted to research is doing something new. That is sometimes the right decision, but is also often not. I’d like to discuss some reasons why not. Expertise Once expertise is developed on some subject, you are the right person to refine it. What is the real problem? Continually improving a piece of technology is a mechanism forcing you to confront this question. In many cases, this confrontation is uncomfortable because you discover that your method has fundamental flaws with respect to solving the real problem…


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 At many points in research, you face a choice: should I keep on improving some old piece of technology or should I do something new? [sent-1, score-0.635]

2 For example: Should I refine bounds to make them tighter? [sent-2, score-0.239]

3 Should I go see what problems people actually need to solve? [sent-7, score-0.099]

4 The universal temptation of people attracted to research is doing something new. [sent-8, score-0.27]

5 That is sometimes the right decision, but is also often not. [sent-9, score-0.081]

6 Expertise Once expertise is developed on some subject, you are the right person to refine it. [sent-11, score-0.432]

7 Continually improving a piece of technology is a mechanism forcing you to confront this question. [sent-13, score-0.698]

8 In many cases, this confrontation is uncomfortable because you discover that your method has fundamental flaws with respect to solving the real problem. [sent-14, score-0.641]

9 Not going all the way means you never discover this, except the hard way—people lose interest in your work. [sent-15, score-0.929]

10 Virtues of breadth When you go all the way, you gain breadth, with a deeper understanding about which problems are important and why. [sent-16, score-0.505]

11 This can be invaluable in focusing your future research. [sent-17, score-0.237]

12 More Tangible Accomplishment Going all the way means that you can point to your peers and say “I solved it”. [sent-18, score-0.256]

13 Going all the way is sometimes problematic in research. [sent-19, score-0.344]

14 For example, a paper with a theory, an algorithm, and experimental results invites defeat-in-detail: a reviewer can disagree with any one of these components and eliminate it from consideration. [sent-20, score-0.417]

15 Another issue is that academia doesn’t directly reward implementing and releasing algorithms. [sent-21, score-0.401]

16 A third issue is that you will almost certainly discover topics of interest which don’t fit your home conference(s). [sent-22, score-0.734]

17 It is also very difficult to publish a paper with the title “an incremental improvement on X (which makes it work great in practice)”. [sent-23, score-0.284]

18 Along with this advice, it is important to remember to fail fast, where appropriate. [sent-24, score-0.08]

19 When you discover that an idea is not workable, quickly quitting it and moving on is a real virtue. [sent-25, score-0.535]
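
The [sent-N, score] annotations above pair each extracted sentence with its position in the post and its tfidf score. The mining tool's exact scoring scheme isn't documented here; a minimal sketch of this style of extractive summarization, assuming scikit-learn and hypothetical names (post for this post's text, corpus for all post texts), might look like:

```python
# Plausible reconstruction of tfidf sentence extraction (assumed pipeline;
# the mining tool's actual scoring scheme is not specified in this dump).
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def top_sentences(post, corpus, k=5):
    """Rank sentences of `post` by summed tfidf weight of their terms."""
    vec = TfidfVectorizer(stop_words="english")
    vec.fit(corpus)                           # idf statistics from the whole blog
    sents = re.split(r"(?<=[.?!])\s+", post)  # naive sentence splitter
    scores = np.asarray(vec.transform(sents).sum(axis=1)).ravel()
    top = sorted(np.argsort(scores)[-k:])     # keep document order, as in the list above
    return [(i + 1, sents[i], round(float(scores[i]), 3)) for i in top]
```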


similar blogs computed by the tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('discover', 0.322), ('refine', 0.239), ('expertise', 0.193), ('breadth', 0.188), ('going', 0.167), ('piece', 0.164), ('way', 0.163), ('improving', 0.148), ('algorithm', 0.145), ('technology', 0.144), ('confront', 0.129), ('deeper', 0.129), ('invaluable', 0.129), ('invites', 0.129), ('issue', 0.127), ('real', 0.113), ('forcing', 0.113), ('focusing', 0.108), ('home', 0.108), ('uncomfortable', 0.103), ('flaws', 0.103), ('releasing', 0.103), ('problematic', 0.1), ('face', 0.1), ('incremental', 0.1), ('moving', 0.1), ('publish', 0.1), ('temptation', 0.1), ('go', 0.099), ('eliminate', 0.097), ('components', 0.097), ('interest', 0.097), ('disagree', 0.094), ('tighter', 0.094), ('means', 0.093), ('universal', 0.091), ('implement', 0.091), ('implementing', 0.091), ('gain', 0.089), ('release', 0.089), ('lose', 0.087), ('title', 0.084), ('sometimes', 0.081), ('reward', 0.08), ('along', 0.08), ('remember', 0.08), ('third', 0.08), ('something', 0.079), ('advice', 0.079), ('widely', 0.079)]
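
Both the word weights above and the simValue rankings below fall out of a single tfidf matrix: each wordTfidf entry is this post's weight for that term, and the similarities between posts are presumably cosine similarities between their tfidf rows. A minimal sketch under those assumptions, with hypothetical names posts (all post texts) and query_idx (this post's index):

```python
# Assumed reconstruction: one tfidf vector per post, cosine similarity between them.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(posts)                       # (n_posts, n_terms), rows L2-normalized

terms = vec.get_feature_names_out()
row = X[query_idx].toarray().ravel()
top = np.argsort(row)[::-1][:50]                   # the 50 wordName/wordTfidf pairs above
top_words = [(terms[j], round(float(row[j]), 3)) for j in top]

sims = cosine_similarity(X[query_idx], X).ravel()  # the simValue column below
ranking = np.argsort(sims)[::-1]                   # the query post itself ranks first
```

A same-blog simValue of 1.0000001 below is ordinary floating-point noise around an exact self-similarity of 1.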

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 42 hunch net-2005-03-17-Going all the Way, Sometimes


2 0.14160083 332 hunch net-2008-12-23-Use of Learning Theory

Introduction: I’ve had serious conversations with several people who believe that the theory in machine learning is “only useful for getting papers published”. That’s a compelling statement, as I’ve seen many papers where the algorithm clearly came first, and the theoretical justification for it came second, purely as a perceived means to improve the chance of publication. Naturally, I disagree and believe that learning theory has much more substantial applications. Even in core learning algorithm design, I’ve found learning theory to be useful, although its application is more subtle than many realize. The most straightforward applications can fail, because (as expectation suggests) worst case bounds tend to be loose in practice (*). In my experience, considering learning theory when designing an algorithm has two important effects in practice: It can help make your algorithm behave right at a crude level of analysis, leaving finer details to tuning or common sense. The best example…

3 0.12250744 22 hunch net-2005-02-18-What it means to do research.

Introduction: I want to try to describe what doing research means, especially from the point of view of an undergraduate. The shift from a class-taking mentality to a research mentality is very significant and not easy. Problem Posing Posing the right problem is often as important as solving it. Many people can get by in research by solving problems others have posed, but that’s not sufficient for really inspiring research. For learning in particular, there is a strong feeling that we just haven’t figured out which questions are the right ones to ask. You can see this, because the answers we have do not seem convincing. Gambling your life When you do research, you think very hard about new ways of solving problems, new problems, and new solutions. Many conversations are of the form “I wonder what would happen if…” These processes can be short (days or weeks) or years-long endeavours. The worst part is that you’ll only know if you were successful at the end of the process (and some…

4 0.11745714 343 hunch net-2009-02-18-Decision by Vetocracy

Introduction: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. This is most easily seen by comparison: Paper Banditron Offset Tree Notes Problem Scope Multiclass problems where only the loss of one choice can be probed. Strictly greater: Cost sensitive multiclass problems where only the loss of one choice can be probed. Often generalizations don’t matter. That’s not the case here, since every plausible application I’ve thought of involves loss functions substantially different from 0/1. What’s new Analysis and Experiments Algorithm, Analysis, and Experiments As far as I know, the essence of the more general problem was first stated and analyzed with the EXP4 algorithm (page 16) (1998). It’s also the time horizon 1 simplification of the Reinforcement Learning setting for the random trajectory method (page 15) (2002). The Banditron algorithm itself is functionally identical…

5 0.11687458 454 hunch net-2012-01-30-ICML Posters and Scope

Introduction: Normally, I don’t indulge in posters for ICML, but this year is naturally an exception for me. If you want one, there are a small number left here, if you sign up before February. It also seems worthwhile to give some sense of the scope and reviewing criteria for ICML for authors considering submitting papers. At ICML, the (very large) program committee does the reviewing which informs final decisions by area chairs on most papers. Program chairs set up the process, deal with exceptions or disagreements, and provide advice for the reviewing process. Providing advice is tricky (and easily misleading) because a conference is a community, and in the end the aggregate interests of the community determine the conference. Nevertheless, as a program chair this year it seems worthwhile to state the overall philosophy I have and what I plan to encourage (and occasionally discourage). At the highest level, I believe ICML exists to further research into machine learning, which I gene…

6 0.11235071 420 hunch net-2010-12-26-NIPS 2010

7 0.11128455 110 hunch net-2005-09-10-“Failure” is an option

8 0.11099737 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

9 0.10544439 146 hunch net-2006-01-06-MLTV

10 0.10419772 95 hunch net-2005-07-14-What Learning Theory might do

11 0.10255519 235 hunch net-2007-03-03-All Models of Learning have Flaws

12 0.10186768 233 hunch net-2007-02-16-The Forgetting

13 0.099609576 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project

14 0.098544911 358 hunch net-2009-06-01-Multitask Poisoning

15 0.098224103 35 hunch net-2005-03-04-The Big O and Constants in Learning

16 0.098078035 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

17 0.096937351 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

18 0.095327333 347 hunch net-2009-03-26-Machine Learning is too easy

19 0.094715185 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

20 0.094475247 44 hunch net-2005-03-21-Research Styles in Machine Learning


similar blogs computed by the lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.243), (1, -0.012), (2, -0.012), (3, 0.094), (4, -0.023), (5, 0.0), (6, 0.028), (7, 0.004), (8, 0.035), (9, 0.056), (10, 0.004), (11, 0.046), (12, 0.062), (13, 0.022), (14, 0.022), (15, -0.023), (16, 0.095), (17, 0.021), (18, -0.022), (19, -0.043), (20, -0.012), (21, 0.043), (22, -0.057), (23, -0.026), (24, -0.028), (25, -0.047), (26, -0.043), (27, -0.003), (28, -0.046), (29, -0.091), (30, -0.062), (31, 0.042), (32, 0.006), (33, 0.039), (34, -0.018), (35, -0.064), (36, 0.04), (37, -0.043), (38, 0.032), (39, 0.067), (40, -0.007), (41, -0.019), (42, -0.014), (43, -0.03), (44, 0.141), (45, 0.08), (46, -0.079), (47, -0.058), (48, 0.023), (49, -0.005)]
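
The fifty topicId/topicWeight pairs above (topics 0 through 49) are this post's coordinates after latent semantic indexing, i.e. a low-rank factorization of the tfidf matrix, and the similarities below are then computed between posts in that 50-dimensional topic space. A minimal sketch, assuming scikit-learn's TruncatedSVD over the matrix X from the previous sketch:

```python
# Assumed reconstruction of the LSI similarities: truncated SVD of the tfidf
# matrix, then cosine similarity between the resulting topic-weight rows.
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

svd = TruncatedSVD(n_components=50, random_state=0)  # 50 topics, matching ids 0-49 above
Z = svd.fit_transform(X)                             # (n_posts, 50) dense topic weights

lsi_sims = cosine_similarity(Z[query_idx:query_idx + 1], Z).ravel()
```

That the same-blog value below is 0.971 rather than exactly 1 suggests the pipeline's similarity isn't a plain cosine in this space; quantized vectors or a slightly different normalization are both plausible.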

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97131521 42 hunch net-2005-03-17-Going all the Way, Sometimes


2 0.71923465 22 hunch net-2005-02-18-What it means to do research.


3 0.707008 98 hunch net-2005-07-27-Not goal metrics

Introduction: One of the confusing things about research is that progress is very hard to measure. One of the consequences of being in a hard-to-measure environment is that the wrong things are often measured. Lines of Code The classical example of this phenomenon is the old lines-of-code-produced metric for programming. It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. Paper count In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. Complexity Another metric is “complexity” (in the eye of a reviewer) of a paper. There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. The failure mode here is unclean thinking. Simple effective m…

4 0.70405847 231 hunch net-2007-02-10-Best Practices for Collaboration

Introduction: Many people, especially students, haven’t had an opportunity to collaborate with other researchers. Collaboration, especially with remote people, can be tricky. Here are some observations of what has worked for me on collaborations involving a few people. Travel and Discuss Almost all collaborations start with in-person discussion. This implies that travel is often necessary. We can hope that in the future we’ll have better systems for starting collaborations remotely (such as blogs), but we aren’t quite there yet. Enable your collaborator. A collaboration can fall apart because one collaborator disables another. This sounds stupid (and it is), but it’s far easier than you might think. Avoid Duplication. Discovering that you and a collaborator have been editing the same thing and now need to waste time reconciling changes is annoying. The best way to avoid this is to be explicit about who has write permission to what. Most of the time, a write lock is held for the e…

5 0.66856676 202 hunch net-2006-08-10-Precision is not accuracy

Introduction: In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. Partly, this is just a statement about where we are with respect to machine learning. In particular, we have no mechanism capable of generating a prescription for how to solve all learning problems. In the absence of such certainty, people try to come up with formalisms that partially describe and motivate how and why they do things. This is natural and healthy—we might hope that it will eventually lead to just such a mechanism. But, part of this is simply an emphasis on complexity over clarity. A very natural and simple theoretical statement is often obscured by complexifications. Common sources of complexification include:…

6 0.66840613 454 hunch net-2012-01-30-ICML Posters and Scope

7 0.66580886 162 hunch net-2006-03-09-Use of Notation

8 0.6519568 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

9 0.65153855 177 hunch net-2006-05-05-An ICML reject

10 0.64942199 351 hunch net-2009-05-02-Wielding a New Abstraction

11 0.64873546 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer

12 0.64148962 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

13 0.63944525 52 hunch net-2005-04-04-Grounds for Rejection

14 0.61893463 44 hunch net-2005-03-21-Research Styles in Machine Learning

15 0.6173349 358 hunch net-2009-06-01-Multitask Poisoning

16 0.60698223 332 hunch net-2008-12-23-Use of Learning Theory

17 0.60124862 73 hunch net-2005-05-17-A Short Guide to PhD Graduate Study

18 0.59712058 1 hunch net-2005-01-19-Why I decided to run a weblog.

19 0.59678185 31 hunch net-2005-02-26-Problem: Reductions and Relative Ranking Metrics

20 0.59345555 347 hunch net-2009-03-26-Machine Learning is too easy


similar blogs computed by the lda model

lda for this blog:

topicId topicWeight

[(10, 0.014), (27, 0.138), (53, 0.023), (55, 0.027), (94, 0.685), (95, 0.025)]
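
The lda weights above are a per-post topic distribution: topic 94 dominates at 0.685, and only topics above a small threshold are listed, which is why the shown weights sum to less than 1. A minimal sketch, assuming scikit-learn's LatentDirichletAllocation with a hypothetical 100 topics (the ids above run up to 95), again reusing the hypothetical posts and query_idx:

```python
# Assumed reconstruction of the LDA similarities: LDA is fit on raw term
# counts (it models counts, not tfidf), then topic distributions are compared.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.metrics.pairwise import cosine_similarity

counts = CountVectorizer(stop_words="english").fit_transform(posts)
lda = LatentDirichletAllocation(n_components=100, random_state=0)  # 100 topics assumed
theta = lda.fit_transform(counts)        # (n_posts, 100); each row sums to 1

lda_sims = cosine_similarity(theta[query_idx:query_idx + 1], theta).ravel()
```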

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98042154 42 hunch net-2005-03-17-Going all the Way, Sometimes


2 0.95044875 35 hunch net-2005-03-04-The Big O and Constants in Learning

Introduction: The notation g(n) = O(f(n)) means that in the limit as n approaches infinity there exists a constant C such that g(n) is less than Cf(n). In learning theory, there are many statements about learning algorithms of the form “under assumptions x, y, and z, the classifier learned has an error rate of at most O(f(m))”. There is one very good reason to use O(): it helps you understand the big picture and neglect the minor details which are not important in the big picture. However, there are some important reasons not to do this as well. Unspeedup In algorithm analysis, the use of O() for time complexity is pervasive and well-justified. Determining the exact value of C is inherently computer architecture dependent. (The “C” for x86 processors might differ from the “C” on PowerPC processors.) Since many learning theorists come from a CS theory background, the O() notation is applied to generalization error. The O() abstraction breaks here—you can not genera…

3 0.94800794 81 hunch net-2005-06-13-Wikis for Summer Schools and Workshops

Introduction: Chicago ’05 ended a couple of weeks ago. This was the sixth Machine Learning Summer School, and the second one that used a wiki. (The first was Berder ’04, thanks to Gunnar Raetsch.) Wikis are relatively easy to set up, greatly aid social interaction, and should be used a lot more at summer schools and workshops. They can even be used as the meeting’s webpage, as a permanent record of its participants’ collaborations — see for example the wiki/website for last year’s NVO Summer School. A basic wiki is a collection of editable webpages, maintained by software called a wiki engine. The engine used at both Berder and Chicago was TikiWiki — it is well documented and gets you something running fast. It uses PHP and MySQL, but doesn’t require you to know either. TikiWiki has far more features than most wikis, as it is really a full Content Management System. (My thanks to Sebastian Stark for pointing this out.) Here are the features we found most useful: Bulletin boards…

4 0.94346821 115 hunch net-2005-09-26-Prediction Bounds as the Mathematics of Science

Introduction: “Science” has many meanings, but one common meaning is “the scientific method”, which is a principled method for investigating the world using the following steps: Form a hypothesis about the world. Use the hypothesis to make predictions. Run experiments to confirm or disprove the predictions. The ordering of these steps is very important to the scientific method. In particular, predictions must be made before experiments are run. Given that we all believe in the scientific method of investigation, it may be surprising to learn that cheating is very common. This happens for many reasons, some innocent and some not. Drug studies. Pharmaceutical companies make predictions about the effects of their drugs and then conduct blind clinical studies to determine their effect. Unfortunately, they have also been caught using some of the more advanced techniques for cheating here, including “reprobleming”, “data set selection”, and probably “overfitting by review”…

5 0.94219404 346 hunch net-2009-03-18-Parallel ML primitives

Introduction: Previously, we discussed parallel machine learning a bit. As parallel ML is rather difficult, I’d like to describe my thinking at the moment, and ask for advice from the rest of the world. This is particularly relevant right now, as I’m attending a workshop tomorrow on parallel ML. Parallelizing slow algorithms seems uncompelling. Parallelizing many algorithms also seems uncompelling, because the effort required to parallelize is substantial. This leaves the question: Which one fast algorithm is the best to parallelize? What is a substantially different second? One compellingly fast simple algorithm is online gradient descent on a linear representation. This is the core of Leon’s sgd code and Vowpal Wabbit. Antoine Bordes showed a variant was competitive in the large scale learning challenge. It’s also a decades old primitive which has been reused in many algorithms, and continues to be reused. It also applies to online learning rather than just online optimization…

6 0.88015109 120 hunch net-2005-10-10-Predictive Search is Coming

7 0.83580804 276 hunch net-2007-12-10-Learning Track of International Planning Competition

8 0.7781294 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

9 0.70993155 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

10 0.7029382 229 hunch net-2007-01-26-Parallel Machine Learning Problems

11 0.69260615 441 hunch net-2011-08-15-Vowpal Wabbit 6.0

12 0.65477854 13 hunch net-2005-02-04-JMLG

13 0.64129764 408 hunch net-2010-08-24-Alex Smola starts a blog

14 0.63697618 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

15 0.63581908 73 hunch net-2005-05-17-A Short Guide to PhD Graduate Study

16 0.62894839 471 hunch net-2012-08-24-Patterns for research in machine learning

17 0.62375015 200 hunch net-2006-08-03-AOL’s data drop

18 0.61581153 178 hunch net-2006-05-08-Big machine learning

19 0.61361545 253 hunch net-2007-07-06-Idempotent-capable Predictors

20 0.60780269 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning