hunch_net hunch_net-2006 hunch_net-2006-202 knowledge-graph by maker-knowledge-mining
Introduction: In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. Partly, this is just a statement about where we are with respect to machine learning. In particular, we have no mechanism capable of generating a prescription for how to solve all learning problems. In the absence of such certainty, people try to come up with formalisms that partially describe and motivate how and why they do things. This is natural and healthy—we might hope that it will eventually lead to just such a mechanism. But, part of this is simply an emphasis on complexity over clarity. A very natural and simple theoretical statement is often obscured by complexifications. Common sources of complexification include:
sentIndex sentText sentNum sentScore
1 In my experience, there are two different groups of people who believe the same thing: the mathematics encountered in typical machine learning conference papers is often of questionable value. [sent-1, score-0.442]
2 The two groups who agree on this are applied machine learning people who have given up on math, and mature theoreticians who understand the limits of theory. [sent-2, score-0.322]
3 Partly, this is just a statement about where we are with respect to machine learning. [sent-3, score-0.197]
4 A very natural and simple theoretical statement is often obscured by complexifications. [sent-8, score-0.226]
5 Common sources of complexification include: Generalization: When you try to make a statement that applies in the most general possible setting, your theorem becomes excessively hard to read. [sent-9, score-0.63]
6 Specialization: Your theorem relies upon so many assumptions that it is hard for a simple reader to hold them all in their head. [sent-10, score-0.674]
7 Obscuration: Your theorem relies upon cumbersome notation full of subsubsuperscripts, badly named variables, etc. There are several reasons why complexification occurs. [sent-11, score-0.893]
8 Excessive generalization often happens when authors have an idea and want to completely exploit it. [sent-12, score-0.432]
9 Excessive specialization often happens when authors have some algorithm they really want to prove works. [sent-14, score-0.394]
10 Some of the worst obscurations come from using an old standard notation which has simply been pushed too far. [sent-17, score-0.348]
11 After doing research for a while, you realize that these complexifications are counterproductive. [sent-18, score-0.362]
12 Type (1) complexifications make it doubly hard for others to do follow-on work: your paper is hard to read and you have eliminated the possibility. [sent-19, score-0.7]
13 Type (2) complexifications look like “the tail wags the dog”—the math isn’t really working until it guides the algorithm design. [sent-20, score-0.78]
14 The worst reason, I’ve saved for last: it’s that the reviewing process emphasizes precision over accuracy. [sent-24, score-0.428]
15 Imagine shooting a math gun at a machine learning target. [sent-25, score-0.692]
16 A high precision math gun will very carefully guide the bullets to strike a fixed location—even though the location may have little to do with the target. [sent-26, score-0.89]
17 An accurate math gun will point at the correct target. [sent-27, score-0.653]
18 A precision/accuracy tradeoff is often encountered: we don’t know how to think about the actual machine learning problem, so instead we very precisely think about another not-quite-right problem. [sent-28, score-0.301]
19 A reviewer almost invariably prefers the more precise (but less accurate) paper because precision is the easy thing to check and think about. [sent-29, score-0.466]
20 The hard fix for this is more time spent by everyone thinking about what the real machine learning problems are. [sent-31, score-0.302]
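The gun analogy in sentences 15 to 18 can be made concrete with a small simulation. The sketch below is only an illustration and is not taken from the original post: it assumes a one-dimensional target at 0 and Gaussian scatter, and the names (`shoot`, `bias`, `scatter`) as well as the aim points and spreads are hypothetical choices.

```python
import random
import statistics


def shoot(aim, spread, shots, rng):
    """Return `shots` one-dimensional bullet positions scattered around `aim`."""
    return [rng.gauss(aim, spread) for _ in range(shots)]


def bias(positions, target):
    """Inaccuracy: distance between the average hit and the target."""
    return abs(statistics.fmean(positions) - target)


def scatter(positions):
    """Imprecision: standard deviation of the hits around their own mean."""
    return statistics.pstdev(positions)


def mean_squared_error(positions, target):
    """Overall miss: average squared distance from the target."""
    return statistics.fmean((p - target) ** 2 for p in positions)


rng = random.Random(0)  # fixed seed so the run is repeatable
target = 0.0

# Assumed setup: the precise gun is tightly grouped but aimed at the wrong spot;
# the accurate gun is aimed at the target but scatters more.
precise_gun = shoot(aim=5.0, spread=0.1, shots=1000, rng=rng)
accurate_gun = shoot(aim=0.0, spread=1.0, shots=1000, rng=rng)

for name, hits in (("precise", precise_gun), ("accurate", accurate_gun)):
    print(name,
          "scatter:", round(scatter(hits), 3),
          "bias:", round(bias(hits, target), 3),
          "mse:", round(mean_squared_error(hits, target), 3))
```

Under these assumptions the tightly grouped gun has the smaller scatter yet the larger overall miss, which is the tradeoff the post describes: scatter is easy to check, while bias requires knowing where the real target is.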
wordName wordTfidf (topN-words)
[('complexifications', 0.362), ('gun', 0.271), ('math', 0.271), ('complexification', 0.181), ('obscuration', 0.181), ('relies', 0.181), ('precision', 0.18), ('notation', 0.18), ('type', 0.15), ('excessive', 0.14), ('specialization', 0.14), ('hard', 0.132), ('statement', 0.127), ('theorem', 0.116), ('accurate', 0.111), ('encountered', 0.108), ('generalization', 0.108), ('fix', 0.1), ('often', 0.099), ('worst', 0.098), ('groups', 0.098), ('location', 0.098), ('assumptions', 0.086), ('upon', 0.085), ('cumbersome', 0.08), ('formalisms', 0.08), ('prescription', 0.08), ('shooting', 0.08), ('invariably', 0.08), ('saved', 0.08), ('tail', 0.08), ('theoreticians', 0.08), ('authors', 0.08), ('happens', 0.075), ('prefers', 0.074), ('excessively', 0.074), ('eliminated', 0.074), ('mature', 0.074), ('reader', 0.074), ('exploit', 0.07), ('emphasizes', 0.07), ('pushed', 0.07), ('strike', 0.07), ('named', 0.07), ('machine', 0.07), ('guides', 0.067), ('questionable', 0.067), ('think', 0.066), ('thing', 0.066), ('motivate', 0.062)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999958 202 hunch net-2006-08-10-Precision is not accuracy
2 0.18650302 162 hunch net-2006-03-09-Use of Notation
Introduction: For most people, a mathematical notation is like a language: you learn it and stick with it. For people doing mathematical research, however, this is not enough: they must design new notations for new problems. The design of good notation is both hard and worthwhile since a bad initial notation can retard a line of research greatly. Before we had mathematical notation, equations were all written out in language. Since words have multiple meanings and variable precedences, long equations written out in language can be extraordinarily difficult and sometimes fundamentally ambiguous. A good representative example of this is the legalese in the tax code. Since we want greater precision and clarity, we adopt mathematical notation. One fundamental thing to understand about mathematical notation is that humans, as logic verifiers, are barely capable. This is the fundamental reason why one notation can be much better than another. This observation is easier to miss than you might
3 0.12413627 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions
Introduction: A type of prediction problem is specified by the type of samples produced by a data source (Example: $X \times \{0,1\}$, $X \times [0,1]$, $X \times \{1,2,3,4,5\}$, etc.) and a loss function (0/1 loss, squared error loss, cost sensitive losses, etc.). For simplicity, we’ll assume that all losses have a minimum of zero. For this post, we can think of a learning reduction as: a mapping $R$ from samples of one type $T$ (like multiclass classification) to another type $T'$ (like binary classification), and a mapping $Q$ from predictors for type $T'$ to predictors for type $T$. The simplest sort of learning reduction is a “loss reduction”. The idea in a loss reduction is to prove a statement of the form: Theorem: For all base predictors $b$, for all distributions $D$ over examples of type $T$: $E_{(x,y) \sim D}\, L_T(y, Q(b,x)) \le f\left(E_{(x',y') \sim R(D)}\, L_{T'}(y', b(x'))\right)$. Here $L_T$ is the loss for the type $T$ problem and $L_{T'}$ is the loss for the type $T'$ problem. Also, $R(D)$ is the distribution ov
4 0.12403981 70 hunch net-2005-05-12-Math on the Web
Introduction: Andrej Bauer has set up a Mathematics and Computation Blog. As a first step he has tried to address the persistent and annoying problem of math on the web. As a basic tool for precisely stating and transferring understanding of technical subjects, mathematics is very necessary. Despite this necessity, every mechanism for expressing mathematics on the web seems unnaturally clumsy. Here are some of the methods and their drawbacks: MathML: This was supposed to be the answer, but it has two severe drawbacks: “Internet Explorer” doesn’t read it and the language is an example of push-XML-to-the-limit which no one would ever consider writing in. (In contrast, HTML is easy to write in.) It’s also very annoying that math fonts must be installed independent of the browser, even for Mozilla-based browsers. Create inline images: This has several big drawbacks: font size is fixed for all viewers, you can’t cut & paste inside the images, and you can’t hyperlink from (say) symbol to de
5 0.11806416 83 hunch net-2005-06-18-Lower Bounds for Learning Reductions
Introduction: Learning reductions transform a solver of one type of learning problem into a solver of another type of learning problem. When we analyze these for robustness we can make statements of the form “Reduction R has the property that regret r (or loss) on subproblems of type A implies regret at most f(r) on the original problem of type B”. A lower bound for a learning reduction would have the form “for all reductions R, there exists a learning problem of type B and learning algorithm for problems of type A where regret r on induced problems implies at least regret f(r) for B”. The pursuit of lower bounds is often questionable because, unlike upper bounds, they do not yield practical algorithms. Nevertheless, they may be helpful as a tool for thinking about what is learnable and how learnable it is. This has already come up here and here. At the moment, there is no coherent theory of lower bounds for learning reductions, and we have little understa
6 0.11644432 454 hunch net-2012-01-30-ICML Posters and Scope
7 0.11585537 170 hunch net-2006-04-06-Bounds greater than 1
8 0.11474198 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
9 0.10135854 104 hunch net-2005-08-22-Do you believe in induction?
10 0.099565864 288 hunch net-2008-02-10-Complexity Illness
11 0.099063545 35 hunch net-2005-03-04-The Big O and Constants in Learning
12 0.098526835 343 hunch net-2009-02-18-Decision by Vetocracy
13 0.097825326 22 hunch net-2005-02-18-What it means to do research.
14 0.095347457 332 hunch net-2008-12-23-Use of Learning Theory
15 0.09354689 437 hunch net-2011-07-10-ICML 2011 and the future
16 0.093162328 461 hunch net-2012-04-09-ICML author feedback is open
17 0.09204106 347 hunch net-2009-03-26-Machine Learning is too easy
18 0.091444671 484 hunch net-2013-06-16-Representative Reviewing
19 0.09131401 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
20 0.089603886 187 hunch net-2006-06-25-Presentation of Proofs is Hard.
topicId topicWeight
[(0, 0.223), (1, -0.006), (2, 0.056), (3, 0.074), (4, -0.035), (5, -0.03), (6, 0.068), (7, 0.047), (8, 0.036), (9, -0.015), (10, -0.008), (11, -0.018), (12, 0.064), (13, 0.023), (14, 0.048), (15, -0.012), (16, 0.014), (17, -0.008), (18, -0.017), (19, -0.002), (20, 0.039), (21, 0.035), (22, -0.046), (23, -0.11), (24, -0.026), (25, -0.056), (26, -0.005), (27, -0.009), (28, -0.051), (29, -0.061), (30, -0.061), (31, 0.063), (32, -0.045), (33, 0.097), (34, -0.07), (35, -0.146), (36, -0.022), (37, 0.058), (38, 0.011), (39, -0.016), (40, -0.001), (41, 0.042), (42, -0.016), (43, -0.044), (44, -0.038), (45, 0.03), (46, 0.02), (47, 0.105), (48, 0.034), (49, 0.019)]
simIndex simValue blogId blogTitle
same-blog 1 0.94327652 202 hunch net-2006-08-10-Precision is not accuracy
2 0.80869502 162 hunch net-2006-03-09-Use of Notation
Introduction: For most people, a mathematical notation is like a language: you learn it and stick with it. For people doing mathematical research, however, this is not enough: they must design new notations for new problems. The design of good notation is both hard and worthwhile since a bad initial notation can retard a line of research greatly. Before we had mathematical notation, equations were all written out in language. Since words have multiple meanings and variable precedences, long equations written out in language can be extraordinarily difficult and sometimes fundamentally ambiguous. A good representative example of this is the legalese in the tax code. Since we want greater precision and clarity, we adopt mathematical notation. One fundamental thing to understand about mathematical notation is that humans, as logic verifiers, are barely capable. This is the fundamental reason why one notation can be much better than another. This observation is easier to miss than you might
3 0.7555986 187 hunch net-2006-06-25-Presentation of Proofs is Hard.
Introduction: When presenting part of the Reinforcement Learning theory tutorial at ICML 2006 , I was forcibly reminded of this. There are several difficulties. When creating the presentation, the correct level of detail is tricky. With too much detail, the proof takes too much time and people may be lost to boredom. With too little detail, the steps of the proof involve too-great a jump. This is very difficult to judge. What may be an easy step in the careful thought of a quiet room is not so easy when you are occupied by the process of presentation. What may be easy after having gone over this (and other) proofs is not so easy to follow in the first pass by a viewer. These problems seem only correctable by process of repeated test-and-revise. When presenting the proof, simply speaking with sufficient precision is substantially harder than in normal conversation (where precision is not so critical). Practice can help here. When presenting the proof, going at the right p
4 0.66993558 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
Introduction: Bob Williamson and I are the learning theory PC members at NIPS this year. This is some attempt to state the standards and tests I applied to the papers. I think it is a good idea to talk about this for two reasons: Making community standards a matter of public record seems healthy. It gives us a chance to debate what is and is not the right standard. It might even give us a bit more consistency across the years. It may save us all time. There are a number of papers submitted which just aren’t there yet. Avoiding submitting is the right decision in this case. There are several criteria for judging a paper. All of these were active this year. Some criteria are uncontroversial while others may be so. The paper must have a theorem establishing something new for which it is possible to derive high confidence in the correctness of the results. A surprising number of papers fail this test. This criterion seems essential to the definition of “theory”. Missing theo
5 0.65044868 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms
Introduction: The ideal of theoretical algorithm analysis is to construct an algorithm with accompanying optimality theorems proving that it is a useful algorithm. This ideal often fails, particularly for learning algorithms and theory. The general form of a theorem is: If preconditions Then postconditions. When we design learning algorithms it is very common to come up with precondition assumptions such as “the data is IID”, “the learning problem is drawn from a known distribution over learning problems”, or “there is a perfect classifier”. All of these example preconditions can be false for real-world problems in ways that are not easily detectable. This means that algorithms derived and justified by these very common forms of analysis may be prone to catastrophic failure in routine (mis)application. We can hope for better. Several different kinds of learning algorithm analysis have been developed, some of which have fewer preconditions. Simply demanding that these forms of analysi
6 0.63313681 104 hunch net-2005-08-22-Do you believe in induction?
7 0.63229519 57 hunch net-2005-04-16-Which Assumptions are Reasonable?
8 0.62890363 55 hunch net-2005-04-10-Is the Goal Understanding or Prediction?
9 0.61344689 42 hunch net-2005-03-17-Going all the Way, Sometimes
10 0.59581625 52 hunch net-2005-04-04-Grounds for Rejection
11 0.5949952 454 hunch net-2012-01-30-ICML Posters and Scope
12 0.59188461 70 hunch net-2005-05-12-Math on the Web
13 0.58813244 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?
14 0.58605248 98 hunch net-2005-07-27-Not goal metrics
15 0.58092362 256 hunch net-2007-07-20-Motivation should be the Responsibility of the Reviewer
16 0.5805217 241 hunch net-2007-04-28-The Coming Patent Apocalypse
17 0.57555604 91 hunch net-2005-07-10-Thinking the Unthought
18 0.57546467 22 hunch net-2005-02-18-What it means to do research.
19 0.5636971 231 hunch net-2007-02-10-Best Practices for Collaboration
20 0.55026424 351 hunch net-2009-05-02-Wielding a New Abstraction
topicId topicWeight
[(3, 0.014), (10, 0.029), (27, 0.191), (38, 0.068), (53, 0.085), (55, 0.096), (56, 0.281), (77, 0.013), (94, 0.1), (95, 0.036)]
simIndex simValue blogId blogTitle
1 0.96897036 187 hunch net-2006-06-25-Presentation of Proofs is Hard.
Introduction: When presenting part of the Reinforcement Learning theory tutorial at ICML 2006 , I was forcibly reminded of this. There are several difficulties. When creating the presentation, the correct level of detail is tricky. With too much detail, the proof takes too much time and people may be lost to boredom. With too little detail, the steps of the proof involve too-great a jump. This is very difficult to judge. What may be an easy step in the careful thought of a quiet room is not so easy when you are occupied by the process of presentation. What may be easy after having gone over this (and other) proofs is not so easy to follow in the first pass by a viewer. These problems seem only correctable by process of repeated test-and-revise. When presenting the proof, simply speaking with sufficient precision is substantially harder than in normal conversation (where precision is not so critical). Practice can help here. When presenting the proof, going at the right p
2 0.91648346 307 hunch net-2008-07-04-More Presentation Preparation
Introduction: We’ve discussed presentation preparation before, but I have one more thing to add: transitioning. For a research presentation, it is substantially helpful for the audience if transitions are clear. A common outline for a research presentation in machine learning is: The problem. Presentations which don’t describe the problem almost immediately lose people, because the context is missing to understand the detail. Prior relevant work. In many cases, a paper builds on some previous bit of work which must be understood in order to understand what the paper does. A common failure mode seems to be spending too much time on prior work. Discuss just the relevant aspects of prior work in the language of your work. Sometimes this is missing when unneeded. What we did. For theory papers in particular, it is often not possible to really cover the details. Prioritizing what you present can be very important. How it worked. Many papers in Machine Learning have some sor
3 0.90764982 250 hunch net-2007-06-23-Machine Learning Jobs are Growing on Trees
Introduction: The consensus of several discussions at ICML is that the number of jobs for people knowing machine learning well substantially exceeds supply. This is my experience as well. Demand comes from many places, but I’ve seen particularly strong demand from trading companies and internet startups. Like all interest bursts, this one will probably pass because of economic recession or other distractions. Nevertheless, the general outlook for machine learning in business seems to be good. Machine learning is all about optimization when there is uncertainty and lots of data. The quantity of data available is growing quickly as computer-run processes and sensors become more common, and the quality of the data is dropping since there is little editorial control in its collection. Machine Learning is a difficult subject to master (*), so those who do should remain in demand over the long term. (*) In fact, it would be reasonable to claim that no one has mastered it; there are just some peo
same-blog 4 0.86887366 202 hunch net-2006-08-10-Precision is not accuracy
5 0.8255797 460 hunch net-2012-03-24-David Waltz
Introduction: has died. He lived a full life. I know him personally as a founder of the Center for Computational Learning Systems and the New York Machine Learning Symposium, both of which have sheltered and promoted the advancement of machine learning. I expect much of the New York area machine learning community will miss him, as well as many others around the world.
6 0.74368155 356 hunch net-2009-05-24-2009 ICML discussion site
7 0.68910623 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
8 0.68342495 249 hunch net-2007-06-21-Presentation Preparation
9 0.67336905 416 hunch net-2010-10-29-To Vidoelecture or not
10 0.67003638 379 hunch net-2009-11-23-ICML 2009 Workshops (and Tutorials)
11 0.6685257 141 hunch net-2005-12-17-Workshops as Franchise Conferences
12 0.66542 95 hunch net-2005-07-14-What Learning Theory might do
13 0.66535574 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
14 0.65966815 437 hunch net-2011-07-10-ICML 2011 and the future
15 0.65892041 297 hunch net-2008-04-22-Taking the next step
16 0.65675092 464 hunch net-2012-05-03-Microsoft Research, New York City
17 0.65460283 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
18 0.6543712 343 hunch net-2009-02-18-Decision by Vetocracy
19 0.65422279 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
20 0.65384889 259 hunch net-2007-08-19-Choice of Metrics