hunch_net hunch_net-2005 hunch_net-2005-95 knowledge-graph by maker-knowledge-mining

95 hunch net-2005-07-14-What Learning Theory might do


meta info for this blog

Source: html

Introduction: I wanted to expand on this post and some of the previous problems/research directions about where learning theory might make large strides. Why theory? The essential reason for theory is “intuition extension”. A very good applied learning person can master some particular application domain yielding the best computer algorithms for solving that problem. A very good theory can take the intuitions discovered by this and other applied learning people and extend them to new domains in a relatively automatic fashion. To do this, we take these basic intuitions and try to find a mathematical model that: Explains the basic intuitions. Makes new testable predictions about how to learn. Succeeds in so learning. This is “intuition extension”: taking what we have learned somewhere else and applying it in new domains. It is fundamentally useful to everyone because it increases the level of automation in solving problems. Where next for learning theory? I like the a


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 I wanted to expand on this post and some of the previous problems/research directions about where learning theory might make large strides. [sent-1, score-0.319]

2 A very good applied learning person can master some particular application domain yielding the best computer algorithms for solving that problem. [sent-4, score-0.234]

3 A very good theory can take the intuitions discovered by this and other applied learning people and extend them to new domains in a relatively automatic fashion. [sent-5, score-1.094]

4 To do this, we take these basic intuitions and try to find a mathematical model that: Explains the basic intuitions. [sent-6, score-0.863]

5 At some point the physics model arose: you try to build mathematical models of what is happening and then make predictions based on the models. [sent-14, score-0.8]

6 We have some formalisms which are of some use in addressing novel learning problems, but the overall process of doing machine learning is not very close to “automatic”. [sent-17, score-0.336]

7 The good news is that over the last 20 years a much richer set of positive examples of successful applied machine learning has developed. [sent-18, score-0.439]

8 Thus, there are many good intuitions from which we can hope to generalize. [sent-19, score-0.263]

9 Here are a few specific issues: What is the “right” mathematical model of learning? [sent-21, score-0.436]

10 (in analogy, What is the “right” mathematical model of physical phenomena? [sent-22, score-0.506]

11 Examples of this include: What is the “right” model of active learning? [sent-25, score-0.287]

12 What is the “right” model of Reinforcement learning? [sent-27, score-0.287]

13 Again, we know very little in comparison to what we want to know—a fully automatic general RL solver. [sent-28, score-0.251]

14 How do we refine the empirical observations and intuitions of applied learning? [sent-32, score-0.587]

15 At a minimum, information used to create a Bayesian prior often does not come in the form of a Bayesian prior, and so some translation system must be developed. [sent-35, score-0.207]

16 Some form of structure seems necessary, but the right form is still unclear. [sent-37, score-0.328]

17 How do we take existing theoretical insights and translate them into practical algorithms? [sent-39, score-0.322]

18 The method of linear projection into spaces has been studied theoretically. [sent-40, score-0.206]

19 The online learning setting seems theoretically compelling and, at least sometimes, empirically validated. [sent-42, score-0.413]

20 Getting from here to there of course will require a bit of work, some of which might be greatly aided by mathematical consideration. [sent-45, score-0.309]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('intuitions', 0.263), ('analogy', 0.237), ('mathematical', 0.23), ('model', 0.206), ('extension', 0.181), ('automatic', 0.179), ('theory', 0.163), ('right', 0.158), ('applied', 0.153), ('intuition', 0.135), ('theoretically', 0.131), ('physics', 0.125), ('prior', 0.122), ('succesful', 0.115), ('bayesian', 0.113), ('partially', 0.104), ('compelling', 0.102), ('empirically', 0.099), ('predictions', 0.091), ('extend', 0.09), ('formalisms', 0.09), ('richer', 0.09), ('testable', 0.09), ('take', 0.09), ('useful', 0.087), ('empirical', 0.087), ('form', 0.085), ('refine', 0.084), ('automation', 0.084), ('novel', 0.084), ('theoretical', 0.082), ('learning', 0.081), ('aided', 0.079), ('succeeds', 0.079), ('moderately', 0.079), ('somewhere', 0.079), ('phenomena', 0.075), ('occasionally', 0.075), ('translate', 0.075), ('expand', 0.075), ('domains', 0.075), ('insights', 0.075), ('projection', 0.075), ('try', 0.074), ('models', 0.074), ('systems', 0.073), ('wildly', 0.072), ('know', 0.072), ('design', 0.072), ('physical', 0.07)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 95 hunch net-2005-07-14-What Learning Theory might do

Introduction: I wanted to expand on this post and some of the previous problems/research directions about where learning theory might make large strides. Why theory? The essential reason for theory is “intuition extension”. A very good applied learning person can master some particular application domain yielding the best computer algorithms for solving that problem. A very good theory can take the intuitions discovered by this and other applied learning people and extend them to new domains in a relatively automatic fashion. To do this, we take these basic intuitions and try to find a mathematical model that: Explains the basic intuitions. Makes new testable predictions about how to learn. Succeeds in so learning. This is “intuition extension”: taking what we have learned somewhere else and applying it in new domains. It is fundamentally useful to everyone because it increases the level of automation in solving problems. Where next for learning theory? I like the a

2 0.26173189 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod

3 0.20711625 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

Introduction: I don’t consider myself a “Bayesian”, but I do try hard to understand why Bayesian learning works. For the purposes of this post, Bayesian learning is a simple process of: Specify a prior over world models. Integrate using Bayes law with respect to all observed information to compute a posterior over world models. Predict according to the posterior. Bayesian learning has many advantages over other learning programs: Interpolation Bayesian learning methods interpolate all the way to pure engineering. When faced with any learning problem, there is a choice of how much time and effort a human vs. a computer puts in. (For example, the Mars rover pathfinding algorithms are almost entirely engineered.) When creating an engineered system, you build a model of the world and then find a good controller in that model. Bayesian methods interpolate to this extreme because the Bayesian prior can be a delta function on one model of the world. What this means is that a recipe
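To make the three-step recipe above concrete, here is a minimal sketch in Python, assuming a beta-binomial (coin-flip) model; the conjugate prior and the function name are illustrative choices of mine, not anything the post specifies.

```python
# A minimal sketch of the prior -> posterior -> predict loop for a
# coin-flip model (hypothetical example; beta-binomial is an
# illustrative conjugate choice, not the post's construction).

def bayesian_coin(observations, prior_a=1.0, prior_b=1.0):
    """Specify a Beta(prior_a, prior_b) prior over the coin's bias,
    integrate the observed flips via Bayes law (conjugate update),
    and predict: return P(next flip = heads | data)."""
    heads = sum(observations)           # observations: list of 0/1
    tails = len(observations) - heads
    post_a = prior_a + heads            # posterior Beta parameters
    post_b = prior_b + tails
    return post_a / (post_a + post_b)   # posterior predictive mean

print(bayesian_coin([1, 1, 0, 1]))      # 0.666... under a uniform prior
```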

4 0.20385425 332 hunch net-2008-12-23-Use of Learning Theory

Introduction: I’ve had serious conversations with several people who believe that the theory in machine learning is “only useful for getting papers published”. That’s a compelling statement, as I’ve seen many papers where the algorithm clearly came first, and the theoretical justification for it came second, purely as a perceived means to improve the chance of publication. Naturally, I disagree and believe that learning theory has much more substantial applications. Even in core learning algorithm design, I’ve found learning theory to be useful, although its application is more subtle than many realize. The most straightforward applications can fail, because (as expectation suggests) worst-case bounds tend to be loose in practice (*). In my experience, considering learning theory when designing an algorithm has two important effects in practice: It can help make your algorithm behave right at a crude level of analysis, leaving finer details to tuning or common sense. The best example

5 0.17369002 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

Introduction: Bob Williamson and I are the learning theory PC members at NIPS this year. This is some attempt to state the standards and tests I applied to the papers. I think it is a good idea to talk about this for two reasons: Making community standards a matter of public record seems healthy. It gives us a chance to debate what is and is not the right standard. It might even give us a bit more consistency across the years. It may save us all time. There are a number of papers submitted which just aren’t there yet. Avoiding submitting is the right decision in this case. There are several criteria for judging a paper. All of these were active this year. Some criteria are uncontroversial while others may not be. The paper must have a theorem establishing something new for which it is possible to derive high confidence in the correctness of the results. A surprising number of papers fail this test. This criterion seems essential to the definition of “theory”. Missing theo

6 0.17216493 90 hunch net-2005-07-07-The Limits of Learning Theory

7 0.16847232 235 hunch net-2007-03-03-All Models of Learning have Flaws

8 0.15380429 237 hunch net-2007-04-02-Contextual Scaling

9 0.14502963 454 hunch net-2012-01-30-ICML Posters and Scope

10 0.14016341 135 hunch net-2005-12-04-Watchword: model

11 0.13927282 347 hunch net-2009-03-26-Machine Learning is too easy

12 0.13604133 89 hunch net-2005-07-04-The Health of COLT

13 0.13600641 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

14 0.12600394 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

15 0.12422533 353 hunch net-2009-05-08-Computability in Artificial Intelligence

16 0.1230561 276 hunch net-2007-12-10-Learning Track of International Planning Competition

17 0.11949526 165 hunch net-2006-03-23-The Approximation Argument

18 0.11866391 28 hunch net-2005-02-25-Problem: Online Learning

19 0.1180876 360 hunch net-2009-06-15-In Active Learning, the question changes

20 0.1178461 22 hunch net-2005-02-18-What it means to do research.


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.305), (1, 0.087), (2, -0.051), (3, 0.068), (4, 0.045), (5, -0.079), (6, 0.029), (7, 0.071), (8, 0.159), (9, -0.079), (10, 0.03), (11, -0.086), (12, -0.058), (13, 0.089), (14, 0.094), (15, 0.0), (16, 0.092), (17, -0.064), (18, 0.109), (19, -0.1), (20, -0.043), (21, -0.019), (22, -0.012), (23, -0.029), (24, 0.029), (25, -0.018), (26, -0.012), (27, -0.033), (28, -0.036), (29, -0.003), (30, 0.013), (31, 0.068), (32, -0.02), (33, -0.0), (34, -0.047), (35, 0.016), (36, 0.034), (37, -0.182), (38, -0.051), (39, -0.072), (40, -0.079), (41, -0.003), (42, -0.076), (43, -0.005), (44, 0.059), (45, 0.079), (46, 0.054), (47, -0.07), (48, 0.036), (49, -0.034)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9632237 95 hunch net-2005-07-14-What Learning Theory might do

Introduction: I wanted to expand on this post and some of the previous problems/research directions about where learning theory might make large strides. Why theory? The essential reason for theory is “intuition extension”. A very good applied learning person can master some particular application domain yielding the best computer algorithms for solving that problem. A very good theory can take the intuitions discovered by this and other applied learning people and extend them to new domains in a relatively automatic fashion. To do this, we take these basic intuitions and try to find a mathematical model that: Explains the basic intuitions. Makes new testable predictions about how to learn. Succeeds in so learning. This is “intuition extension”: taking what we have learned somewhere else and applying it in new domains. It is fundamentally useful to everyone because it increases the level of automation in solving problems. Where next for learning theory? I like the a

2 0.83424133 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod

3 0.78659379 135 hunch net-2005-12-04-Watchword: model

Introduction: In everyday use a model is a system which explains the behavior of some system, hopefully at the level where some alteration of the model predicts some alteration of the real-world system. In machine learning “model” has several variant definitions. Everyday . The common definition is sometimes used. Parameterized . Sometimes model is a short-hand for “parameterized model”. Here, it refers to a model with unspecified free parameters. In the Bayesian learning approach, you typically have a prior over (everyday) models. Predictive . Even further from everyday use is the predictive model. Examples of this are “my model is a decision tree” or “my model is a support vector machine”. Here, there is no real sense in which an SVM explains the underlying process. For example, an SVM tells us nothing in particular about how alterations to the real-world system would create a change. Which definition is being used at any particular time is important information. For examp
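A minimal sketch of the contrast between the parameterized and predictive senses; the data and names below are hypothetical, chosen only for illustration.

```python
import statistics

# Parameterized sense: a Gaussian with free parameters (mu, sigma);
# "fitting" pins down the free parameters from data, after which it
# is an everyday model of where the data came from.
data = [2.1, 1.9, 2.4, 2.0]
mu, sigma = statistics.mean(data), statistics.stdev(data)

# Predictive sense: a bare input -> output rule (a threshold here);
# it predicts, but explains nothing about the underlying process.
def predict(x):
    return 1 if x > mu else 0

print(mu, sigma, predict(2.5))
```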

4 0.73566371 235 hunch net-2007-03-03-All Models of Learning have Flaws

Introduction: Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning. The point here is not simply “woe unto us”. There are several implications which seem important. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework which often becomes their “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students. Algorithms which conform to multiple approaches c

5 0.71241641 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

Introduction: I don’t consider myself a “Bayesian”, but I do try hard to understand why Bayesian learning works. For the purposes of this post, Bayesian learning is a simple process of: Specify a prior over world models. Integrate using Bayes law with respect to all observed information to compute a posterior over world models. Predict according to the posterior. Bayesian learning has many advantages over other learning programs: Interpolation Bayesian learning methods interpolate all the way to pure engineering. When faced with any learning problem, there is a choice of how much time and effort a human vs. a computer puts in. (For example, the Mars rover pathfinding algorithms are almost entirely engineered.) When creating an engineered system, you build a model of the world and then find a good controller in that model. Bayesian methods interpolate to this extreme because the Bayesian prior can be a delta function on one model of the world. What this means is that a recipe

6 0.68669057 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

7 0.6725964 168 hunch net-2006-04-02-Mad (Neuro)science

8 0.64809936 347 hunch net-2009-03-26-Machine Learning is too easy

9 0.64658731 237 hunch net-2007-04-02-Contextual Scaling

10 0.64375031 90 hunch net-2005-07-07-The Limits of Learning Theory

11 0.64313203 97 hunch net-2005-07-23-Interesting papers at ACL

12 0.64042699 3 hunch net-2005-01-24-The Humanloop Spectrum of Machine Learning

13 0.6280185 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

14 0.62579989 332 hunch net-2008-12-23-Use of Learning Theory

15 0.60871416 68 hunch net-2005-05-10-Learning Reductions are Reductionist

16 0.60735816 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning

17 0.60597336 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

18 0.58876383 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

19 0.58799362 351 hunch net-2009-05-02-Wielding a New Abstraction

20 0.587403 276 hunch net-2007-12-10-Learning Track of International Planning Competition


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.019), (10, 0.041), (24, 0.164), (27, 0.216), (38, 0.078), (53, 0.091), (54, 0.011), (55, 0.101), (83, 0.021), (94, 0.139), (95, 0.05)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.91726774 96 hunch net-2005-07-21-Six Months

Introduction: This is the 6-month point in the “run a research blog” experiment, so it seems like a good point to take stock and assess. One fundamental question is: “Is it worth it?” The idea of running a research blog will never become widely popular and useful unless it actually aids research. On the negative side, composing ideas for a post and maintaining a blog takes a significant amount of time. On the positive side, the process might yield better research because there is an opportunity for better, faster feedback implying better, faster thinking. My answer at the moment is a provisional “yes”. Running the blog has been incidentally helpful in several ways: It is sometimes educational (example). More often, the process of composing thoughts well enough to post simply aids thinking. This has resulted in a couple of solutions to problems of interest (and perhaps more over time). If you really want to solve a problem, letting the world know is helpful. This isn’t necessarily

same-blog 2 0.9126693 95 hunch net-2005-07-14-What Learning Theory might do

Introduction: I wanted to expand on this post and some of the previous problems/research directions about where learning theory might make large strides. Why theory? The essential reason for theory is “intuition extension”. A very good applied learning person can master some particular application domain yielding the best computer algorithms for solving that problem. A very good theory can take the intuitions discovered by this and other applied learning people and extend them to new domains in a relatively automatic fashion. To do this, we take these basic intuitions and try to find a mathematical model that: Explains the basic intuitions. Makes new testable predictions about how to learn. Succeeds in so learning. This is “intuition extension”: taking what we have learned somewhere else and applying it in new domains. It is fundamentally useful to everyone because it increases the level of automation in solving problems. Where next for learning theory? I like the a

3 0.85929668 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

Introduction: Many people in Machine Learning don’t fully understand the impact of computation, as demonstrated by a lack of big-O analysis of new learning algorithms. This is important—some current active research programs are fundamentally flawed w.r.t. computation, and other research programs are directly motivated by it. When considering a learning algorithm, I think about the following questions: How does the learning algorithm scale with the number of examples m? Any algorithm using all of the data is at least O(m), but in many cases this is O(m^2) (naive nearest neighbor for self-prediction) or unknown (k-means or many other optimization algorithms). The unknown case is very common, and it can mean (for example) that the algorithm isn’t convergent or simply that the amount of computation isn’t controlled. The above question can also be asked for test cases. In some applications, test-time performance is of great importance. How does the algorithm scale with the number of
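As an illustration of the O(m^2) case named above, here is a minimal sketch of naive 1-nearest-neighbor self-prediction; the function name and toy data are hypothetical.

```python
def nn_self_prediction(xs, ys):
    """Naive leave-one-out 1-nearest-neighbor: each of the m examples
    scans the other m - 1 points, so the total cost is O(m^2)."""
    preds = []
    for i, x in enumerate(xs):
        # inner O(m) scan for the nearest *other* point
        j = min((k for k in range(len(xs)) if k != i),
                key=lambda k: abs(xs[k] - x))
        preds.append(ys[j])
    return preds

print(nn_self_prediction([0.0, 0.1, 1.0, 1.1], [0, 0, 1, 1]))  # [0, 0, 1, 1]
```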

4 0.8563481 12 hunch net-2005-02-03-Learning Theory, by assumption

Introduction: One way to organize learning theory is by assumption (in the assumption = axiom sense), from no assumptions to many assumptions. As you travel down this list, the statements become stronger, but the scope of applicability decreases. No assumptions Online learning There exists a meta-prediction algorithm which competes well with the best element of any set of prediction algorithms. Universal Learning Using a “bias” of 2^(-description length of Turing machine) in learning is equivalent to all other computable biases up to some constant. Reductions The ability to predict well on classification problems is equivalent to the ability to predict well on many other learning problems. Independent and Identically Distributed (IID) Data Performance Prediction Based upon past performance, you can predict future performance. Uniform Convergence Performance prediction works even after choosing classifiers based on the data from large sets of classifiers.
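For the “no assumptions / online learning” item, one standard instantiation of such a meta-prediction algorithm is exponential weights; this sketch assumes an absolute loss and a fixed learning rate, both illustrative choices rather than the post’s specific construction.

```python
import math

def exp_weights(expert_preds, outcomes, eta=0.5):
    """Exponential-weights meta-predictor: keep one weight per expert
    and shrink each weight by exp(-eta * that expert's loss) after
    every round; cumulative loss stays close to the best expert's."""
    w = [1.0] * len(expert_preds[0])
    total = 0.0
    for preds, y in zip(expert_preds, outcomes):
        s = sum(w)
        p = sum(wi * pi for wi, pi in zip(w, preds)) / s  # weighted vote
        total += abs(p - y)                               # learner's loss
        w = [wi * math.exp(-eta * abs(pi - y))            # penalize each
             for wi, pi in zip(w, preds)]                 # expert's error
    return total

# Two experts: one always predicts 0, one always predicts 1; truth is 1.
print(exp_weights([[0.0, 1.0]] * 5, [1] * 5))  # per-round loss decays toward 0
```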

5 0.84925449 98 hunch net-2005-07-27-Not goal metrics

Introduction: One of the confusing things about research is that progress is very hard to measure. One of the consequences of being in a hard-to-measure environment is that the wrong things are often measured. Lines of Code The classical example of this phenomenon is the old lines-of-code-produced metric for programming. It is easy to imagine systems for producing many lines of code with very little work that accomplish very little. Paper count In academia, a “paper count” is an analog of “lines of code”, and it suffers from the same failure modes. The obvious failure mode here is that we end up with a large number of uninteresting papers since people end up spending a lot of time optimizing this metric. Complexity Another metric is “complexity” (in the eye of a reviewer) of a paper. There is a common temptation to make a method appear more complex than it is in order for reviewers to judge it worthy of publication. The failure mode here is unclean thinking. Simple effective m

6 0.84233022 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

7 0.83775115 297 hunch net-2008-04-22-Taking the next step

8 0.83708608 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

9 0.83430517 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

10 0.83363008 423 hunch net-2011-02-02-User preferences for search engines

11 0.83341694 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

12 0.83336639 437 hunch net-2011-07-10-ICML 2011 and the future

13 0.83295304 237 hunch net-2007-04-02-Contextual Scaling

14 0.8325966 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

15 0.83210981 343 hunch net-2009-02-18-Decision by Vetocracy

16 0.83208346 371 hunch net-2009-09-21-Netflix finishes (and starts)

17 0.8311348 207 hunch net-2006-09-12-Incentive Compatible Reviewing

18 0.83108026 235 hunch net-2007-03-03-All Models of Learning have Flaws

19 0.82930851 351 hunch net-2009-05-02-Wielding a New Abstraction

20 0.82923424 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?