hunch_net hunch_net-2006 hunch_net-2006-158 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: There are several different flavors of Machine Learning classes. Many classes are of the ‘zoo’ sort: many different learning algorithms are presented. Others avoid the zoo by not covering the full scope of machine learning. This is my view of what makes a good machine learning class, along with why. I’d like to specifically invite comment on whether things are missing, misemphasized, or misplaced.
Phase | Subject | Why?
Introduction | What is a machine learning problem? | A good understanding of the characteristics of machine learning problems seems essential. Characteristics include: a data source, some hope the data is predictive, and a need for generalization. This is probably best taught in a case study manner: lay out the specifics of some problem and then ask “Is this a machine learning problem?”
Introduction | Machine Learning Problem Identification | Identification and recognition of the type of learning problems is (obviously) a very important step in solving such problems. […]
sentIndex sentText sentNum sentScore
1 Many classes are of the ‘zoo’ sort: many different learning algorithms are presented. [sent-2, score-0.216]
2 Others avoid the zoo by not covering the full scope of machine learning. [sent-3, score-0.396]
3 This is my view of what makes a good machine learning class, along with why. [sent-4, score-0.274]
4 Introduction: What is a machine learning problem? [sent-7, score-0.274]
5 A good understanding of the characteristics of machine learning problems seems essential. [sent-8, score-0.484]
6 This is probably best taught in a case study manner: lay out the specifics of some problem and then ask “Is this a machine learning problem?” [sent-10, score-0.425]
7 Introduction Machine Learning Problem Identification: Identification and recognition of the type of learning problems is (obviously) a very important step in solving such problems. [sent-11, score-0.212]
8 Introduction Example algorithm 1: To really understand machine learning, a couple learning algorithms must be understood in detail. [sent-13, score-0.517]
9 The reason why the number is “2” and not “1” or “3” is that 2 is the minimum number required to make people naturally aware of the degrees of freedom available in learning algorithm design. [sent-15, score-0.366]
10 Analysis Bias for Learning: The need for a good bias is one of the defining characteristics of learning. [sent-16, score-0.506]
11 This statement is generic, so it will always apply to one degree or another. [sent-18, score-0.242]
12 This is the boosting observation: that it is possible to bootstrap predictive ability to create a better overall system (a boosting sketch follows this list). [sent-20, score-0.272]
13 Analysis Learning can be transformed: This is the reductions observation: that the ability to solve one kind of learning problem implies the ability to solve other kinds of learning problems (a reduction sketch follows this list). [sent-22, score-0.675]
14 Analysis Learning can be preserved: This is the online learning with experts observation: that we can have a master algorithm which preserves the best learning performance of subalgorithms (a weighted-majority sketch follows this list). [sent-24, score-0.439]
15 Analysis Hardness of Learning: It turns out that there are several different ways in which machine learning can be hard, including computational and information-theoretic hardness (a sample-complexity bound follows this list). [sent-28, score-0.347]
16 An understanding of how and why learning algorithms can fail seems important for understanding the process (an overfitting demo follows this list). [sent-30, score-0.288]
17 Applications Vision: One example of how learning is applied to solve vision problems. [sent-31, score-0.329]
18 Applications Robotics: Ditto for robotics. Applications Speech: Ditto for speech. Applications Businesses: Ditto for businesses. Where is machine learning going? [sent-33, score-0.661]
19 It should be understood that the field of machine learning is changing rapidly. [sent-35, score-0.366]
20 The emphasis here is on fundamentals: generally applicable mathematical statements and understandings of the learning problem. [sent-36, score-0.343]
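To make the boosting observation (item 12 above) concrete, here is a minimal AdaBoost-style sketch, assuming only numpy; the weak_learner interface, the round count, and every constant are illustrative assumptions, not details from the post.

```python
import numpy as np

def adaboost(X, y, weak_learner, rounds=10):
    """y in {-1,+1}; weak_learner(X, y, w) returns a predictor h with h(X) -> {-1,+1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                 # start with uniform example weights
    ensemble = []
    for _ in range(rounds):
        h = weak_learner(X, y, w)
        pred = h(X)
        err = np.sum(w[pred != y])          # weighted training error this round
        if err >= 0.5:                      # weak learner must beat random guessing
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)      # upweight mistakes, downweight correct examples
        w /= w.sum()                        # the next round sees a reweighted problem
        ensemble.append((alpha, h))
    return lambda Xq: np.sign(sum(a * h(Xq) for a, h in ensemble))
```

The bootstrapping is literal: each round’s mistakes define the next round’s problem, and when every weak learner beats chance, the ensemble’s training error drops exponentially in the number of rounds.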
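The reductions observation (item 13) can likewise be made concrete with the standard one-against-all reduction, which turns any binary learner into a multiclass learner. This sketch assumes an sklearn-style fit/predict_proba interface; the class and factory names are invented for illustration.

```python
import numpy as np

class OneAgainstAll:
    """Reduce k-class classification to k binary problems."""
    def __init__(self, make_binary_learner):
        self.make_binary_learner = make_binary_learner  # factory for any binary learner

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.learners_ = {}
        for c in self.classes_:
            # One binary problem per class: "is the label c, or not?"
            learner = self.make_binary_learner()
            learner.fit(X, (y == c).astype(int))
            self.learners_[c] = learner
        return self

    def predict(self, X):
        # Predict the class whose binary learner is most confident.
        scores = np.column_stack([self.learners_[c].predict_proba(X)[:, 1]
                                  for c in self.classes_])
        return self.classes_[np.argmax(scores, axis=1)]
```

With scikit-learn available, OneAgainstAll(lambda: LogisticRegression()).fit(X, y) would, for example, yield a multiclass learner from a purely binary one.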
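For the preservation claim (item 14), the classic master algorithm is weighted majority: its mistake count is within a log-of-n additive term (up to a constant factor) of the best of n experts in hindsight. A minimal sketch, with the penalty beta left as a free parameter:

```python
import numpy as np

def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """expert_predictions: (T, n) array of {0,1}; outcomes: length-T array of {0,1}."""
    T, n = expert_predictions.shape
    w = np.ones(n)                          # one weight per expert
    mistakes = 0
    for t in range(T):
        # Predict with the weighted vote of the experts.
        vote = int(np.dot(w, expert_predictions[t]) >= w.sum() / 2)
        mistakes += int(vote != outcomes[t])
        # Shrink the weight of every expert that was wrong this round.
        w[expert_predictions[t] != outcomes[t]] *= beta
    return mistakes, w
```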
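For hardness (item 15), the information-theoretic side has a crisp textbook form. In the realizable PAC setting with a finite hypothesis class H, with probability at least 1 − δ, empirical risk minimization returns a hypothesis with error at most ε once the number of examples m satisfies

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```

With substantially fewer examples, no algorithm can reliably separate good hypotheses from lucky ones. Computational hardness is a separate axis: even with ample data, finding the empirical minimizer can be NP-hard for natural classes such as halfspaces with noise.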
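And for failure (item 16), the most teachable failure mode is overfitting. Below is a self-contained numpy demo with purely synthetic data and arbitrary constants: training error keeps falling as polynomial degree grows while held-out error eventually rises.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.3, 40)      # noisy target
x_tr, y_tr, x_te, y_te = x[:20], y[:20], x[20:], y[20:]

for degree in (1, 3, 9, 15):                    # polyfit may warn at high degree
    coeffs = np.polyfit(x_tr, y_tr, degree)     # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```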
wordName wordTfidf (topN-words)
[('ditto', 0.378), ('introduction', 0.269), ('characteristics', 0.21), ('applications', 0.197), ('analysis', 0.196), ('zoo', 0.189), ('statement', 0.177), ('businesses', 0.155), ('identification', 0.14), ('learning', 0.14), ('bias', 0.136), ('machine', 0.134), ('robotics', 0.126), ('emphasis', 0.119), ('observation', 0.115), ('speech', 0.106), ('ability', 0.104), ('vision', 0.101), ('predictive', 0.095), ('need', 0.093), ('understood', 0.092), ('similarly', 0.09), ('solve', 0.088), ('preserves', 0.084), ('integrity', 0.084), ('understandings', 0.084), ('degrees', 0.084), ('language', 0.081), ('leanring', 0.078), ('flavors', 0.078), ('specifics', 0.078), ('algorithms', 0.076), ('algorithm', 0.075), ('transformed', 0.073), ('lay', 0.073), ('bootstrap', 0.073), ('overfit', 0.073), ('theoretic', 0.073), ('avoid', 0.073), ('important', 0.072), ('cut', 0.07), ('concept', 0.07), ('etc', 0.07), ('classification', 0.069), ('freedom', 0.067), ('phase', 0.067), ('defining', 0.067), ('priors', 0.065), ('generic', 0.065), ('insert', 0.065)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning
2 0.14866036 332 hunch net-2008-12-23-Use of Learning Theory
Introduction: I’ve had serious conversations with several people who believe that the theory in machine learning is “only useful for getting papers published”. That’s a compelling statement, as I’ve seen many papers where the algorithm clearly came first, and the theoretical justification for it came second, purely as a perceived means to improve the chance of publication. Naturally, I disagree and believe that learning theory has much more substantial applications. Even in core learning algorithm design, I’ve found learning theory to be useful, although its application is more subtle than many realize. The most straightforward applications can fail, because (as expectation suggests) worst-case bounds tend to be loose in practice (*). In my experience, considering learning theory when designing an algorithm has two important effects in practice: It can help make your algorithm behave right at a crude level of analysis, leaving finer details to tuning or common sense. The best example
3 0.14810413 235 hunch net-2007-03-03-All Models of Learning have Flaws
Introduction: Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning. The point here is not simply “woe unto us”. There are several implications which seem important. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework which often becomes their “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students. Algorithms which conform to multiple approaches c
4 0.14640154 347 hunch net-2009-03-26-Machine Learning is too easy
Introduction: One of the remarkable things about machine learning is how diverse it is. The viewpoints of Bayesian learning, reinforcement learning, graphical models, supervised learning, unsupervised learning, genetic programming, etc… share little enough overlap that many people can and do make their careers within one without touching, or even necessarily understanding, the others. There are two fundamental reasons why this is possible. For many problems, many approaches work in the sense that they do something useful. This is true empirically, where for many problems we can observe that many different approaches yield better performance than any constant predictor. It’s also true in theory, where we know that for any set of predictors representable in a finite amount of RAM, minimizing training error over the set of predictors does something nontrivial when there are a sufficient number of examples. There is nothing like a unifying problem defining the field. In many other areas there
5 0.14055836 14 hunch net-2005-02-07-The State of the Reduction
Introduction: What? Reductions are machines which turn solvers for one problem into solvers for another problem. Why? Reductions are useful for several reasons. Laziness. Reducing a problem to classification makes at least 10 learning algorithms available to solve a problem. Inventing 10 learning algorithms is quite a bit of work. Similarly, programming a reduction is often trivial, while programming a learning algorithm is a great deal of work. Crystallization. The problems we often want to solve in learning are worst-case-impossible, but average-case feasible. By reducing all problems onto one or a few primitives, we can fine-tune these primitives to perform well on real-world problems with greater precision due to the greater number of problems to validate on. Theoretical Organization. By studying what reductions are easy vs. hard vs. impossible, we can learn which problems are roughly equivalent in difficulty and which are much harder. What we know now. Typesafe r
6 0.12612534 12 hunch net-2005-02-03-Learning Theory, by assumption
7 0.12296529 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
8 0.12282889 148 hunch net-2006-01-13-Benchmarks for RL
9 0.12077426 388 hunch net-2010-01-24-Specializations of the Master Problem
10 0.11582124 351 hunch net-2009-05-02-Wielding a New Abstraction
11 0.11463309 56 hunch net-2005-04-14-Families of Learning Theory Statements
12 0.11327409 90 hunch net-2005-07-07-The Limits of Learning Theory
13 0.11258768 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability
14 0.11224008 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions
15 0.10934982 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification
16 0.10603522 237 hunch net-2007-04-02-Contextual Scaling
17 0.10594858 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
18 0.10535058 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms
19 0.10461318 454 hunch net-2012-01-30-ICML Posters and Scope
20 0.10408168 343 hunch net-2009-02-18-Decision by Vetocracy
topicId topicWeight
[(0, 0.267), (1, 0.122), (2, -0.048), (3, 0.014), (4, 0.037), (5, -0.072), (6, 0.048), (7, 0.042), (8, 0.05), (9, -0.028), (10, -0.027), (11, -0.096), (12, 0.069), (13, 0.025), (14, 0.016), (15, 0.066), (16, 0.094), (17, -0.045), (18, -0.047), (19, -0.062), (20, 0.079), (21, -0.036), (22, -0.012), (23, -0.141), (24, -0.025), (25, -0.059), (26, 0.046), (27, -0.001), (28, -0.042), (29, 0.058), (30, -0.013), (31, 0.034), (32, -0.015), (33, -0.012), (34, -0.035), (35, -0.016), (36, -0.013), (37, -0.017), (38, 0.07), (39, 0.011), (40, 0.007), (41, 0.061), (42, -0.027), (43, -0.017), (44, -0.08), (45, -0.006), (46, -0.006), (47, -0.012), (48, 0.038), (49, -0.023)]
simIndex simValue blogId blogTitle
same-blog 1 0.9542523 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning
2 0.796462 148 hunch net-2006-01-13-Benchmarks for RL
Introduction: A couple years ago, Drew Bagnell and I started the RLBench project to set up a suite of reinforcement learning benchmark problems. We haven’t been able to touch it (due to lack of time) for a year, so the project is on hold. Luckily, there are several other projects such as CLSquare and RL-Glue with a similar goal, and we strongly endorse their continued development. I would like to explain why, especially in the context of criticism of other learning benchmarks. For example, sometimes the UCI Machine Learning Repository is criticized. There are two criticisms I know of: Learning algorithms have overfit to the problems in the repository. It is easy to imagine a mechanism for this happening unintentionally. Strong evidence of this would be provided by learning algorithms which perform great on the UCI machine learning repository but very badly (relative to other learning algorithms) on non-UCI learning problems. I have seen little evidence of this but it remains a po
3 0.78886229 347 hunch net-2009-03-26-Machine Learning is too easy
4 0.77853179 235 hunch net-2007-03-03-All Models of Learning have Flaws
5 0.75228238 332 hunch net-2008-12-23-Use of Learning Theory
6 0.74488425 12 hunch net-2005-02-03-Learning Theory, by assumption
7 0.74346185 6 hunch net-2005-01-27-Learning Complete Problems
8 0.72812963 168 hunch net-2006-04-02-Mad (Neuro)science
9 0.72443235 351 hunch net-2009-05-02-Wielding a New Abstraction
10 0.71729565 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms
11 0.7171368 202 hunch net-2006-08-10-Precision is not accuracy
12 0.71612602 28 hunch net-2005-02-25-Problem: Online Learning
13 0.71519202 104 hunch net-2005-08-22-Do you believe in induction?
14 0.71002001 95 hunch net-2005-07-14-What Learning Theory might do
15 0.70099771 253 hunch net-2007-07-06-Idempotent-capable Predictors
16 0.69447213 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity
17 0.69373226 133 hunch net-2005-11-28-A question of quantification
18 0.68733281 348 hunch net-2009-04-02-Asymmophobia
19 0.68355834 152 hunch net-2006-01-30-Should the Input Representation be a Vector?
20 0.68019575 345 hunch net-2009-03-08-Prediction Science
topicId topicWeight
[(3, 0.028), (10, 0.037), (16, 0.04), (27, 0.234), (37, 0.024), (38, 0.023), (40, 0.15), (53, 0.115), (55, 0.077), (64, 0.018), (83, 0.019), (94, 0.09), (95, 0.036)]
simIndex simValue blogId blogTitle
same-blog 1 0.91764635 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning
2 0.87122238 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem
Introduction: “Deep learning” is used to describe learning architectures which have significant depth (as a circuit). One claim is that shallow architectures (one or two layers) cannot concisely represent some functions, while a circuit with more depth can concisely represent these same functions. Proving lower bounds on the size of a circuit is substantially harder than upper bounds (which are constructive), but some results are known. Luca Trevisan’s class notes detail how XOR is not concisely representable by “AC0” (= constant-depth unbounded fan-in AND, OR, NOT gates). This doesn’t quite prove that depth is necessary for the representations commonly used in learning (such as a thresholded weighted sum), but it is strongly suggestive that this is so. Examples like this are a bit disheartening because existing algorithms for deep learning (deep belief nets, gradient descent on deep neural networks, and perhaps decision trees, depending on who you ask) can’t learn XOR very easily.
3 0.86978698 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
Introduction: Muthu invited me to the workshop on algorithms in the field, with the goal of providing a sense of where near-term research should go. When the time came though, I bargained for a post instead, which provides a chance for many other people to comment. There are several things I didn’t fully understand when I went to Yahoo! about 5 years ago. I’d like to repeat them as people in academia may not yet understand them intuitively. Almost all the big impact algorithms operate in pseudo-linear or better time. Think about caching, hashing, sorting, filtering, etc… and you have a sense of what some of the most heavily used algorithms are. This matters quite a bit to Machine Learning research, because people often work with superlinear time algorithms and languages. Two very common examples of this are graphical models, where inference is often a superlinear operation—think about the n^2 dependence on the number of states in a Hidden Markov Model and Kernelized Support Vecto
4 0.85697794 347 hunch net-2009-03-26-Machine Learning is too easy
5 0.85436332 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
Introduction: Many people in Machine Learning don’t fully understand the impact of computation, as demonstrated by a lack of big-O analysis of new learning algorithms. This is important—some current active research programs are fundamentally flawed w.r.t. computation, and other research programs are directly motivated by it. When considering a learning algorithm, I think about the following questions: How does the learning algorithm scale with the number of examples m? Any algorithm using all of the data is at least O(m), but in many cases this is O(m^2) (naive nearest neighbor for self-prediction) or unknown (k-means or many other optimization algorithms). The unknown case is very common, and it can mean (for example) that the algorithm isn’t convergent or simply that the amount of computation isn’t controlled. The above question can also be asked for test cases. In some applications, test-time performance is of great importance. How does the algorithm scale with the number of
6 0.85409981 19 hunch net-2005-02-14-Clever Methods of Overfitting
7 0.85082912 370 hunch net-2009-09-18-Necessary and Sufficient Research
8 0.85055637 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class
9 0.84951049 201 hunch net-2006-08-07-The Call of the Deep
10 0.84931576 95 hunch net-2005-07-14-What Learning Theory might do
11 0.84807485 98 hunch net-2005-07-27-Not goal metrics
12 0.84774595 351 hunch net-2009-05-02-Wielding a New Abstraction
13 0.84682065 194 hunch net-2006-07-11-New Models
14 0.8460421 207 hunch net-2006-09-12-Incentive Compatible Reviewing
15 0.84600353 332 hunch net-2008-12-23-Use of Learning Theory
16 0.8451165 343 hunch net-2009-02-18-Decision by Vetocracy
17 0.84469754 358 hunch net-2009-06-01-Multitask Poisoning
18 0.84394377 134 hunch net-2005-12-01-The Webscience Future
19 0.84335577 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity
20 0.84320712 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models