hunch_net-2006-158 knowledge-graph by maker-knowledge-mining

158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning


meta info for this blog

Source: html

Introduction: There are several different flavors of Machine Learning classes. Many classes are of the ‘zoo’ sort: many different learning algorithms are presented. Others avoid the zoo by not covering the full scope of machine learning. This is my view of what makes a good machine learning class, along with why. I’d like to specifically invite comment on whether things are missing, misemphasized, or misplaced. Phase Subject Why? Introduction What is a machine learning problem? A good understanding of the characteristics of machine learning problems seems essential. Characteristics include: a data source, some hope the data is predictive, and a need for generalization. This is probably best taught in a case study manner: lay out the specifics of some problem and then ask “Is this a machine learning problem?” Introduction Machine Learning Problem Identification Identification and recognition of the type of learning problems is (obviously) a very important step in solving such problems.


Summary: the most important sentences generated by the tfidf model (a sketch of one plausible scoring scheme follows the list)

sentIndex sentText sentNum sentScore

1 Many classes are of the ‘zoo’ sort: many different learning algorithms are presented. [sent-2, score-0.216]

2 Others avoid the zoo by not covering the full scope of machine learning. [sent-3, score-0.396]

3 This is my view of what makes a good machine learning class, along with why. [sent-4, score-0.274]

4 Introduction What is a machine learning problem? [sent-7, score-0.274]

5 A good understanding of the characteristics of machine learning problems seems essential. [sent-8, score-0.484]

6 This is probably best taught in a case study manner: lay out the specifics of some problem and then ask “Is this a machine learning problem? [sent-10, score-0.425]

7 Introduction Machine Learning Problem Identification Identification and recognition of the type of learning problems is (obviously) a very important step in solving such problems. [sent-11, score-0.212]

8 Introduction Example algorithm 1 To really understand machine learning, a couple learning algorithms must be understood in detail. [sent-13, score-0.517]

9 The reason why the number is “2” and not “1” or “3” is that 2 is the minimum number required to make people naturally aware of the degrees of freedom available in learning algorithm design. [sent-15, score-0.366]

10 Analysis Bias for Learning The need for a good bias is one of the defining characteristics of learning. [sent-16, score-0.506]

11 This statement is generic so it will always apply to one degree or another. [sent-18, score-0.242]

12 This is the boosting observation: that it is possible to bootstrap predictive ability to create a better overall system. [sent-20, score-0.272]

13 Analysis Learning can be transformed This is the reductions observation: that the ability to solve one kind of learning problem implies the ability to solve other kinds of learning problems. [sent-22, score-0.675]

14 Analysis Learning can be preserved This is the online learning with experts observation: that we can have a master algorithm which preserves the best learning performance of subalgorithms. [sent-24, score-0.439]

15 Analysis Hardness of Learning It turns out that there are several different ways in which machine learning can be hard, including computational and information-theoretic hardness. [sent-28, score-0.347]

16 An understanding of how and why learning algorithms can fail seems important to understand the process. [sent-30, score-0.288]

17 Applications Vision One example of how learning is applied to solve vision problems. [sent-31, score-0.329]

18 Applications Robotics Ditto for robotics Applications Speech Ditto for speech Applications Businesses Ditto for businesses Where is machine learning going? [sent-33, score-0.661]

19 It should be understood that the field of machine learning is changing rapidly. [sent-35, score-0.366]

20 The emphasis here is on fundamentals: generally applicable mathematical statements and understandings of the learning problem. [sent-36, score-0.343]
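The scores above are data produced by the mining pipeline. As a rough illustration of the technique, a sentence can be scored by the average tfidf weight of its terms. The following is a minimal sketch in Python, assuming scikit-learn's TfidfVectorizer; the mean-weight scoring rule and all variable names are assumptions for illustration, not the pipeline's actual code.

# Hypothetical sketch: rank sentences by mean tfidf weight of their terms.
# The scoring rule is an assumption, not the miner's actual implementation.
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "Many classes are of the 'zoo' sort: many different learning algorithms are presented.",
    "Others avoid the zoo by not covering the full scope of machine learning.",
    "This is my view of what makes a good machine learning class, along with why.",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(sentences)  # sparse matrix: (n_sentences, n_terms)

# Score each sentence by the mean weight of its nonzero terms.
scores = []
for i in range(tfidf.shape[0]):
    row = tfidf.getrow(i)
    scores.append((row.sum() / max(row.nnz, 1), i))

# Highest-scoring sentences first, as in the list above.
for score, i in sorted(scores, reverse=True):
    print(f"{score:.3f}  {sentences[i]}")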


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('ditto', 0.378), ('introduction', 0.269), ('characteristics', 0.21), ('applications', 0.197), ('analysis', 0.196), ('zoo', 0.189), ('statement', 0.177), ('businesses', 0.155), ('identification', 0.14), ('learning', 0.14), ('bias', 0.136), ('machine', 0.134), ('robotics', 0.126), ('emphasis', 0.119), ('observation', 0.115), ('speech', 0.106), ('ability', 0.104), ('vision', 0.101), ('predictive', 0.095), ('need', 0.093), ('understood', 0.092), ('similarly', 0.09), ('solve', 0.088), ('preserves', 0.084), ('integrity', 0.084), ('understandings', 0.084), ('degrees', 0.084), ('language', 0.081), ('leanring', 0.078), ('flavors', 0.078), ('specifics', 0.078), ('algorithms', 0.076), ('algorithm', 0.075), ('transformed', 0.073), ('lay', 0.073), ('bootstrap', 0.073), ('overfit', 0.073), ('theoretic', 0.073), ('avoid', 0.073), ('important', 0.072), ('cut', 0.07), ('concept', 0.07), ('etc', 0.07), ('classification', 0.069), ('freedom', 0.067), ('phase', 0.067), ('defining', 0.067), ('priors', 0.065), ('generic', 0.065), ('insert', 0.065)]
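A hedged sketch of how (word, weight) pairs like these, and the similarity ranking below, might be produced: fit a tfidf model over all posts, read off the top-weighted terms for this post, and rank the other posts by cosine similarity against it. The posts dictionary and its contents are placeholders, not the miner's real data or code.

# Sketch: top-N tfidf terms for one post, plus cosine-similar posts.
# `posts` is placeholder data; the real pipeline's preprocessing is unknown.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {
    158: "machine learning class zoo algorithms introduction analysis applications",
    332: "learning theory algorithm design bounds practice applications",
    235: "models of learning frameworks flaws kernels bayes pac learning",
}
ids = list(posts)

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(posts.values())

# Top-weighted terms for post 158 (row 0), like the (word, weight) list above.
terms = vectorizer.get_feature_names_out()
row = tfidf.getrow(0).toarray().ravel()
top = sorted(zip(terms, row), key=lambda t: -t[1])[:10]
print([(w, round(float(v), 3)) for w, v in top if v > 0])

# Similarity of every post to post 158, like the ranked list below.
sims = cosine_similarity(tfidf.getrow(0), tfidf).ravel()
for i in sims.argsort()[::-1]:
    print(f"{sims[i]:.8f}  {ids[i]}")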

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning


2 0.14866036 332 hunch net-2008-12-23-Use of Learning Theory

Introduction: I’ve had serious conversations with several people who believe that the theory in machine learning is “only useful for getting papers published”. That’s a compelling statement, as I’ve seen many papers where the algorithm clearly came first, and the theoretical justification for it came second, purely as a perceived means to improve the chance of publication. Naturally, I disagree and believe that learning theory has much more substantial applications. Even in core learning algorithm design, I’ve found learning theory to be useful, although its application is more subtle than many realize. The most straightforward applications can fail, because (as expectation suggests) worst case bounds tend to be loose in practice (*). In my experience, considering learning theory when designing an algorithm has two important effects in practice: It can help make your algorithm behave right at a crude level of analysis, leaving finer details to tuning or common sense. The best example

3 0.14810413 235 hunch net-2007-03-03-All Models of Learning have Flaws

Introduction: Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning. The point here is not simply “woe unto us”. There are several implications which seem important. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework which often becomes their “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students. Algorithms which conform to multiple approaches c

4 0.14640154 347 hunch net-2009-03-26-Machine Learning is too easy

Introduction: One of the remarkable things about machine learning is how diverse it is. The viewpoints of Bayesian learning, reinforcement learning, graphical models, supervised learning, unsupervised learning, genetic programming, etc… share little enough overlap that many people can and do make their careers within one without touching, or even necessarily understanding the others. There are two fundamental reasons why this is possible. For many problems, many approaches work in the sense that they do something useful. This is true empirically, where for many problems we can observe that many different approaches yield better performance than any constant predictor. It’s also true in theory, where we know that for any set of predictors representable in a finite amount of RAM, minimizing training error over the set of predictors does something nontrivial when there are a sufficient number of examples. There is nothing like a unifying problem defining the field. In many other areas there

5 0.14055836 14 hunch net-2005-02-07-The State of the Reduction

Introduction: What? Reductions are machines which turn solvers for one problem into solvers for another problem. Why? Reductions are useful for several reasons. Laziness. Reducing a problem to classification makes at least 10 learning algorithms available to solve a problem. Inventing 10 learning algorithms is quite a bit of work. Similarly, programming a reduction is often trivial, while programming a learning algorithm is a great deal of work. Crystallization. The problems we often want to solve in learning are worst-case-impossible, but average case feasible. By reducing all problems onto one or a few primitives, we can fine tune these primitives to perform well on real-world problems with greater precision due to the greater number of problems to validate on. Theoretical Organization. By studying what reductions are easy vs. hard vs. impossible, we can learn which problems are roughly equivalent in difficulty and which are much harder. What we know now. Typesafe r

6 0.12612534 12 hunch net-2005-02-03-Learning Theory, by assumption

7 0.12296529 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

8 0.12282889 148 hunch net-2006-01-13-Benchmarks for RL

9 0.12077426 388 hunch net-2010-01-24-Specializations of the Master Problem

10 0.11582124 351 hunch net-2009-05-02-Wielding a New Abstraction

11 0.11463309 56 hunch net-2005-04-14-Families of Learning Theory Statements

12 0.11327409 90 hunch net-2005-07-07-The Limits of Learning Theory

13 0.11258768 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability

14 0.11224008 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

15 0.10934982 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

16 0.10603522 237 hunch net-2007-04-02-Contextual Scaling

17 0.10594858 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

18 0.10535058 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms

19 0.10461318 454 hunch net-2012-01-30-ICML Posters and Scope

20 0.10408168 343 hunch net-2009-02-18-Decision by Vetocracy


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.267), (1, 0.122), (2, -0.048), (3, 0.014), (4, 0.037), (5, -0.072), (6, 0.048), (7, 0.042), (8, 0.05), (9, -0.028), (10, -0.027), (11, -0.096), (12, 0.069), (13, 0.025), (14, 0.016), (15, 0.066), (16, 0.094), (17, -0.045), (18, -0.047), (19, -0.062), (20, 0.079), (21, -0.036), (22, -0.012), (23, -0.141), (24, -0.025), (25, -0.059), (26, 0.046), (27, -0.001), (28, -0.042), (29, 0.058), (30, -0.013), (31, 0.034), (32, -0.015), (33, -0.012), (34, -0.035), (35, -0.016), (36, -0.013), (37, -0.017), (38, 0.07), (39, 0.011), (40, 0.007), (41, 0.061), (42, -0.027), (43, -0.017), (44, -0.08), (45, -0.006), (46, -0.006), (47, -0.012), (48, 0.038), (49, -0.023)]
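These (topicId, topicWeight) pairs are the post's coordinates in a latent semantic space. As a rough sketch of the technique: LSI is a truncated SVD of the tfidf matrix, so a 50-dimensional topic vector can be obtained as below. This assumes scikit-learn's TruncatedSVD and a placeholder corpus; it is an illustration, not the miner's actual code.

# Sketch: LSI = truncated SVD of the tfidf matrix. Placeholder corpus.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "machine learning class zoo algorithms",
    "learning theory bounds practice algorithms",
    "reductions classification online learning experts",
] * 20

tfidf = TfidfVectorizer().fit_transform(corpus)

# 50 topics in the output above; clamped here so the toy corpus still runs.
n_topics = min(50, tfidf.shape[1] - 1)
lsi = TruncatedSVD(n_components=n_topics, random_state=0)
topic_vectors = lsi.fit_transform(tfidf)  # shape: (n_posts, n_topics)

# (topicId, topicWeight) pairs for the first post, as listed above.
print([(t, round(float(w), 3)) for t, w in enumerate(topic_vectors[0])])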

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.9542523 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning


2 0.796462 148 hunch net-2006-01-13-Benchmarks for RL

Introduction: A couple years ago, Drew Bagnell and I started the RLBench project to setup a suite of reinforcement learning benchmark problems. We haven’t been able to touch it (due to lack of time) for a year so the project is on hold. Luckily, there are several other projects such as CLSquare and RL-Glue with a similar goal, and we strongly endorse their continued development. I would like to explain why, especially in the context of criticism of other learning benchmarks. For example, sometimes the UCI Machine Learning Repository is criticized. There are two criticisms I know of: Learning algorithms have overfit to the problems in the repository. It is easy to imagine a mechanism for this happening unintentionally. Strong evidence of this would be provided by learning algorithms which perform great on the UCI machine learning repository but very badly (relative to other learning algorithms) on non-UCI learning problems. I have seen little evidence of this but it remains a possibility.

3 0.78886229 347 hunch net-2009-03-26-Machine Learning is too easy


4 0.77853179 235 hunch net-2007-03-03-All Models of Learning have Flaws


5 0.75228238 332 hunch net-2008-12-23-Use of Learning Theory


6 0.74488425 12 hunch net-2005-02-03-Learning Theory, by assumption

7 0.74346185 6 hunch net-2005-01-27-Learning Complete Problems

8 0.72812963 168 hunch net-2006-04-02-Mad (Neuro)science

9 0.72443235 351 hunch net-2009-05-02-Wielding a New Abstraction

10 0.71729565 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms

11 0.7171368 202 hunch net-2006-08-10-Precision is not accuracy

12 0.71612602 28 hunch net-2005-02-25-Problem: Online Learning

13 0.71519202 104 hunch net-2005-08-22-Do you believe in induction?

14 0.71002001 95 hunch net-2005-07-14-What Learning Theory might do

15 0.70099771 253 hunch net-2007-07-06-Idempotent-capable Predictors

16 0.69447213 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

17 0.69373226 133 hunch net-2005-11-28-A question of quantification

18 0.68733281 348 hunch net-2009-04-02-Asymmophobia

19 0.68355834 152 hunch net-2006-01-30-Should the Input Representation be a Vector?

20 0.68019575 345 hunch net-2009-03-08-Prediction Science


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.028), (10, 0.037), (16, 0.04), (27, 0.234), (37, 0.024), (38, 0.023), (40, 0.15), (53, 0.115), (55, 0.077), (64, 0.018), (83, 0.019), (94, 0.09), (95, 0.036)]
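The lda weights are sparse: only topics above a small threshold appear, which is why the topic ids skip around. A minimal sketch of the technique, assuming scikit-learn's LatentDirichletAllocation over bag-of-words counts (the real pipeline's model and preprocessing are unknown):

# Sketch: per-post topic distribution from LDA, printed sparsely
# like the (topicId, topicWeight) list above. Placeholder corpus.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "machine learning class zoo algorithms",
    "learning theory bounds assumptions",
    "vision robotics speech applications",
] * 20

counts = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=10, random_state=0)
doc_topics = lda.fit_transform(counts)  # each row sums to 1

# Keep only topics with non-negligible weight, as the list above does.
post = doc_topics[0]
print([(t, round(float(w), 3)) for t, w in enumerate(post) if w > 0.01])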

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.91764635 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning


2 0.87122238 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem

Introduction: “Deep learning” is used to describe learning architectures which have significant depth (as a circuit). One claim is that shallow architectures (one or two layers) can not concisely represent some functions while a circuit with more depth can concisely represent these same functions. Proving lower bounds on the size of a circuit is substantially harder than upper bounds (which are constructive), but some results are known. Luca Trevisan’s class notes detail how XOR is not concisely representable by “AC0” (= constant depth unbounded fan-in AND, OR, NOT gates). This doesn’t quite prove that depth is necessary for the representations commonly used in learning (such as a thresholded weighted sum), but it is strongly suggestive that this is so. Examples like this are a bit disheartening because existing algorithms for deep learning (deep belief nets, gradient descent on deep neural networks, and perhaps decision trees, depending on who you ask) can’t learn XOR very easily.

3 0.86978698 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

Introduction: Muthu invited me to the workshop on algorithms in the field, with the goal of providing a sense of where near-term research should go. When the time came though, I bargained for a post instead, which provides a chance for many other people to comment. There are several things I didn’t fully understand when I went to Yahoo! about 5 years ago. I’d like to repeat them as people in academia may not yet understand them intuitively. Almost all the big impact algorithms operate in pseudo-linear or better time. Think about caching, hashing, sorting, filtering, etc… and you have a sense of what some of the most heavily used algorithms are. This matters quite a bit to Machine Learning research, because people often work with superlinear time algorithms and languages. Two very common examples of this are graphical models, where inference is often a superlinear operation—think about the n^2 dependence on the number of states in a Hidden Markov Model and Kernelized Support Vector Machines.

4 0.85697794 347 hunch net-2009-03-26-Machine Learning is too easy


5 0.85436332 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

Introduction: Many people in Machine Learning don’t fully understand the impact of computation, as demonstrated by a lack of big-O analysis of new learning algorithms. This is important—some current active research programs are fundamentally flawed w.r.t. computation, and other research programs are directly motivated by it. When considering a learning algorithm, I think about the following questions: How does the learning algorithm scale with the number of examples m? Any algorithm using all of the data is at least O(m), but in many cases this is O(m^2) (naive nearest neighbor for self-prediction) or unknown (k-means or many other optimization algorithms). The unknown case is very common, and it can mean (for example) that the algorithm isn’t convergent or simply that the amount of computation isn’t controlled. The above question can also be asked for test cases. In some applications, test-time performance is of great importance. How does the algorithm scale with the number of

6 0.85409981 19 hunch net-2005-02-14-Clever Methods of Overfitting

7 0.85082912 370 hunch net-2009-09-18-Necessary and Sufficient Research

8 0.85055637 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

9 0.84951049 201 hunch net-2006-08-07-The Call of the Deep

10 0.84931576 95 hunch net-2005-07-14-What Learning Theory might do

11 0.84807485 98 hunch net-2005-07-27-Not goal metrics

12 0.84774595 351 hunch net-2009-05-02-Wielding a New Abstraction

13 0.84682065 194 hunch net-2006-07-11-New Models

14 0.8460421 207 hunch net-2006-09-12-Incentive Compatible Reviewing

15 0.84600353 332 hunch net-2008-12-23-Use of Learning Theory

16 0.8451165 343 hunch net-2009-02-18-Decision by Vetocracy

17 0.84469754 358 hunch net-2009-06-01-Multitask Poisoning

18 0.84394377 134 hunch net-2005-12-01-The Webscience Future

19 0.84335577 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

20 0.84320712 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models