hunch_net hunch_net-2005 hunch_net-2005-2 knowledge-graph by maker-knowledge-mining

2 hunch net-2005-01-24-Holy grails of machine learning?


meta info for this blog

Source: html

Introduction: Let me kick things off by posing this question to ML researchers: What do you think are some important holy grails of machine learning? For example: – “A classifier with SVM-level performance but much more scalable” – “Practical confidence bounds (or learning bounds) for classification” – “A reinforcement learning algorithm that can handle the ___ problem” – “Understanding theoretically why ___ works so well in practice” etc. I pose this question because I believe that when goals are stated explicitly and well (thus providing clarity as well as opening up the problems to more people), rather than left implicit, they are likely to be achieved much more quickly. I would also like to know more about the internal goals of the various machine learning sub-areas (theory, kernel methods, graphical models, reinforcement learning, etc) as stated by people in these respective areas. This could help people cross sub-areas.


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Let me kick things off by posing this question to ML researchers: What do you think are some important holy grails of machine learning? [sent-1, score-0.687]

2 I pose this question because I believe that when goals are stated explicitly and well (thus providing clarity as well as opening up the problems to more people), rather than left implicit, they are likely to be achieved much more quickly. [sent-3, score-2.41]

3 I would also like to know more about the internal goals of the various machine learning sub-areas (theory, kernel methods, graphical models, reinforcement learning, etc) as stated by people in these respective areas. [sent-4, score-1.485]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('goals', 0.273), ('stated', 0.264), ('opening', 0.223), ('posing', 0.223), ('reinforcement', 0.217), ('bounds', 0.2), ('clarity', 0.195), ('kick', 0.195), ('pose', 0.186), ('scalable', 0.179), ('implicit', 0.173), ('thus', 0.167), ('theoretically', 0.162), ('internal', 0.154), ('achieved', 0.147), ('cross', 0.142), ('question', 0.14), ('graphical', 0.137), ('confidence', 0.134), ('handle', 0.134), ('left', 0.13), ('well', 0.13), ('providing', 0.126), ('kernel', 0.123), ('explicitly', 0.119), ('likely', 0.11), ('practical', 0.107), ('researchers', 0.106), ('practice', 0.105), ('people', 0.104), ('classifier', 0.103), ('let', 0.103), ('ml', 0.101), ('etc', 0.093), ('much', 0.091), ('classification', 0.091), ('models', 0.091), ('help', 0.09), ('performance', 0.088), ('methods', 0.087), ('works', 0.087), ('various', 0.082), ('believe', 0.082), ('understanding', 0.079), ('theory', 0.067), ('learning', 0.066), ('machine', 0.065), ('important', 0.064), ('could', 0.064), ('rather', 0.064)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 2 hunch net-2005-01-24-Holy grails of machine learning?


2 0.18123776 41 hunch net-2005-03-15-The State of Tight Bounds

Introduction: What? Bounds are mathematical formulas relating observations to future error rates assuming that data is drawn independently. In classical statistics, they are called confidence intervals. Why? Good Judgement. In many applications of learning, it is desirable to know how well the learned predictor works in the future. This helps you decide if the problem is solved or not. Learning Essence. The form of some of these bounds helps you understand what the essence of learning is. Algorithm Design. Some of these bounds suggest, motivate, or even directly imply learning algorithms. What We Know Now There are several families of bounds, based on how information is used. Testing Bounds. These are methods which use labeled data not used in training to estimate the future error rate. Examples include the test set bound, progressive validation also here and here, train and test bounds, and cross-validation (but see the big open problem). These tec

3 0.15438792 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

Introduction: At an intuitive level, the question here is “Can reinforcement learning be solved with classification?” Problem Construct a reinforcement learning algorithm with near-optimal expected sum of rewards in the direct experience model given access to a classifier learning algorithm which has a small error rate or regret on all posed classification problems. The definition of “posed” here is slightly murky. I consider a problem “posed” if there is an algorithm for constructing labeled classification examples. Past Work There exists a reduction of reinforcement learning to classification given a generative model. A generative model is an inherently stronger assumption than the direct experience model. Other work on learning reductions may be important. Several algorithms for solving reinforcement learning in the direct experience model exist. Most, such as E^3, Factored-E^3, metric-E^3, and Rmax, require that the observation be the state. Recent work

4 0.13495059 332 hunch net-2008-12-23-Use of Learning Theory

Introduction: I’ve had serious conversations with several people who believe that the theory in machine learning is “only useful for getting papers published”. That’s a compelling statement, as I’ve seen many papers where the algorithm clearly came first, and the theoretical justification for it came second, purely as a perceived means to improve the chance of publication. Naturally, I disagree and believe that learning theory has much more substantial applications. Even in core learning algorithm design, I’ve found learning theory to be useful, although its application is more subtle than many realize. The most straightforward applications can fail, because (as expectation suggests) worst case bounds tend to be loose in practice (*). In my experience, considering learning theory when designing an algorithm has two important effects in practice: It can help make your algorithm behave right at a crude level of analysis, leaving finer details to tuning or common sense. The best example

5 0.12299822 100 hunch net-2005-08-04-Why Reinforcement Learning is Important

Introduction: One prescription for solving a problem well is: State the problem, in the simplest way possible. In particular, this statement should involve no contamination with or anticipation of the solution. Think about solutions to the stated problem. Stating a problem in a succinct and crisp manner tends to invite a simple elegant solution. When a problem cannot be stated succinctly, we wonder if the problem is even understood. (And when a problem is not understood, we wonder if a solution can be meaningful.) Reinforcement learning does step (1) well. It provides a clean simple language to state general AI problems. In reinforcement learning there is a set of actions A, a set of observations O, and a reward r. The reinforcement learning problem, in general, is defined by a conditional measure D(o, r | (o,r,a)*) which produces an observation o and a reward r given a history (o,r,a)*. The goal in reinforcement learning is to find a policy pi: (o,r,a)* -> a
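The interaction loop defined in the excerpt above can be sketched in a few lines; the environment stands in for the conditional measure D(o, r | (o,r,a)*), and `policy` plays the role of pi: (o,r,a)* -> a. The names and dynamics here are illustrative assumptions, not from the post:

```python
import random

def environment(history):
    """Produce an observation and reward given the history (o, r, a)*."""
    o = random.choice(["left", "right"])
    # reward the agent when its previous action matches the new observation
    r = 1.0 if history and history[-1][2] == o else 0.0
    return o, r

def policy(history, o):
    """Choose the next action from the history and latest observation;
    here, a trivial policy that repeats what was just observed."""
    return o

history, total = [], 0.0
for _ in range(10):
    o, r = environment(history)
    total += r
    history.append((o, r, policy(history, o)))
```

The point of the formalism is exactly this separation: the problem is fully specified by the environment's conditional measure, and a solution is any policy with high expected sum of rewards.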

6 0.11829712 22 hunch net-2005-02-18-What it means to do research.

7 0.11163518 347 hunch net-2009-03-26-Machine Learning is too easy

8 0.10711751 3 hunch net-2005-01-24-The Humanloop Spectrum of Machine Learning

9 0.10424498 148 hunch net-2006-01-13-Benchmarks for RL

10 0.1008238 289 hunch net-2008-02-17-The Meaning of Confidence

11 0.10036343 360 hunch net-2009-06-15-In Active Learning, the question changes

12 0.098014772 26 hunch net-2005-02-21-Problem: Cross Validation

13 0.096483737 213 hunch net-2006-10-08-Incompatibilities between classical confidence intervals and learning.

14 0.094654061 95 hunch net-2005-07-14-What Learning Theory might do

15 0.093169473 170 hunch net-2006-04-06-Bounds greater than 1

16 0.09178327 351 hunch net-2009-05-02-Wielding a New Abstraction

17 0.091530301 456 hunch net-2012-02-24-ICML+50%

18 0.090914309 454 hunch net-2012-01-30-ICML Posters and Scope

19 0.087877221 183 hunch net-2006-06-14-Explorations of Exploration

20 0.086962201 235 hunch net-2007-03-03-All Models of Learning have Flaws


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.2), (1, 0.07), (2, -0.009), (3, 0.016), (4, 0.041), (5, -0.064), (6, 0.075), (7, 0.008), (8, 0.036), (9, -0.088), (10, 0.039), (11, 0.062), (12, 0.019), (13, 0.077), (14, 0.078), (15, -0.01), (16, 0.091), (17, 0.042), (18, -0.081), (19, 0.007), (20, 0.056), (21, 0.013), (22, -0.093), (23, 0.012), (24, -0.027), (25, -0.067), (26, -0.114), (27, 0.001), (28, -0.069), (29, 0.08), (30, 0.033), (31, -0.069), (32, -0.062), (33, -0.142), (34, -0.039), (35, 0.137), (36, 0.058), (37, -0.083), (38, 0.041), (39, -0.058), (40, 0.081), (41, -0.048), (42, -0.048), (43, -0.018), (44, 0.021), (45, 0.032), (46, 0.01), (47, 0.083), (48, -0.056), (49, 0.025)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94446766 2 hunch net-2005-01-24-Holy grails of machine learning?


2 0.63446558 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

Introduction: At an intuitive level, the question here is “Can reinforcement learning be solved with classification?” Problem Construct a reinforcement learning algorithm with near-optimal expected sum of rewards in the direct experience model given access to a classifier learning algorithm which has a small error rate or regret on all posed classification problems. The definition of “posed” here is slightly murky. I consider a problem “posed” if there is an algorithm for constructing labeled classification examples. Past Work There exists a reduction of reinforcement learning to classification given a generative model. A generative model is an inherently stronger assumption than the direct experience model. Other work on learning reductions may be important. Several algorithms for solving reinforcement learning in the direct experience model exist. Most, such as E^3, Factored-E^3, metric-E^3, and Rmax, require that the observation be the state. Recent work

3 0.61580461 41 hunch net-2005-03-15-The State of Tight Bounds

Introduction: What? Bounds are mathematical formulas relating observations to future error rates assuming that data is drawn independently. In classical statistics, they are called confidence intervals. Why? Good Judgement. In many applications of learning, it is desirable to know how well the learned predictor works in the future. This helps you decide if the problem is solved or not. Learning Essence. The form of some of these bounds helps you understand what the essence of learning is. Algorithm Design. Some of these bounds suggest, motivate, or even directly imply learning algorithms. What We Know Now There are several families of bounds, based on how information is used. Testing Bounds. These are methods which use labeled data not used in training to estimate the future error rate. Examples include the test set bound, progressive validation also here and here, train and test bounds, and cross-validation (but see the big open problem). These tec

4 0.55943567 31 hunch net-2005-02-26-Problem: Reductions and Relative Ranking Metrics

Introduction: This, again, is something of a research direction rather than a single problem. There are several metrics people care about which depend upon the relative ranking of examples and there are sometimes good reasons to care about such metrics. Examples include AROC, “F1”, the proportion of the time that the top ranked element is in some class, the proportion of the top 10 examples in some class (google’s problem), the lowest ranked example of some class, and the “sort distance” from a predicted ranking to a correct ranking. See here for an example of some of these. Problem What does the ability to classify well imply about performance under these metrics? Past Work Probabilistic classification under squared error can be solved with a classifier. A counterexample shows this does not imply a good AROC. Sample complexity bounds for AROC (and here). A paper on “Learning to Order Things”. Difficulty Several of these may be easy. Some of them may be h

5 0.55406243 26 hunch net-2005-02-21-Problem: Cross Validation

Introduction: The essential problem here is the large gap between experimental observation and theoretical understanding. Method K-fold cross validation is a commonly used technique which takes a set of m examples and partitions them into K sets (“folds”) of size m/K. For each fold, a classifier is trained on the other folds and then tested on the fold. Problem Assume only independent samples. Derive a classifier from the K classifiers with a small bound on the true error rate. Past Work (I’ll add more as I remember/learn.) Devroye, Rogers, and Wagner analyzed cross validation and found algorithm specific bounds. Not all of this is online, but here is one paper. Michael Kearns and Dana Ron analyzed cross validation and found that under additional stability assumptions the bound for the classifier which learns on all the data is not much worse than for a test set of size m/K. Avrim Blum, Adam Kalai, and myself analyzed cross validation and found tha
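The K-fold procedure described in the excerpt above can be sketched as follows; `train` and `error` are hypothetical stand-ins for a learning algorithm and an error measure, not anything from the post:

```python
def k_fold_cv(examples, K, train, error):
    """Average held-out error over K folds of size m/K: for each fold,
    train on the other K-1 folds and evaluate on the held-out fold."""
    fold_size = len(examples) // K
    total = 0.0
    for k in range(K):
        lo, hi = k * fold_size, (k + 1) * fold_size
        held_out = examples[lo:hi]
        classifier = train(examples[:lo] + examples[hi:])
        total += error(classifier, held_out)
    return total / K

# toy usage: a majority-label "classifier" on (x, y) pairs
def train(data):
    labels = [y for _, y in data]
    return 1 if labels.count(1) >= labels.count(0) else 0

def error(label, data):
    return sum(1 for _, y in data if y != label) / len(data)

print(k_fold_cv([(i, 1) for i in range(10)], 5, train, error))  # 0.0
```

The open problem in the excerpt is precisely about what can be proved for the classifier ultimately derived from the K per-fold classifiers, which this sketch leaves unspecified.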

6 0.54513222 3 hunch net-2005-01-24-The Humanloop Spectrum of Machine Learning

7 0.52786374 347 hunch net-2009-03-26-Machine Learning is too easy

8 0.51961124 351 hunch net-2009-05-02-Wielding a New Abstraction

9 0.51557857 148 hunch net-2006-01-13-Benchmarks for RL

10 0.51345742 131 hunch net-2005-11-16-The Everything Ensemble Edge

11 0.51267719 100 hunch net-2005-08-04-Why Reinforcement Learning is Important

12 0.50339401 44 hunch net-2005-03-21-Research Styles in Machine Learning

13 0.49091172 18 hunch net-2005-02-12-ROC vs. Accuracy vs. AROC

14 0.48951086 332 hunch net-2008-12-23-Use of Learning Theory

15 0.48104975 183 hunch net-2006-06-14-Explorations of Exploration

16 0.47846046 22 hunch net-2005-02-18-What it means to do research.

17 0.47533962 230 hunch net-2007-02-02-Thoughts regarding “Is machine learning different from statistics?”

18 0.46896595 95 hunch net-2005-07-14-What Learning Theory might do

19 0.45935825 77 hunch net-2005-05-29-Maximum Margin Mismatch?

20 0.45190361 247 hunch net-2007-06-14-Interesting Papers at COLT 2007


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.141), (53, 0.69), (55, 0.047)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.99234676 107 hunch net-2005-09-05-Site Update

Introduction: I tweaked the site in a number of ways today, including: Updating to WordPress 1.5. Installing and heavily tweaking the Geekniche theme. Update: I switched back to a tweaked version of the old theme. Adding the Customizable Post Listings plugin. Installing the StatTraq plugin. Updating some of the links. I particularly recommend looking at the computer research policy blog. Adding threaded comments . This doesn’t thread old comments obviously, but the extra structure may be helpful for new ones. Overall, I think this is an improvement, and it addresses a few of my earlier problems . If you have any difficulties or anything seems “not quite right”, please speak up. A few other tweaks to the site may happen in the near future.

2 0.98846424 56 hunch net-2005-04-14-Families of Learning Theory Statements

Introduction: The diagram above shows a very broad viewpoint of learning theory.

arrow | Typical statement | Examples
Past->Past | Some prediction algorithm A does almost as well as any of a set of algorithms. | Weighted Majority
Past->Future | Assuming independent samples, past performance predicts future performance. | PAC analysis, ERM analysis
Future->Future | Future prediction performance on subproblems implies future prediction performance using algorithm A. | ECOC, Probing

A basic question is: Are there other varieties of statements of this type? Avrim noted that there are also “arrows between arrows”: generic methods for transforming between Past->Past statements and Past->Future statements. Are there others?

3 0.98749262 16 hunch net-2005-02-09-Intuitions from applied learning

Introduction: Since learning is far from an exact science, it’s good to pay attention to basic intuitions of applied learning. Here are a few I’ve collected. Integration In Bayesian learning, the posterior is computed by an integral, and the optimal thing to do is to predict according to this integral. This phenomenon seems to be far more general. Bagging, Boosting, SVMs, and Neural Networks all take advantage of this idea to some extent. The phenomenon is more general: you can average over many different classification predictors to improve performance. Sources: Zoubin, Caruana Differentiation Different pieces of an average should differentiate to achieve good performance by different methods. This is known as the ‘symmetry breaking’ problem for neural networks, and it’s why weights are initialized randomly. Boosting explicitly attempts to achieve good differentiation by creating new, different, learning problems. Sources: Yann LeCun, Phil Long Deep Representation Ha
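The integration intuition above can be illustrated with a toy sketch: majority-voting several different classifiers is one simple way to "average over many different classification predictors". The predictors here are made-up stand-ins:

```python
def vote(predictors, x):
    """Predict by majority vote over a list of classifiers."""
    votes = [p(x) for p in predictors]
    return max(set(votes), key=votes.count)

# three slightly different threshold rules on integers; voting smooths
# out their disagreement near the boundary
p1 = lambda x: x > 3
p2 = lambda x: x > 5
p3 = lambda x: x > 4

print(vote([p1, p2, p3], 6))  # True
print(vote([p1, p2, p3], 2))  # False
```

Bagging and boosting differ in how the individual predictors are produced and weighted, but both ultimately combine predictors in this averaged fashion.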

same-blog 4 0.96186441 2 hunch net-2005-01-24-Holy grails of machine learning?


5 0.95924503 91 hunch net-2005-07-10-Thinking the Unthought

Introduction: One thing common to much research is that the researcher must be the first person ever to have some thought. How do you think of something that has never been thought of? There seems to be no methodical manner of doing this, but there are some tricks. The easiest method is to just have some connection come to you. There is a trick here however: you should write it down and fill out the idea immediately because it can just as easily go away. A harder method is to set aside a block of time and simply think about an idea. Distraction elimination is essential here because thinking about the unthought is hard work which your mind will avoid. Another common method is in conversation. Sometimes the process of verbalizing means new ideas come up and sometimes whoever you are talking to replies just the right way. This method is dangerous though—you must speak to someone who helps you think rather than someone who occupies your thoughts. Try to rephrase the problem so the a

6 0.94044721 367 hunch net-2009-08-16-Centmail comments

7 0.93896401 145 hunch net-2005-12-29-Deadline Season

8 0.89359945 6 hunch net-2005-01-27-Learning Complete Problems

9 0.75264251 21 hunch net-2005-02-17-Learning Research Programs

10 0.63015985 151 hunch net-2006-01-25-1 year

11 0.62905926 201 hunch net-2006-08-07-The Call of the Deep

12 0.61762762 191 hunch net-2006-07-08-MaxEnt contradicts Bayes Rule?

13 0.61453104 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning

14 0.60871702 265 hunch net-2007-10-14-NIPS workshp: Learning Problem Design

15 0.60731089 141 hunch net-2005-12-17-Workshops as Franchise Conferences

16 0.57539701 152 hunch net-2006-01-30-Should the Input Representation be a Vector?

17 0.56472701 407 hunch net-2010-08-23-Boosted Decision Trees for Deep Learning

18 0.5635469 321 hunch net-2008-10-19-NIPS 2008 workshop on Kernel Learning

19 0.55477899 283 hunch net-2008-01-07-2008 Summer Machine Learning Conference Schedule

20 0.55355924 292 hunch net-2008-03-15-COLT Open Problems