hunch_net hunch_net-2009 hunch_net-2009-360 knowledge-graph by maker-knowledge-mining

360 hunch net-2009-06-15-In Active Learning, the question changes


meta info for this blog

Source: html

Introduction: A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”. At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. In other words, the fundamental question was “can we do it?” The nature of the question has fundamentally changed in my mind. The answer to the previous question is “yes”, both information-theoretically and computationally, in most places where supervised learning could be applied. In many situations, the question has now changed to: “is it worth it?” Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? Currently, there are situations where this question could go either way. Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”. [sent-1, score-0.671]

2 At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. [sent-2, score-0.576]

3 The nature of the question has fundamentally changed in my mind. [sent-4, score-0.328]

4 The answer to the previous question is “yes”, both information-theoretically and computationally, in most places where supervised learning could be applied. [sent-5, score-0.744]

5 In many situations, the question has now changed to: “is it worth it?” [sent-6, score-0.508]

6 Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? [sent-7, score-0.865]

7 Currently, there are situations where this question could go either way. [sent-8, score-0.402]

8 Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile. [sent-9, score-0.599]

9 At the active learning tutorial, I stated a set of somewhat more precise research questions that I don’t yet have answers to, and which I believe are worth answering. [sent-10, score-1.003]

10 Is active learning possible in a fully adversarial setting? [sent-12, score-0.85]

11 By fully adversarial, I mean when an adversary controls all the algorithm’s observations. [sent-13, score-0.369]

12 Is there an efficient and effective reduction of active learning to supervised learning? [sent-15, score-1.071]

13 The bootstrap IWAL approach is efficient but not effective in some situations where other approaches can succeed. [sent-16, score-0.54]

14 The algorithm here is a reduction to a special kind of supervised learning where you can specify both examples and constraints. [sent-17, score-0.575]

15 For many supervised learning algorithms, adding constraints seems problematic. [sent-18, score-0.392]

16 Can active learning succeed with alternate labeling oracles? [sent-19, score-0.842]

17 The ones I see people trying to use in practice often differ because they can provide answers of varying specificity and cost, or because some oracles are good for some questions, but not good for others. [sent-20, score-0.427]

18 At this point, there have been several successful applications of active learning, but that’s not the same thing as succeeding with more robust algorithms. [sent-21, score-0.748]

19 Can we succeed empirically with more robust algorithms? [sent-22, score-0.281]

20 And is the empirical cost of additional robustness worth the empirical peace-of-mind that your learning algorithm won’t go astray where other more aggressive approaches may do so? [sent-23, score-0.975]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('active', 0.43), ('supervised', 0.229), ('oracles', 0.199), ('worth', 0.18), ('question', 0.178), ('theoretically', 0.166), ('cost', 0.156), ('changed', 0.15), ('succeed', 0.15), ('questions', 0.149), ('adversarial', 0.137), ('situations', 0.137), ('robust', 0.131), ('fully', 0.125), ('reduction', 0.11), ('empirical', 0.11), ('efficient', 0.11), ('succeeding', 0.106), ('savings', 0.106), ('moved', 0.106), ('effective', 0.104), ('nicolo', 0.1), ('bootstrap', 0.1), ('alternate', 0.095), ('claudio', 0.095), ('controls', 0.091), ('solid', 0.091), ('approaches', 0.089), ('learning', 0.088), ('go', 0.087), ('overhead', 0.085), ('answer', 0.083), ('saying', 0.083), ('varying', 0.083), ('figuring', 0.081), ('successful', 0.081), ('robustness', 0.079), ('labeling', 0.079), ('algorithms', 0.078), ('algorithm', 0.076), ('sanjoy', 0.075), ('adding', 0.075), ('adversary', 0.075), ('differ', 0.074), ('yet', 0.073), ('special', 0.072), ('answers', 0.071), ('possible', 0.07), ('study', 0.07), ('worthwhile', 0.07)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 360 hunch net-2009-06-15-In Active Learning, the question changes


2 0.36828837 432 hunch net-2011-04-20-The End of the Beginning of Active Learning

Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting : an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities. The combination of these simple ideas removes the sampling bias problem that has plagued many previous he
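The selection rule described above can be sketched in a few lines; this is an illustrative simplification rather than the algorithm from the post — the disagreement computation over a finite hypothesis list and the probability floor `p_min` are assumptions added here:

```python
import random

def iwal_sample(x, hypotheses, p_min=0.1):
    """Decide whether to query the label of x, and return the
    importance weight to apply if it is queried.

    The query probability is proportional to how much a finite set of
    (assumed near-optimal) hypotheses disagree on x; queried examples
    are weighted by 1/p so the labeled sample stays unbiased.
    """
    preds = [h(x) for h in hypotheses]
    # Fraction of hypotheses deviating from the majority prediction.
    majority = max(preds.count(v) for v in set(preds))
    disagreement = 1.0 - majority / len(preds)
    p = max(p_min, disagreement)  # floor keeps importance weights bounded
    if random.random() < p:
        return True, 1.0 / p   # query the oracle; weight the example by 1/p
    return False, 0.0          # skip; no label cost incurred
```

The floor `p_min` bounds the variance of the importance weights; the actual algorithm uses a more careful measure of usefulness for distinguishing near-optimal classifiers than this raw disagreement fraction.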

3 0.30489388 127 hunch net-2005-11-02-Progress in Active Learning

Introduction: Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning . This is an update on the progress I know of. As a refresher, active learning as meant here is: There is a source of unlabeled data. There is an oracle from which labels can be requested for unlabeled data produced by the source. The goal is to perform well with minimal use of the oracle. Here is what I’ve learned: Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier). Nina , Alina , and I developed an algorithm for active learning relying on only the assumption of IID data. A draft is here . Nicolo , Claudio , and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here . This was published a year or two ago and I r
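The protocol in the refresher above (an unlabeled source, a label oracle, minimal oracle use) can be illustrated with a toy pool-based loop; the 1-D threshold learner, the least-confidence query rule, and the hidden oracle boundary are all assumptions for illustration, not anything from the referenced work:

```python
def train(labeled):
    """Toy 1-D learner: threshold halfway between the largest labeled
    negative and the smallest labeled positive; returns a signed margin."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    t = (min(pos) + max(neg)) / 2.0 if pos and neg else 0.5
    return lambda x: x - t

def oracle(x):
    # Hidden labeling rule standing in for the oracle in the text.
    return 1 if x >= 0.7 else 0

def active_learn(pool, n_queries):
    """Pool-based active learning loop: repeatedly query the point the
    current model is least certain about, then retrain."""
    labeled = []
    model = train(labeled)
    for _ in range(n_queries):
        x = min(pool, key=lambda p: abs(model(p)))  # least-confident point
        pool.remove(x)
        labeled.append((x, oracle(x)))
        model = train(labeled)
    return model, labeled
```

Run on an evenly spaced pool, the queries concentrate near the oracle's decision boundary rather than being spread uniformly, which is where the label savings over passive supervised learning come from.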

4 0.19508386 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

Introduction: I learned a number of things at NIPS . The financial people were there in greater force than previously. Two Sigma sponsored NIPS while DRW Trading had a booth. The adversarial machine learning workshop had a number of talks about interesting applications where an adversary really is out to try and mess up your learning algorithm. This is very different from the situation we often think of where the world is oblivious to our learning. This may present new and convincing applications for the learning-against-an-adversary work common at COLT . There were several interesing papers. Sanjoy Dasgupta , Daniel Hsu , and Claire Monteleoni had a paper on General Agnostic Active Learning . The basic idea is that active learning can be done via reduction to a form of supervised learning problem. This is great, because we have many supervised learning algorithms from which the benefits of active learning may be derived. Joseph Bradley and Robert Schapire had a P

5 0.16934831 293 hunch net-2008-03-23-Interactive Machine Learning

Introduction: A new direction of research seems to be arising in machine learning: Interactive Machine Learning. This isn’t a familiar term, although it does include some familiar subjects. What is Interactive Machine Learning? The fundamental requirement is (a) learning algorithms which interact with the world and (b) learn. For our purposes, let’s define learning as efficiently competing with a large set of possible predictors. Examples include: Online learning against an adversary ( Avrim’s Notes ). The interaction is almost trivial: the learning algorithm makes a prediction and then receives feedback. The learning is choosing based upon the advice of many experts. Active Learning . In active learning, the interaction is choosing which examples to label, and the learning is choosing from amongst a large set of hypotheses. Contextual Bandits . The interaction is choosing one of several actions and learning only the value of the chosen action (weaker than active learning
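The contextual bandit interaction mentioned above — choose one action, observe only that action's value — can be sketched with an epsilon-greedy loop. Ignoring the context when estimating action values is a deliberate simplification here, and the epsilon value is an arbitrary choice:

```python
import random

def epsilon_greedy(rounds, n_actions, epsilon=0.1):
    """Bandit-style interaction: each round, pick one action and observe
    only that action's reward (weaker feedback than a full label).

    `rounds` is a list of per-action reward vectors; a real contextual
    policy would also condition on a context, which this sketch omits.
    """
    totals = [0.0] * n_actions
    counts = [0] * n_actions
    earned = 0.0
    for rewards in rounds:
        if random.random() < epsilon:
            a = random.randrange(n_actions)          # explore
        else:
            est = [totals[i] / counts[i] if counts[i] else 0.0
                   for i in range(n_actions)]
            a = max(range(n_actions), key=est.__getitem__)  # exploit
        earned += rewards[a]     # only rewards[a] is ever observed
        totals[a] += rewards[a]
        counts[a] += 1
    return earned
```

With no exploration (`epsilon=0`) the learner can lock onto a bad initial action forever, which is exactly why bandit feedback is a strictly harder interaction than the full-information feedback of active or supervised learning.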

6 0.14019001 45 hunch net-2005-03-22-Active learning

7 0.13958345 388 hunch net-2010-01-24-Specializations of the Master Problem

8 0.13342255 299 hunch net-2008-04-27-Watchword: Supervised Learning

9 0.13230559 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

10 0.13092586 309 hunch net-2008-07-10-Interesting papers, ICML 2008

11 0.13039866 345 hunch net-2009-03-08-Prediction Science

12 0.12976898 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

13 0.12842715 454 hunch net-2012-01-30-ICML Posters and Scope

14 0.12779605 332 hunch net-2008-12-23-Use of Learning Theory

15 0.12483842 183 hunch net-2006-06-14-Explorations of Exploration

16 0.1213557 400 hunch net-2010-06-13-The Good News on Exploration and Learning

17 0.11953881 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

18 0.1180876 95 hunch net-2005-07-14-What Learning Theory might do

19 0.11616437 347 hunch net-2009-03-26-Machine Learning is too easy

20 0.10774988 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.253), (1, 0.101), (2, -0.05), (3, -0.01), (4, 0.15), (5, -0.044), (6, 0.002), (7, -0.034), (8, 0.015), (9, 0.118), (10, 0.194), (11, -0.024), (12, -0.051), (13, 0.216), (14, -0.191), (15, 0.018), (16, -0.01), (17, 0.022), (18, -0.062), (19, 0.009), (20, -0.024), (21, 0.063), (22, 0.135), (23, -0.047), (24, 0.033), (25, -0.23), (26, 0.077), (27, -0.025), (28, 0.017), (29, 0.0), (30, 0.108), (31, 0.035), (32, 0.005), (33, -0.039), (34, 0.018), (35, 0.019), (36, -0.005), (37, -0.05), (38, -0.073), (39, 0.088), (40, 0.006), (41, -0.108), (42, -0.063), (43, -0.047), (44, 0.018), (45, -0.002), (46, 0.095), (47, -0.037), (48, -0.023), (49, 0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97012073 360 hunch net-2009-06-15-In Active Learning, the question changes


2 0.90009862 432 hunch net-2011-04-20-The End of the Beginning of Active Learning


3 0.84360266 127 hunch net-2005-11-02-Progress in Active Learning


4 0.73069507 338 hunch net-2009-01-23-An Active Learning Survey

Introduction: Burr Settles wrote a fairly comprehensive survey of active learning . He intends to maintain and update the survey, so send him any suggestions you have.

5 0.69374698 293 hunch net-2008-03-23-Interactive Machine Learning


6 0.68911827 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

7 0.6632024 45 hunch net-2005-03-22-Active learning

8 0.63139623 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

9 0.62177938 299 hunch net-2008-04-27-Watchword: Supervised Learning

10 0.60774577 400 hunch net-2010-06-13-The Good News on Exploration and Learning

11 0.57973647 309 hunch net-2008-07-10-Interesting papers, ICML 2008

12 0.57965142 183 hunch net-2006-06-14-Explorations of Exploration

13 0.54382104 143 hunch net-2005-12-27-Automated Labeling

14 0.53485501 168 hunch net-2006-04-02-Mad (Neuro)science

15 0.49027687 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

16 0.48141646 161 hunch net-2006-03-05-“Structural” Learning

17 0.48095065 351 hunch net-2009-05-02-Wielding a New Abstraction

18 0.47777477 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy

19 0.47557172 332 hunch net-2008-12-23-Use of Learning Theory

20 0.4730598 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(9, 0.043), (10, 0.037), (15, 0.103), (27, 0.28), (38, 0.059), (53, 0.058), (55, 0.097), (64, 0.021), (94, 0.1), (95, 0.115)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95538187 360 hunch net-2009-06-15-In Active Learning, the question changes


2 0.92881894 343 hunch net-2009-02-18-Decision by Vetocracy

Introduction: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. This is most easily seen by comparison: Paper Banditron Offset Tree Notes Problem Scope Multiclass problems where only the loss of one choice can be probed. Strictly greater: Cost sensitive multiclass problems where only the loss of one choice can be probed. Often generalizations don’t matter. That’s not the case here, since every plausible application I’ve thought of involves loss functions substantially different from 0/1. What’s new Analysis and Experiments Algorithm, Analysis, and Experiments As far as I know, the essence of the more general problem was first stated and analyzed with the EXP4 algorithm (page 16) (1998). It’s also the time horizon 1 simplification of the Reinforcement Learning setting for the random trajectory method (page 15) (2002). The Banditron algorithm itself is functionally identi

3 0.92648685 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

Introduction: How do you create an optimal environment for research? Here are some essential ingredients that I see. Stability . University-based research is relatively good at this. On any particular day, researchers face choices in what they will work on. A very common tradeoff is between: easy small difficult big For researchers without stability, the ‘easy small’ option wins. This is often “ok”—a series of incremental improvements on the state of the art can add up to something very beneficial. However, it misses one of the big potentials of research: finding entirely new and better ways of doing things. Stability comes in many forms. The prototypical example is tenure at a university—a tenured professor is almost imposssible to fire which means that the professor has the freedom to consider far horizon activities. An iron-clad guarantee of a paycheck is not necessary—industrial research labs have succeeded well with research positions of indefinite duration. Atnt rese

4 0.92322189 36 hunch net-2005-03-05-Funding Research

Introduction: The funding of research (and machine learning research) is an issue which seems to have become more significant in the United States over the last decade. The word “research” is applied broadly here to science, mathematics, and engineering. There are two essential difficulties with funding research: Longshot Paying a researcher is often a big gamble. Most research projects don’t pan out, but a few big payoffs can make it all worthwhile. Information Only Much of research is about finding the right way to think about or do something. The Longshot difficulty means that there is high variance in payoffs. This can be compensated for by funding many different research projects, reducing variance. The Information-Only difficulty means that it’s hard to extract a profit directly from many types of research, so companies have difficulty justifying basic research. (Patents are a mechanism for doing this. They are often extraordinarily clumsy or simply not applicable.) T

5 0.91749138 235 hunch net-2007-03-03-All Models of Learning have Flaws

Introduction: Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning. The point here is not simply “woe unto us”. There are several implications which seem important. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework which often becomes there “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students. Algorithms which conform to multiple approaches c

6 0.91635358 194 hunch net-2006-07-11-New Models

7 0.91609025 406 hunch net-2010-08-22-KDD 2010

8 0.91474438 371 hunch net-2009-09-21-Netflix finishes (and starts)

9 0.91437918 95 hunch net-2005-07-14-What Learning Theory might do

10 0.91362375 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

11 0.91031325 432 hunch net-2011-04-20-The End of the Beginning of Active Learning

12 0.90900457 370 hunch net-2009-09-18-Necessary and Sufficient Research

13 0.90819567 351 hunch net-2009-05-02-Wielding a New Abstraction

14 0.9070704 12 hunch net-2005-02-03-Learning Theory, by assumption

15 0.9066515 220 hunch net-2006-11-27-Continuizing Solutions

16 0.9066084 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

17 0.90613049 225 hunch net-2007-01-02-Retrospective

18 0.90288293 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

19 0.90258443 347 hunch net-2009-03-26-Machine Learning is too easy

20 0.90222239 464 hunch net-2012-05-03-Microsoft Research, New York City