hunch_net hunch_net-2005 hunch_net-2005-127 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning. This is an update on the progress I know of. As a refresher, active learning as meant here is: There is a source of unlabeled data. There is an oracle from which labels can be requested for unlabeled data produced by the source. The goal is to perform well with minimal use of the oracle. Here is what I’ve learned: Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier). Nina, Alina, and I developed an algorithm for active learning relying on only the assumption of IID data. A draft is here. Nicolo, Claudio, and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here. This was published a year or two ago and I recently learned about it.
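To make the setting just described concrete, here is a minimal pool-based sketch of the protocol: a source of unlabeled points, an oracle that charges one call per label, and a budget on oracle calls. The two-cluster data, the nearest-class-mean learner, and the margin-based query rule are all assumptions made for this sketch; it is not any of the specific algorithms discussed in the post.

```python
import numpy as np

# Pool-based active learning sketch: unlabeled pool, labeling oracle, label budget.
# The nearest-class-mean learner and margin-based query rule are illustrative only.

rng = np.random.default_rng(0)

# Source of unlabeled data: two 2-D Gaussian clusters; labels stay hidden.
X = np.vstack([rng.normal(-1.0, 1.0, (500, 2)), rng.normal(1.0, 1.0, (500, 2))])
hidden_y = np.array([0] * 500 + [1] * 500)

def oracle(i):
    """Return the label of pool point i; each call costs one query."""
    return int(hidden_y[i])

def margin(X, means):
    """Signed margin of a nearest-class-mean classifier (>0 predicts class 1)."""
    d = np.stack([np.linalg.norm(X - m, axis=1) for m in means], axis=1)
    return d[:, 0] - d[:, 1]

labeled = {0: oracle(0), 999: oracle(999)}   # seed: one labeled point per cluster
for _ in range(30):                          # label budget: 30 more oracle calls
    means = [X[[i for i, y in labeled.items() if y == c]].mean(axis=0) for c in (0, 1)]
    uncertainty = np.abs(margin(X, means))
    uncertainty[list(labeled)] = np.inf      # never re-query a labeled point
    i = int(np.argmin(uncertainty))          # most uncertain unlabeled point
    labeled[i] = oracle(i)                   # one more oracle call

means = [X[[i for i, y in labeled.items() if y == c]].mean(axis=0) for c in (0, 1)]
accuracy = ((margin(X, means) > 0).astype(int) == hidden_y).mean()
print(f"{len(labeled)} labels used, pool accuracy {accuracy:.2f}")
```

The loop only shows where the oracle budget enters; the results summarized in this post differ precisely in which query rule is used and which assumptions (IID data, realizability, or a fully adversarial source) it can tolerate.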
sentIndex sentText sentNum sentScore
1 Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning. [sent-1, score-0.667]
2 As a refresher, active learning as meant here is: There is a source of unlabeled data. [sent-3, score-0.598]
3 There is an oracle from which labels can be requested for unlabeled data produced by the source. [sent-4, score-0.487]
4 The goal is to perform well with minimal use of the oracle. [sent-5, score-0.177]
5 Here is what I’ve learned: Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier). [sent-6, score-1.206]
6 Nina, Alina, and I developed an algorithm for active learning relying on only the assumption of IID data. [sent-7, score-0.643]
7 Nicolo, Claudio, and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here. [sent-9, score-1.123]
8 This was published a year or two ago and I recently learned about it. [sent-10, score-0.207]
9 All of these results are relatively ‘rough’: they don’t necessarily make good algorithms as stated (although the last one has a few experiments). [sent-11, score-0.299]
10 None of these results are directly comparable because the assumptions vary. [sent-12, score-0.462]
11 Comparing the assumptions and the results leads to a number of remaining questions: Do the sufficient and semi-necessary conditions apply to the IID-only case? [sent-13, score-0.897]
12 Is there a generic algorithm for any hypothesis space that works in the fully adversarial setting? [sent-15, score-0.433]
13 What are special cases of these algorithms which are computationally tractable and useful? [sent-16, score-0.164]
14 The Foundations of Active Learning workshop at NIPS should be a good place to discuss these questions. [sent-17, score-0.078]
wordName wordTfidf (topN-words)
[('active', 0.343), ('adversarial', 0.246), ('iid', 0.221), ('assumptions', 0.218), ('conditions', 0.193), ('sanjoy', 0.18), ('unlabeled', 0.167), ('developed', 0.164), ('classifiers', 0.154), ('results', 0.139), ('progress', 0.137), ('luca', 0.136), ('relying', 0.136), ('sufficient', 0.134), ('learned', 0.122), ('nicolo', 0.119), ('requested', 0.119), ('questions', 0.118), ('remaining', 0.114), ('claudio', 0.114), ('nina', 0.114), ('threshold', 0.109), ('generic', 0.105), ('foundations', 0.105), ('comparable', 0.105), ('pointed', 0.105), ('oracle', 0.102), ('setting', 0.101), ('leads', 0.099), ('produced', 0.099), ('comparing', 0.099), ('case', 0.097), ('minimal', 0.096), ('draft', 0.094), ('rough', 0.092), ('showed', 0.092), ('meant', 0.088), ('special', 0.086), ('published', 0.085), ('alina', 0.085), ('bits', 0.082), ('hypothesis', 0.082), ('none', 0.082), ('stated', 0.081), ('perform', 0.081), ('necessarily', 0.079), ('discuss', 0.078), ('tractable', 0.078), ('update', 0.078), ('entirely', 0.078)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999982 127 hunch net-2005-11-02-Progress in Active Learning
2 0.30489388 360 hunch net-2009-06-15-In Active Learning, the question changes
Introduction: A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”. At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. In other words, the fundamental question was “can we do it?” The nature of the question has fundamentally changed in my mind. The answer to the previous question is “yes”, both information theoretically and computationally, in most places where supervised learning could be applied. In many situations, the question has now changed to: “is it worth it?” Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? Currently, there are situations where this question could go either way. Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.
3 0.27394575 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting: an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities. The combination of these simple ideas removes the sampling bias problem that has plagued many previous heuristics. A toy sketch of the importance-weighting idea appears after this list.
4 0.19851036 45 hunch net-2005-03-22-Active learning
Introduction: Often, unlabeled data is easy to come by but labels are expensive. For instance, if you’re building a speech recognizer, it’s easy enough to get raw speech samples (just walk around with a microphone), but labeling even one of these samples is a tedious process in which a human must examine the speech signal and carefully segment it into phonemes. In the field of active learning, the goal is as usual to construct an accurate classifier, but the labels of the data points are initially hidden and there is a charge for each label you want revealed. The hope is that by intelligent adaptive querying, you can get away with significantly fewer labels than you would need in a regular supervised learning framework. Here’s an example. Suppose the data lie on the real line, and the classifiers are simple thresholding functions, H = {h_w}: h_w(x) = 1 if x > w, and 0 otherwise. VC theory tells us that if the underlying distribution P can be classified perfectly by some hypothesis in H… A binary-search sketch of this thresholding example appears after this list.
5 0.17665508 183 hunch net-2006-06-14-Explorations of Exploration
Introduction: Exploration is one of the big unsolved problems in machine learning. This isn’t for lack of trying: there are many models of exploration which have been analyzed in many different ways by many different groups of people. At some point, it is worthwhile to sit back and see what has been done across these many models. Reinforcement Learning (1). Reinforcement learning has traditionally focused on Markov Decision Processes where the next state s’ is given by a conditional distribution P(s’|s,a) given the current state s and action a. The typical result here is that certain specific algorithms controlling an agent can behave within e of optimal for horizon T except for poly(1/e,T,S,A) “wasted” experiences (with high probability). This started with E^3 by Satinder Singh and Michael Kearns. Sham Kakade’s thesis has significant discussion. Extensions have typically been of the form “under extra assumptions, we can prove more”, for example Factored-E^3 and
6 0.17558891 7 hunch net-2005-01-31-Watchword: Assumption
7 0.1640573 388 hunch net-2010-01-24-Specializations of the Master Problem
8 0.15647165 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
9 0.15341231 8 hunch net-2005-02-01-NIPS: Online Bayes
10 0.1514042 57 hunch net-2005-04-16-Which Assumptions are Reasonable?
11 0.14372492 12 hunch net-2005-02-03-Learning Theory, by assumption
12 0.12414001 43 hunch net-2005-03-18-Binomial Weighting
13 0.12057876 235 hunch net-2007-03-03-All Models of Learning have Flaws
14 0.1179532 143 hunch net-2005-12-27-Automated Labeling
15 0.11575945 309 hunch net-2008-07-10-Interesting papers, ICML 2008
16 0.11169729 347 hunch net-2009-03-26-Machine Learning is too easy
17 0.10997969 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)
18 0.10933144 332 hunch net-2008-12-23-Use of Learning Theory
19 0.10604473 333 hunch net-2008-12-27-Adversarial Academia
20 0.10331739 251 hunch net-2007-06-24-Interesting Papers at ICML 2007
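Entry 3 above (disagreement region-based sampling with importance weighting) lends itself to a small illustration. The sketch below shows only the inverse-probability-weighting idea on a stream: each example is queried with some probability, and queried labels are weighted by the inverse of that probability so that rarely queried regions are not under-counted when estimating a classifier's error. The threshold hypothesis class, the particular query-probability rule, and all constants are assumptions of the sketch; this is not the algorithm from the referenced post.

```python
import numpy as np

# Toy sketch of inverse-probability weighting for label queries on a stream.
# Threshold classifiers 1[x > w] on [0, 1]; the query-probability rule and all
# constants below are illustrative assumptions, not the referenced algorithm.

rng = np.random.default_rng(1)
true_w = 0.4                                   # hidden threshold

def iw_error(w, xs, ys, wts):
    """Self-normalized importance-weighted error estimate of 1[x > w]."""
    preds = (np.asarray(xs) > w).astype(int)
    return float(np.average(preds != np.asarray(ys), weights=wts))

xs, ys, wts = [], [], []
w_hat, queries = 0.5, 0
grid = np.linspace(0.0, 1.0, 101)              # candidate thresholds

for _ in range(2000):                          # stream of unlabeled examples
    x = rng.uniform()
    # query probability: never below 0.05, larger near the current boundary
    p = max(0.05, 1.0 - 4.0 * abs(x - w_hat))
    if rng.uniform() < p:
        queries += 1
        xs.append(x)
        ys.append(int(x > true_w))             # ask the oracle (noise-free here)
        wts.append(1.0 / p)                    # inverse-probability weight
        w_hat = grid[int(np.argmin([iw_error(w, xs, ys, wts) for w in grid]))]

print(f"queried {queries} of 2000 labels, estimated threshold {w_hat:.2f}")
```

Even with this made-up query rule, the inverse-probability weights let every candidate threshold be compared on a common footing despite the biased sampling, which is the point of the importance-weighting trick.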
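Entry 4 above works through the classic thresholding example, H = {h_w} with h_w(x) = 1 if x > w. A hedged sketch of why active learning helps in that example: when the pool is separable by some threshold, the boundary between 0-labels and 1-labels can be located by binary search over the sorted unlabeled points, using roughly log2(n) oracle calls rather than one label per point. The uniform pool and the noise-free oracle below are assumptions of the sketch.

```python
import numpy as np

# Binary search for a consistent threshold over a separable pool.
# The uniform pool and noise-free oracle are assumptions of this sketch.

rng = np.random.default_rng(2)
true_w = 0.62
pool = np.sort(rng.uniform(0.0, 1.0, 1000))    # unlabeled points on the real line

def oracle(x):
    return int(x > true_w)                     # one label per call, no noise

calls = 0
lo, hi = 0, len(pool) - 1
# with 1000 uniform draws the pool straddles true_w; a robust version would
# first spend two calls checking that label(pool[lo]) = 0 and label(pool[hi]) = 1
while lo + 1 < hi:
    mid = (lo + hi) // 2
    calls += 1
    if oracle(pool[mid]) == 1:
        hi = mid                               # boundary is at or below mid
    else:
        lo = mid                               # boundary is above mid

w_hat = float(pool[lo] + pool[hi]) / 2         # any threshold in the gap is consistent
print(f"{calls} oracle calls versus {len(pool)} points, w_hat = {w_hat:.3f}")
```

With 1000 pool points the loop uses about 10 oracle calls, which is the kind of label-efficiency gap this example is typically used to illustrate.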
topicId topicWeight
[(0, 0.211), (1, 0.109), (2, -0.022), (3, -0.076), (4, 0.188), (5, -0.015), (6, 0.034), (7, -0.03), (8, 0.034), (9, 0.067), (10, 0.209), (11, 0.007), (12, 0.011), (13, 0.127), (14, -0.155), (15, -0.045), (16, -0.091), (17, 0.047), (18, -0.151), (19, 0.001), (20, -0.047), (21, 0.008), (22, 0.152), (23, -0.058), (24, 0.018), (25, -0.22), (26, 0.061), (27, -0.035), (28, 0.095), (29, -0.046), (30, 0.051), (31, 0.013), (32, -0.11), (33, 0.008), (34, 0.117), (35, 0.027), (36, 0.02), (37, -0.01), (38, 0.064), (39, 0.017), (40, -0.047), (41, -0.043), (42, -0.031), (43, -0.098), (44, 0.056), (45, -0.033), (46, 0.097), (47, 0.018), (48, 0.021), (49, 0.058)]
simIndex simValue blogId blogTitle
same-blog 1 0.97869474 127 hunch net-2005-11-02-Progress in Active Learning
2 0.81784755 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
3 0.80699742 360 hunch net-2009-06-15-In Active Learning, the question changes
4 0.74050498 338 hunch net-2009-01-23-An Active Learning Survey
Introduction: Burr Settles wrote a fairly comprehensive survey of active learning. He intends to maintain and update the survey, so send him any suggestions you have.
5 0.70192337 45 hunch net-2005-03-22-Active learning
6 0.62813973 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)
7 0.5956803 183 hunch net-2006-06-14-Explorations of Exploration
8 0.59072948 7 hunch net-2005-01-31-Watchword: Assumption
9 0.58634019 57 hunch net-2005-04-16-Which Assumptions are Reasonable?
10 0.56758463 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
11 0.52695847 143 hunch net-2005-12-27-Automated Labeling
12 0.52424252 126 hunch net-2005-10-26-Fallback Analysis is a Secret to Useful Algorithms
13 0.51511776 293 hunch net-2008-03-23-Interactive Machine Learning
14 0.51418918 309 hunch net-2008-07-10-Interesting papers, ICML 2008
15 0.47555187 12 hunch net-2005-02-03-Learning Theory, by assumption
16 0.4755289 43 hunch net-2005-03-18-Binomial Weighting
17 0.46104822 299 hunch net-2008-04-27-Watchword: Supervised Learning
18 0.45059949 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design
19 0.44056651 332 hunch net-2008-12-23-Use of Learning Theory
20 0.43633085 319 hunch net-2008-10-01-NIPS 2008 workshop on ‘Learning over Empirical Hypothesis Spaces’
topicId topicWeight
[(27, 0.249), (38, 0.021), (53, 0.038), (55, 0.057), (94, 0.058), (95, 0.478)]
simIndex simValue blogId blogTitle
1 0.9854601 390 hunch net-2010-03-12-Netflix Challenge 2 Canceled
Introduction: The second Netflix prize is canceled due to privacy problems. I continue to believe my original assessment of this paper, that the privacy break was somewhat overstated. I still haven’t seen any serious privacy failures on the scale of the AOL search log release. I expect privacy concerns to continue to be a big issue when dealing with data releases by companies or governments. The theory of maintaining privacy while using data is improving, but it is not yet in a state where the limits of what’s possible are clear, let alone how to achieve these limits in a manner friendly to a prediction competition.
2 0.98127294 319 hunch net-2008-10-01-NIPS 2008 workshop on ‘Learning over Empirical Hypothesis Spaces’
Introduction: This workshop asks for insights into how far we may/can push the theoretical boundary of using data in the design of learning machines. Can we express our classification rule in terms of the sample, or do we have to stick to a core assumption of classical statistical learning theory, namely that the hypothesis space is to be defined independently of the sample? This workshop is particularly interested in – but not restricted to – the ‘luckiness framework’ and the recently introduced notion of ‘compatibility functions’ in a semi-supervised learning context (more information can be found at http://www.kuleuven.be/wehys).
3 0.98099309 30 hunch net-2005-02-25-Why Papers?
Introduction: Makc asked a good question in comments: “Why bother to make a paper, at all?” There are several reasons for writing papers which may not be immediately obvious to people not in academia. The basic idea is that papers have considerably more utility than the obvious “present an idea”. Papers are formalized units of work. Academics (especially young ones) are often judged on the number of papers they produce. Papers have a formalized method of citing and crediting others: the bibliography. Academics (especially older ones) are often judged on the number of citations they receive. Papers enable a “more fair” anonymous review. Conferences receive many papers, from which a subset are selected. Discussion forums are inherently not anonymous for anyone who wants to build a reputation for good work. Papers are an excuse to meet your friends. Papers are the content of conferences, but much of what you do is talk to friends about interesting problems while there. Sometimes yo
4 0.97418404 389 hunch net-2010-02-26-Yahoo! ML events
Introduction: Yahoo! is sponsoring two machine learning events that might interest people. The Key Scientific Challenges program (due March 5) for Machine Learning and Statistics offers $5K (plus bonuses) for graduate students working on a core problem of interest to Y! If you are already working on one of these problems, there is no reason not to submit, and if you aren’t you might want to think about it for next year, as I am confident they all press the boundary of the possible in Machine Learning. There are 7 days left. The Learning to Rank challenge (due May 31) offers an $8K first prize for the best ranking algorithm on a real (and really used) dataset for search ranking, with presentations at an ICML workshop. Unlike the Netflix competition, there are prizes for 2nd, 3rd, and 4th place, perhaps avoiding the heartbreak the ensemble encountered. If you think you know how to rank, you should give it a try, and we might all learn something. There are 3 months left.
same-blog 5 0.96370602 127 hunch net-2005-11-02-Progress in Active Learning
6 0.95885491 456 hunch net-2012-02-24-ICML+50%
7 0.93280661 479 hunch net-2013-01-31-Remote large scale learning class participation
8 0.92051315 344 hunch net-2009-02-22-Effective Research Funding
9 0.91005355 373 hunch net-2009-10-03-Static vs. Dynamic multiclass prediction
10 0.79375815 105 hunch net-2005-08-23-(Dis)similarities between academia and open source programmers
11 0.75947046 462 hunch net-2012-04-20-Both new: STOC workshops and NEML
12 0.74634409 234 hunch net-2007-02-22-Create Your Own ICML Workshop
13 0.72306263 7 hunch net-2005-01-31-Watchword: Assumption
14 0.68587279 36 hunch net-2005-03-05-Funding Research
15 0.68425524 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class
16 0.68135232 445 hunch net-2011-09-28-Somebody’s Eating Your Lunch
17 0.68068123 464 hunch net-2012-05-03-Microsoft Research, New York City
18 0.67385632 275 hunch net-2007-11-29-The Netflix Crack
19 0.6720829 455 hunch net-2012-02-20-Berkeley Streaming Data Workshop
20 0.66838205 360 hunch net-2009-06-15-In Active Learning, the question changes