hunch_net hunch_net-2007 hunch_net-2007-279 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: I learned a number of things at NIPS. The financial people were there in greater force than previously. Two Sigma sponsored NIPS while DRW Trading had a booth. The adversarial machine learning workshop had a number of talks about interesting applications where an adversary really is out to try and mess up your learning algorithm. This is very different from the situation we often think of where the world is oblivious to our learning. This may present new and convincing applications for the learning-against-an-adversary work common at COLT. There were several interesting papers. Sanjoy Dasgupta, Daniel Hsu, and Claire Monteleoni had a paper on General Agnostic Active Learning. The basic idea is that active learning can be done via reduction to a form of supervised learning problem. This is great, because we have many supervised learning algorithms from which the benefits of active learning may be derived. Joseph Bradley and Robert Schapire had a P
sentIndex sentText sentNum sentScore
1 The financial people were there in greater force than previously. [sent-2, score-0.325]
2 Two Sigma sponsored NIPS while DRW Trading had a booth. [sent-3, score-0.135]
3 The adversarial machine learning workshop had a number of talks about interesting applications where an adversary really is out to try and mess up your learning algorithm. [sent-4, score-0.899]
4 This is very different from the situation we often think of where the world is oblivious to our learning. [sent-5, score-0.083]
5 This may present new and convincing applications for the learning-against-an-adversary work common at COLT. [sent-6, score-0.217]
6 Sanjoy Dasgupta, Daniel Hsu, and Claire Monteleoni had a paper on General Agnostic Active Learning. [sent-8, score-0.173]
7 The basic idea is that active learning can be done via reduction to a form of supervised learning problem. [sent-9, score-0.63]
8 This is great, because we have many supervised learning algorithms from which the benefits of active learning may be derived. [sent-10, score-0.615]
9 FilterBoost is an online boosting algorithm which I think of as the boost-by-filtration approach from the first boosting paper, updated for an AdaBoost-like structure. [sent-12, score-0.879]
10 These kinds of approaches are doubtless helpful for large scale learning problems which are becoming more common. [sent-13, score-0.507]
11 Peter Bartlett, Elad Hazan, and Sasha Rakhlin had a paper on Adaptive Online Learning. [sent-14, score-0.173]
12 This paper refines earlier results for online learning against an adversary via gradient descent, which is plausibly of great use in practice. [sent-15, score-0.935]
13 I missed the workshop starting this effort at last year’s NIPS due to workshop overload, but open source machine learning is definitely of great and sound interest to the community. [sent-17, score-0.853]
wordName wordTfidf (topN-words)
[('filterboost', 0.328), ('adversary', 0.192), ('active', 0.183), ('nips', 0.177), ('paper', 0.173), ('boosting', 0.172), ('workshop', 0.17), ('supervised', 0.146), ('mess', 0.146), ('bradley', 0.146), ('bartlett', 0.146), ('monteleoni', 0.146), ('online', 0.139), ('claire', 0.135), ('joseph', 0.135), ('sasha', 0.135), ('sponsored', 0.135), ('great', 0.133), ('doubtless', 0.121), ('financial', 0.121), ('force', 0.121), ('overload', 0.117), ('approaches', 0.114), ('applications', 0.114), ('elad', 0.112), ('hazan', 0.112), ('via', 0.111), ('updated', 0.109), ('trading', 0.109), ('dasgupta', 0.106), ('peter', 0.103), ('convincing', 0.103), ('agnostic', 0.103), ('missed', 0.1), ('hsu', 0.1), ('definitely', 0.098), ('schapire', 0.098), ('adaptive', 0.098), ('benefits', 0.096), ('robert', 0.096), ('sanjoy', 0.096), ('learning', 0.095), ('earlier', 0.092), ('becoming', 0.091), ('daniel', 0.091), ('adversarial', 0.087), ('sound', 0.087), ('kinds', 0.086), ('greater', 0.083), ('situation', 0.083)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000004 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
2 0.20152755 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting: an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities. The combination of these simple ideas removes the sampling bias problem that has plagued many previous he
3 0.19508386 360 hunch net-2009-06-15-In Active Learning, the question changes
Introduction: A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”. At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. In other words, the fundamental question was “can we do it?” The nature of the question has fundamentally changed in my mind. The answer to the previous question is “yes”, both information theoretically and computationally, in most places where supervised learning could be applied. In many situations, the question has now changed to: “is it worth it?” Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? Currently, there are situations where this question could go either way. Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.
4 0.16817065 251 hunch net-2007-06-24-Interesting Papers at ICML 2007
Introduction: Here are a few of the papers I enjoyed at ICML. Steffen Bickel, Michael Brüeckner, Tobias Scheffer, Discriminative Learning for Differing Training and Test Distributions. There is a nice trick in this paper: they predict the probability that an unlabeled sample is in the training set vs. the test set, and then use this prediction to importance weight labeled samples in the training set. This paper uses a specific parametric model, but the approach is easily generalized. Steve Hanneke, A Bound on the Label Complexity of Agnostic Active Learning. This paper bounds the number of labels required by the A² algorithm for active learning in the agnostic case. Last year we figured out agnostic active learning was possible. This year, it’s quantified. Hopefully soon, it will be practical. Sylvain Gelly, David Silver, Combining Online and Offline Knowledge in UCT. This paper is about techniques for improving MoGo with various sorts of learning. MoGo has a fair
5 0.15647165 127 hunch net-2005-11-02-Progress in Active Learning
Introduction: Several bits of progress have been made since Sanjoy pointed out the significant lack of theoretical understanding of active learning . This is an update on the progress I know of. As a refresher, active learning as meant here is: There is a source of unlabeled data. There is an oracle from which labels can be requested for unlabeled data produced by the source. The goal is to perform well with minimal use of the oracle. Here is what I’ve learned: Sanjoy has developed sufficient and semi-necessary conditions for active learning given the assumptions of IID data and “realizability” (that one of the classifiers is a correct classifier). Nina , Alina , and I developed an algorithm for active learning relying on only the assumption of IID data. A draft is here . Nicolo , Claudio , and Luca showed that it is possible to do active learning in an entirely adversarial setting for linear threshold classifiers here . This was published a year or two ago and I r
6 0.14902316 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
7 0.1462139 385 hunch net-2009-12-27-Interesting things at NIPS 2009
8 0.14047515 293 hunch net-2008-03-23-Interactive Machine Learning
9 0.1391871 281 hunch net-2007-12-21-Vowpal Wabbit Code Release
10 0.13226046 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class
11 0.12763478 375 hunch net-2009-10-26-NIPS workshops
12 0.12383645 199 hunch net-2006-07-26-Two more UAI papers of interest
13 0.12240991 420 hunch net-2010-12-26-NIPS 2010
14 0.12135119 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy
15 0.11869878 309 hunch net-2008-07-10-Interesting papers, ICML 2008
16 0.11833879 403 hunch net-2010-07-18-ICML & COLT 2010
17 0.11699794 186 hunch net-2006-06-24-Online convex optimization at COLT
18 0.11671025 258 hunch net-2007-08-12-Exponentiated Gradient
19 0.11631461 179 hunch net-2006-05-16-The value of the orthodox view of Boosting
20 0.1111051 252 hunch net-2007-07-01-Watchword: Online Learning
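The importance-weighting idea described in the active learning entry above (label an example with some probability, then weight it by the inverse of that probability) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the algorithm from the post: the names `selective_sample`, `weighted_error`, `oracle`, and `query_prob` are hypothetical, and the query probability is simply supplied by the caller rather than derived from a disagreement region.

```python
import random


def selective_sample(stream, oracle, query_prob, seed=0):
    """Request a label for x with probability p = query_prob(x).

    Each labeled example is stored with importance weight 1/p, which is
    what keeps later error estimates unbiased. All names here are
    hypothetical; query_prob(x) is assumed to return a value in (0, 1].
    """
    rng = random.Random(seed)
    labeled = []
    for x in stream:
        p = query_prob(x)
        if rng.random() < p:
            # The oracle is only consulted for selected examples.
            labeled.append((x, oracle(x), 1.0 / p))
    return labeled


def weighted_error(labeled, classifier, n_seen):
    """Importance-weighted error estimate over all n_seen examples."""
    mistakes = sum(w for x, y, w in labeled if classifier(x) != y)
    return mistakes / n_seen
```

When `query_prob` always returns 1.0, every example is labeled with weight 1 and `weighted_error` reduces to the ordinary empirical error, which is a quick sanity check on the weighting.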
topicId topicWeight
[(0, 0.252), (1, -0.011), (2, -0.081), (3, -0.106), (4, 0.177), (5, 0.104), (6, -0.044), (7, -0.089), (8, -0.068), (9, 0.115), (10, 0.204), (11, -0.052), (12, -0.031), (13, 0.041), (14, -0.055), (15, 0.027), (16, 0.004), (17, 0.048), (18, -0.029), (19, -0.009), (20, -0.103), (21, 0.047), (22, 0.052), (23, -0.035), (24, 0.063), (25, -0.038), (26, 0.057), (27, 0.091), (28, -0.0), (29, 0.019), (30, 0.027), (31, 0.035), (32, -0.123), (33, -0.038), (34, -0.024), (35, -0.031), (36, -0.08), (37, -0.061), (38, -0.014), (39, 0.095), (40, 0.047), (41, -0.047), (42, 0.02), (43, -0.003), (44, -0.032), (45, -0.074), (46, -0.056), (47, -0.046), (48, 0.017), (49, -0.008)]
simIndex simValue blogId blogTitle
same-blog 1 0.95305538 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
2 0.71686763 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)
Introduction: Here are a few papers from COLT 2008 that I found interesting. Maria-Florina Balcan , Steve Hanneke , and Jenn Wortman , The True Sample Complexity of Active Learning . This paper shows that in an asymptotic setting, active learning is always better than supervised learning (although the gap may be small). This is evidence that the only thing in the way of universal active learning is us knowing how to do it properly. Nir Ailon and Mehryar Mohri , An Efficient Reduction of Ranking to Classification . This paper shows how to robustly rank n objects with n log(n) classifications using a quicksort based algorithm. The result is applicable to many ranking loss functions and has implications for others. Michael Kearns and Jennifer Wortman . Learning from Collective Behavior . This is about learning in a new model, where the goal is to predict how a collection of interacting agents behave. One claim is that learning in this setting can be reduced to IID lear
3 0.67535686 309 hunch net-2008-07-10-Interesting papers, ICML 2008
Introduction: Here are some papers from ICML 2008 that I found interesting. Risi Kondor and Karsten Borgwardt, The Skew Spectrum of Graphs. This paper is about a new family of functions on graphs which is invariant under node label permutation. They show that these quantities appear to yield good features for learning. Sanjoy Dasgupta and Daniel Hsu. Hierarchical sampling for active learning. This is the first published practical consistent active learning algorithm. The abstract is also pretty impressive. Lihong Li, Michael Littman, and Thomas Walsh, Knows What It Knows: A Framework For Self-Aware Learning. This is an attempt to create learning algorithms that know when they err (other work includes Vovk). It’s not yet clear to me what the right model for feature-dependent confidence intervals is. Novi Quadrianto, Alex Smola, Tiberio Caetano, and Quoc Viet Le, Estimating Labels from Label Proportions. This is an example of learning in a speciali
4 0.66799152 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting: an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities. The combination of these simple ideas removes the sampling bias problem that has plagued many previous he
5 0.65939248 385 hunch net-2009-12-27-Interesting things at NIPS 2009
Introduction: Several papers at NIPS caught my attention. Elad Hazan and Satyen Kale, Online Submodular Optimization. They define an algorithm for online optimization of submodular functions with regret guarantees. This places submodular optimization roughly on par with online convex optimization as tractable settings for online learning. Elad Hazan and Satyen Kale, On Stochastic and Worst-Case Models of Investing. At its core, this is yet another example of modifying worst-case online learning to deal with variance, but the application to financial models is particularly cool and it seems plausibly superior to other common approaches for financial modeling. Mark Palatucci, Dean Pomerleau, Tom Mitchell, and Geoff Hinton, Zero Shot Learning with Semantic Output Codes. The goal here is predicting a label in a multiclass supervised setting where the label never occurs in the training data. They have some basic analysis and also a nice application to fMRI brain reading. Sh
6 0.6586889 360 hunch net-2009-06-15-In Active Learning, the question changes
7 0.64710504 251 hunch net-2007-06-24-Interesting Papers at ICML 2007
8 0.61650461 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
9 0.61209428 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class
10 0.60774922 127 hunch net-2005-11-02-Progress in Active Learning
11 0.59443629 293 hunch net-2008-03-23-Interactive Machine Learning
12 0.58270562 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy
13 0.58230007 384 hunch net-2009-12-24-Top graduates this season
14 0.57266265 281 hunch net-2007-12-21-Vowpal Wabbit Code Release
15 0.5615986 420 hunch net-2010-12-26-NIPS 2010
16 0.55880582 439 hunch net-2011-08-01-Interesting papers at COLT 2011
17 0.53882217 192 hunch net-2006-07-08-Some recent papers
18 0.53553683 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates
19 0.53199434 403 hunch net-2010-07-18-ICML & COLT 2010
20 0.50954318 113 hunch net-2005-09-19-NIPS Workshops
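The COLT 2008 entry above mentions ranking n objects with about n log(n) classifications using a quicksort based algorithm. A minimal sketch of that general shape is below, assuming a hypothetical pairwise classifier `prefers(a, b)` that returns True when `a` should be ranked ahead of `b`. It is plain randomized quicksort driven by a learned comparison, not the paper's exact procedure or analysis.

```python
import random


def quicksort_rank(items, prefers, seed=0):
    """Rank items using a pairwise preference classifier.

    prefers(a, b) is a hypothetical learned classifier returning True
    when a should come before b. Each non-pivot item is compared against
    the pivot exactly once per level, so the expected number of
    classifier calls is O(n log n).
    """
    rng = random.Random(seed)

    def rank(seq):
        if len(seq) <= 1:
            return list(seq)
        pivot_index = rng.randrange(len(seq))
        pivot = seq[pivot_index]
        before, after = [], []
        for i, x in enumerate(seq):
            if i == pivot_index:
                continue
            # One classifier call per (item, pivot) pair.
            (before if prefers(x, pivot) else after).append(x)
        return rank(before) + [pivot] + rank(after)

    return rank(list(items))
```

With a consistent comparison such as `lambda a, b: a < b` this returns a fully sorted list; with a noisy learned classifier it still returns some total order, which is the property that makes the reduction usable.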
topicId topicWeight
[(10, 0.068), (23, 0.254), (27, 0.225), (38, 0.04), (53, 0.041), (55, 0.136), (94, 0.051), (95, 0.084)]
simIndex simValue blogId blogTitle
same-blog 1 0.87548399 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
2 0.82997108 101 hunch net-2005-08-08-Apprenticeship Reinforcement Learning for Control
Introduction: Pieter Abbeel presented a paper with Andrew Ng at ICML on Exploration and Apprenticeship Learning in Reinforcement Learning . The basic idea of this algorithm is: Collect data from a human controlling a machine. Build a transition model based upon the experience. Build a policy which optimizes the transition model. Evaluate the policy. If it works well, halt, otherwise add the experience into the pool and go to (2). The paper proves that this technique will converge to some policy with expected performance near human expected performance assuming the world fits certain assumptions (MDP or linear dynamics). This general idea of apprenticeship learning (i.e. incorporating data from an expert) seems very compelling because (a) humans often learn this way and (b) much harder problems can be solved. For (a), the notion of teaching is about transferring knowledge from an expert to novices, often via demonstration. To see (b), note that we can create intricate rei
3 0.76037413 347 hunch net-2009-03-26-Machine Learning is too easy
Introduction: One of the remarkable things about machine learning is how diverse it is. The viewpoints of Bayesian learning, reinforcement learning, graphical models, supervised learning, unsupervised learning, genetic programming, etc… share little enough overlap that many people can and do make their careers within one without touching, or even necessarily understanding the others. There are two fundamental reasons why this is possible. For many problems, many approaches work in the sense that they do something useful. This is true empirically, where for many problems we can observe that many different approaches yield better performance than any constant predictor. It’s also true in theory, where we know that for any set of predictors representable in a finite amount of RAM, minimizing training error over the set of predictors does something nontrivial when there are a sufficient number of examples. There is nothing like a unifying problem defining the field. In many other areas there
4 0.73514593 454 hunch net-2012-01-30-ICML Posters and Scope
Introduction: Normally, I don’t indulge in posters for ICML, but this year is naturally an exception for me. If you want one, there are a small number left here, if you sign up before February. It also seems worthwhile to give some sense of the scope and reviewing criteria for ICML for authors considering submitting papers. At ICML, the (very large) program committee does the reviewing which informs final decisions by area chairs on most papers. Program chairs set up the process, deal with exceptions or disagreements, and provide advice for the reviewing process. Providing advice is tricky (and easily misleading) because a conference is a community, and in the end the aggregate interests of the community determine the conference. Nevertheless, as a program chair this year it seems worthwhile to state the overall philosophy I have and what I plan to encourage (and occasionally discourage). At the highest level, I believe ICML exists to further research into machine learning, which I gene
5 0.73482531 343 hunch net-2009-02-18-Decision by Vetocracy
Introduction: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. This is most easily seen by comparison: Paper Banditron Offset Tree Notes Problem Scope Multiclass problems where only the loss of one choice can be probed. Strictly greater: Cost sensitive multiclass problems where only the loss of one choice can be probed. Often generalizations don’t matter. That’s not the case here, since every plausible application I’ve thought of involves loss functions substantially different from 0/1. What’s new Analysis and Experiments Algorithm, Analysis, and Experiments As far as I know, the essence of the more general problem was first stated and analyzed with the EXP4 algorithm (page 16) (1998). It’s also the time horizon 1 simplification of the Reinforcement Learning setting for the random trajectory method (page 15) (2002). The Banditron algorithm itself is functionally identi
6 0.72879374 464 hunch net-2012-05-03-Microsoft Research, New York City
7 0.72677666 225 hunch net-2007-01-02-Retrospective
8 0.72407371 466 hunch net-2012-06-05-ICML acceptance statistics
9 0.72393018 51 hunch net-2005-04-01-The Producer-Consumer Model of Research
10 0.72071862 437 hunch net-2011-07-10-ICML 2011 and the future
11 0.71584606 406 hunch net-2010-08-22-KDD 2010
12 0.71480834 36 hunch net-2005-03-05-Funding Research
13 0.71471673 132 hunch net-2005-11-26-The Design of an Optimal Research Environment
14 0.71463358 89 hunch net-2005-07-04-The Health of COLT
15 0.71197474 194 hunch net-2006-07-11-New Models
16 0.71157217 360 hunch net-2009-06-15-In Active Learning, the question changes
17 0.71114475 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?
18 0.70545381 325 hunch net-2008-11-10-ICML Reviewing Criteria
19 0.70477182 40 hunch net-2005-03-13-Avoiding Bad Reviewing
20 0.7043227 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
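The apprenticeship learning entry above describes a loop: collect expert data, build a transition model, build a policy for that model, evaluate it, and either halt or add the new experience to the pool and repeat. The control flow can be sketched as below. Every callable (`fit_model`, `plan`, `evaluate`, `good_enough`) and the `max_rounds` cap are hypothetical placeholders supplied by the caller; nothing here reflects the paper's actual model class or planner.

```python
def apprenticeship_loop(expert_data, fit_model, plan, evaluate,
                        good_enough, max_rounds=10):
    """Iterate model fitting and policy evaluation, as listed above.

    All callables are hypothetical stand-ins:
      fit_model(pool)   -> transition model built from experience
      plan(model)       -> policy optimizing that model
      evaluate(policy)  -> (score, new_experience) from running the policy
      good_enough(score)-> True when the policy works well
    """
    pool = list(expert_data)           # step 1: data from the human expert
    policy = None
    for _ in range(max_rounds):
        model = fit_model(pool)        # step 2: build a transition model
        policy = plan(model)           # step 3: optimize a policy for it
        score, experience = evaluate(policy)  # step 4: try the policy
        if good_enough(score):         # step 5: halt if it works well...
            break
        pool.extend(experience)        # ...otherwise grow the pool, repeat
    return policy, pool
```

The `max_rounds` cap is an addition for safety in this sketch; the description above simply loops until the policy performs well.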