hunch_net hunch_net-2008 hunch_net-2008-310 knowledge-graph by maker-knowledge-mining

310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)


meta info for this blog

Source: html

Introduction: Here are a few papers from COLT 2008 that I found interesting. Maria-Florina Balcan, Steve Hanneke, and Jenn Wortman, The True Sample Complexity of Active Learning. This paper shows that in an asymptotic setting, active learning is always better than supervised learning (although the gap may be small). This is evidence that the only thing in the way of universal active learning is us knowing how to do it properly. Nir Ailon and Mehryar Mohri, An Efficient Reduction of Ranking to Classification. This paper shows how to robustly rank n objects with n log(n) classifications using a quicksort-based algorithm. The result is applicable to many ranking loss functions and has implications for others. Michael Kearns and Jennifer Wortman, Learning from Collective Behavior. This is about learning in a new model, where the goal is to predict how a collection of interacting agents behave. One claim is that learning in this setting can be reduced to IID learning.
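The Ailon-Mohri reduction orders items by running quicksort with a learned pairwise classifier as the comparator, so only O(n log n) expected classifier calls are needed. Here is a minimal Python sketch of the idea, with a toy numeric comparator standing in for the learned classifier; the paper uses random pivots and analyzes noisy comparators, while this simplified version picks the first element as pivot:

```python
def quicksort_rank(items, prefer):
    """Rank items using O(n log n) expected calls to a pairwise
    preference classifier prefer(a, b) -> True iff a should precede b.
    The paper's algorithm uses random pivots and tolerates noisy
    classifiers; this sketch uses the first element for determinism."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    before = [x for x in rest if prefer(x, pivot)]      # classifier calls
    after = [x for x in rest if not prefer(x, pivot)]   # classifier calls
    return quicksort_rank(before, prefer) + [pivot] + quicksort_rank(after, prefer)

# Toy numeric comparator standing in for a learned classifier.
print(quicksort_rank([5, 2, 9, 1, 7], lambda a, b: a < b))  # [1, 2, 5, 7, 9]
```

With a learned comparator the output ranking is only approximately consistent, which is exactly the regime the paper's robustness analysis covers.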


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 This paper shows that in an asymptotic setting, active learning is always better than supervised learning (although the gap may be small). [sent-3, score-0.549]

2 This is evidence that the only thing in the way of universal active learning is us knowing how to do it properly. [sent-4, score-0.367]

3 This paper shows how to robustly rank n objects with n log(n) classifications using a quicksort based algorithm. [sent-6, score-0.756]

4 The result is applicable to many ranking loss functions and has implications for others. [sent-7, score-0.376]

5 This is about learning in a new model, where the goal is to predict how a collection of interacting agents behave. [sent-10, score-0.336]

6 One claim is that learning in this setting can be reduced to IID learning. [sent-11, score-0.201]

7 Due to the relation with Metric-E^3, I was particularly interested in a couple of other papers on reinforcement learning in navigation-like spaces. [sent-12, score-0.211]

8 I also particularly enjoyed Dan Klein's talk, which was the most impressive application of graphical model technology I've seen. [sent-13, score-0.529]

9 I also attended the large scale learning challenge workshop and enjoyed Antoine Bordes' talk about a fast primal space algorithm that won by a hair over other methods in the wild track. [sent-14, score-0.681]

10 Ronan Collobert's talk was also notable in that they are doing relatively featuritis-free NLP. [sent-15, score-0.282]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('wortman', 0.313), ('talk', 0.181), ('ranking', 0.18), ('active', 0.175), ('enjoyed', 0.155), ('shows', 0.147), ('collective', 0.139), ('ailon', 0.139), ('nir', 0.139), ('antoine', 0.139), ('bordes', 0.139), ('jennifer', 0.139), ('quicksort', 0.139), ('classifications', 0.129), ('primal', 0.129), ('collobert', 0.129), ('ronan', 0.129), ('asymptotic', 0.129), ('mehryar', 0.129), ('mohri', 0.129), ('jenn', 0.129), ('robustly', 0.129), ('wild', 0.129), ('relation', 0.122), ('balcan', 0.122), ('hanneke', 0.122), ('rank', 0.116), ('interacting', 0.116), ('agents', 0.116), ('dan', 0.111), ('kearns', 0.107), ('steve', 0.107), ('nlp', 0.107), ('model', 0.106), ('applicable', 0.104), ('collection', 0.104), ('setting', 0.103), ('notable', 0.101), ('universal', 0.098), ('gap', 0.098), ('reduced', 0.098), ('objects', 0.096), ('behavior', 0.096), ('knowing', 0.094), ('impressive', 0.094), ('implications', 0.092), ('particularly', 0.089), ('attended', 0.087), ('graphical', 0.085), ('michael', 0.08)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999958 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)


2 0.13948026 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

Introduction: Here are a few of the papers I enjoyed at ICML. Steffen Bickel, Michael Brüeckner, Tobias Scheffer, Discriminative Learning for Differing Training and Test Distributions. There is a nice trick in this paper: they predict the probability that an unlabeled sample is in the training set vs. the test set, and then use this prediction to importance weight labeled samples in the training set. This paper uses a specific parametric model, but the approach is easily generalized. Steve Hanneke, A Bound on the Label Complexity of Agnostic Active Learning. This paper bounds the number of labels required by the A^2 algorithm for active learning in the agnostic case. Last year we figured out agnostic active learning was possible. This year, it's quantified. Hopefully soon, it will be practical. Sylvain Gelly, David Silver, Combining Online and Offline Knowledge in UCT. This paper is about techniques for improving MoGo with various sorts of learning. MoGo has a fair

3 0.13562126 304 hunch net-2008-06-27-Reviewing Horror Stories

Introduction: Essentially everyone who writes research papers suffers rejections. They always sting immediately, but upon further reflection many of these rejections come to seem reasonable. Maybe the equations had too many typos or maybe the topic just isn't as important as was originally thought. A few rejections do not come to seem acceptable, and these form the basis of reviewing horror stories, great material for conversations. I've decided to share three of mine, now all safely a bit distant in the past. Prediction Theory for Classification Tutorial. This is a tutorial about tight sample complexity bounds for classification that I submitted to JMLR. The first decision I heard was a reject which appeared quite unjust to me: for example, one of the reviewers appeared to claim that all the content was in standard statistics books. Upon further inquiry, several citations were given, none of which actually covered the content. Later, I was shocked to hear the paper was accepted. App

4 0.13250953 432 hunch net-2011-04-20-The End of the Beginning of Active Learning

Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting : an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities. The combination of these simple ideas removes the sampling bias problem that has plagued many previous he

5 0.13230559 360 hunch net-2009-06-15-In Active Learning, the question changes

Introduction: A little over 4 years ago, Sanjoy made a post saying roughly “we should study active learning theoretically, because not much is understood”. At the time, we did not understand basic things such as whether or not it was possible to PAC-learn with an active algorithm without making strong assumptions about the noise rate. In other words, the fundamental question was “can we do it?” The nature of the question has fundamentally changed in my mind. The answer to the previous question is “yes”, both information theoretically and computationally, in most places where supervised learning could be applied. In many situations, the question has now changed to: “is it worth it?” Is the programming and computational overhead low enough to make the label cost savings of active learning worthwhile? Currently, there are situations where this question could go either way. Much of the challenge for the future is in figuring out how to make active learning easier or more worthwhile.

6 0.11341457 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates

7 0.11245746 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

8 0.10997969 127 hunch net-2005-11-02-Progress in Active Learning

9 0.10583309 406 hunch net-2010-08-22-KDD 2010

10 0.10229196 385 hunch net-2009-12-27-Interesting things at NIPS 2009

11 0.10034648 389 hunch net-2010-02-26-Yahoo! ML events

12 0.099391297 403 hunch net-2010-07-18-ICML & COLT 2010

13 0.098380938 45 hunch net-2005-03-22-Active learning

14 0.097729243 420 hunch net-2010-12-26-NIPS 2010

15 0.090084583 346 hunch net-2009-03-18-Parallel ML primitives

16 0.089207567 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

17 0.087439314 309 hunch net-2008-07-10-Interesting papers, ICML 2008

18 0.085905217 321 hunch net-2008-10-19-NIPS 2008 workshop on Kernel Learning

19 0.085857756 444 hunch net-2011-09-07-KDD and MUCMD 2011

20 0.082411967 388 hunch net-2010-01-24-Specializations of the Master Problem
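The scheme described in The End of the Beginning of Active Learning entry above, requesting a label with some probability and importance-weighting the labeled example by the inverse of that probability, can be sketched as follows. The function names and the toy label oracle are illustrative, not taken from the paper:

```python
import random

def iwal_round(x, label_oracle, sampling_prob):
    """One round of importance-weighted selective sampling: request the
    label with probability p(x) and, if labeled, weight the example by
    1/p(x) so downstream training remains unbiased."""
    p = sampling_prob(x)  # larger when x distinguishes near-optimal classifiers
    if random.random() < p:
        return (x, label_oracle(x), 1.0 / p)  # (example, label, importance weight)
    return None  # label not requested; example stays unlabeled

# Toy run with an always-query policy (p = 1), so the weight is exactly 1.
print(iwal_round(3.5, lambda x: int(x > 0), lambda x: 1.0))  # (3.5, 1, 1.0)
```

In the real algorithm, sampling_prob is derived from how much near-optimal classifiers disagree on x, which is what removes the sampling bias the post mentions.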


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.194), (1, 0.036), (2, 0.001), (3, -0.09), (4, 0.082), (5, 0.046), (6, -0.042), (7, -0.033), (8, -0.001), (9, -0.019), (10, 0.164), (11, 0.04), (12, -0.135), (13, 0.08), (14, -0.07), (15, -0.033), (16, -0.026), (17, 0.086), (18, 0.03), (19, -0.047), (20, -0.059), (21, 0.027), (22, 0.003), (23, -0.043), (24, 0.018), (25, -0.108), (26, 0.064), (27, -0.033), (28, 0.002), (29, -0.016), (30, -0.011), (31, 0.04), (32, -0.024), (33, -0.02), (34, 0.014), (35, -0.035), (36, -0.033), (37, 0.015), (38, 0.061), (39, -0.043), (40, 0.101), (41, -0.087), (42, -0.007), (43, -0.009), (44, -0.024), (45, 0.033), (46, -0.005), (47, -0.065), (48, -0.051), (49, 0.017)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94639742 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

Introduction: Here are a few papers from COLT 2008 that I found interesting. Maria-Florina Balcan , Steve Hanneke , and Jenn Wortman , The True Sample Complexity of Active Learning . This paper shows that in an asymptotic setting, active learning is always better than supervised learning (although the gap may be small). This is evidence that the only thing in the way of universal active learning is us knowing how to do it properly. Nir Ailon and Mehryar Mohri , An Efficient Reduction of Ranking to Classification . This paper shows how to robustly rank n objects with n log(n) classifications using a quicksort based algorithm. The result is applicable to many ranking loss functions and has implications for others. Michael Kearns and Jennifer Wortman . Learning from Collective Behavior . This is about learning in a new model, where the goal is to predict how a collection of interacting agents behave. One claim is that learning in this setting can be reduced to IID lear

2 0.65623128 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

Introduction: Here's a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thoma

3 0.64570171 432 hunch net-2011-04-20-The End of the Beginning of Active Learning


4 0.63985831 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

Introduction: I learned a number of things at NIPS. The financial people were there in greater force than previously. Two Sigma sponsored NIPS while DRW Trading had a booth. The adversarial machine learning workshop had a number of talks about interesting applications where an adversary really is out to try and mess up your learning algorithm. This is very different from the situation we often think of where the world is oblivious to our learning. This may present new and convincing applications for the learning-against-an-adversary work common at COLT. There were several interesting papers. Sanjoy Dasgupta, Daniel Hsu, and Claire Monteleoni had a paper on General Agnostic Active Learning. The basic idea is that active learning can be done via reduction to a form of supervised learning problem. This is great, because we have many supervised learning algorithms from which the benefits of active learning may be derived. Joseph Bradley and Robert Schapire had a P

5 0.62202859 251 hunch net-2007-06-24-Interesting Papers at ICML 2007


6 0.62096727 360 hunch net-2009-06-15-In Active Learning, the question changes

7 0.61155713 127 hunch net-2005-11-02-Progress in Active Learning

8 0.59263861 420 hunch net-2010-12-26-NIPS 2010

9 0.58358854 45 hunch net-2005-03-22-Active learning

10 0.5724858 406 hunch net-2010-08-22-KDD 2010

11 0.5464322 439 hunch net-2011-08-01-Interesting papers at COLT 2011

12 0.52667356 188 hunch net-2006-06-30-ICML papers

13 0.52436078 138 hunch net-2005-12-09-Some NIPS papers

14 0.51916212 139 hunch net-2005-12-11-More NIPS Papers

15 0.51703072 144 hunch net-2005-12-28-Yet more nips thoughts

16 0.51434803 299 hunch net-2008-04-27-Watchword: Supervised Learning

17 0.51379269 403 hunch net-2010-07-18-ICML & COLT 2010

18 0.51140237 309 hunch net-2008-07-10-Interesting papers, ICML 2008

19 0.50786883 97 hunch net-2005-07-23-Interesting papers at ACL

20 0.50239807 385 hunch net-2009-12-27-Interesting things at NIPS 2009
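The Bickel et al. trick described in the Interesting Papers at ICML 2007 entry reduces covariate-shift correction to training a train-vs-test discriminator: by Bayes' rule, the density ratio p_test(x)/p_train(x) equals the discriminator's odds times a set-size constant. A small illustrative helper (the function name and constants are mine, not from the paper):

```python
def importance_weight(p_test, n_train, n_test):
    """Turn a discriminator's estimate P(x came from the test set) into
    the covariate-shift importance weight p_test(x) / p_train(x):
    by Bayes' rule this is the discriminator odds times n_train / n_test."""
    odds = p_test / (1.0 - p_test)    # P(test | x) / P(train | x)
    return odds * (n_train / n_test)  # correct for unequal set sizes

# Equal-sized sets: a point the discriminator can't tell apart gets weight 1,
# while a point that looks 3:1 like test data gets upweighted to 3.
print(importance_weight(0.5, 100, 100))   # 1.0
print(importance_weight(0.75, 100, 100))  # 3.0
```

Any probabilistic classifier can supply p_test, which is why the post calls the approach easily generalized beyond the paper's specific parametric model.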


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(8, 0.362), (10, 0.038), (27, 0.229), (38, 0.072), (53, 0.07), (55, 0.024), (90, 0.01), (94, 0.047), (95, 0.058)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.87235415 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)


2 0.83120894 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three

Introduction: Following up on Hal Daume's post and John's post on cool and interesting things seen at NIPS, I'll post my own little list of neat papers here as well. Of course it's going to be biased towards what I think is interesting. Also, I have to say that I hadn't been able to see many papers this year at NIPS due to being too busy, so please feel free to contribute the papers that you liked. 1. P. Mudigonda, V. Kolmogorov, P. Torr. An Analysis of Convex Relaxations for MAP Estimation. A surprising paper which shows that many of the more sophisticated convex relaxations that had been proposed recently turn out to be subsumed by the simplest LP relaxation. Be careful next time you try a cool new convex relaxation! 2. D. Sontag, T. Jaakkola. New Outer Bounds on the Marginal Polytope. The title says it all. The marginal polytope is the set of local marginal distributions over subsets of variables that are globally consistent in the sense that there is at least one distributio

3 0.57634825 12 hunch net-2005-02-03-Learning Theory, by assumption

Introduction: One way to organize learning theory is by assumption (in the assumption = axiom sense), from no assumptions to many assumptions. As you travel down this list, the statements become stronger, but the scope of applicability decreases. No assumptions. Online learning: there exists a meta prediction algorithm which competes well with the best element of any set of prediction algorithms. Universal Learning: using a "bias" of 2^(-description length of Turing machine) in learning is equivalent to all other computable biases up to some constant. Reductions: the ability to predict well on classification problems is equivalent to the ability to predict well on many other learning problems. Independent and Identically Distributed (IID) Data. Performance Prediction: based upon past performance, you can predict future performance. Uniform Convergence: performance prediction works even after choosing classifiers based on the data from large sets of classifiers.

4 0.57401341 14 hunch net-2005-02-07-The State of the Reduction

Introduction: What? Reductions are machines which turn solvers for one problem into solvers for another problem. Why? Reductions are useful for several reasons. Laziness. Reducing a problem to classification makes at least 10 learning algorithms available to solve a problem. Inventing 10 learning algorithms is quite a bit of work. Similarly, programming a reduction is often trivial, while programming a learning algorithm is a great deal of work. Crystallization. The problems we often want to solve in learning are worst-case-impossible, but average-case feasible. By reducing all problems onto one or a few primitives, we can fine-tune these primitives to perform well on real-world problems with greater precision due to the greater number of problems to validate on. Theoretical Organization. By studying which reductions are easy vs. hard vs. impossible, we can learn which problems are roughly equivalent in difficulty and which are much harder. What we know now. Typesafe r

5 0.56921959 79 hunch net-2005-06-08-Question: “When is the right time to insert the loss function?”

Introduction: Hal asks a very good question: “When is the right time to insert the loss function?” In particular, should it be used at testing time or at training time? When the world imposes a loss on us, the standard Bayesian recipe is to predict the (conditional) probability of each possibility and then choose the possibility which minimizes the expected loss. In contrast, as the confusion over “loss = money lost” or “loss = the thing you optimize” might indicate, many people ignore the Bayesian approach and simply optimize their loss (or a close proxy for their loss) over the representation on the training set. The best answer I can give is “it’s unclear, but I prefer optimizing the loss at training time”. My experience is that optimizing the loss in the most direct manner possible typically yields best performance. This question is related to a basic principle which both Yann LeCun (applied) and Vladimir Vapnik (theoretical) advocate: “solve the simplest prediction problem that s

6 0.56895971 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class

7 0.56792247 131 hunch net-2005-11-16-The Everything Ensemble Edge

8 0.56631845 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem

9 0.56554168 41 hunch net-2005-03-15-The State of Tight Bounds

10 0.56550431 360 hunch net-2009-06-15-In Active Learning, the question changes

11 0.56493998 432 hunch net-2011-04-20-The End of the Beginning of Active Learning

12 0.5646984 235 hunch net-2007-03-03-All Models of Learning have Flaws

13 0.56401455 332 hunch net-2008-12-23-Use of Learning Theory

14 0.5623908 230 hunch net-2007-02-02-Thoughts regarding “Is machine learning different from statistics?”

15 0.56102788 370 hunch net-2009-09-18-Necessary and Sufficient Research

16 0.56093252 347 hunch net-2009-03-26-Machine Learning is too easy

17 0.5591355 194 hunch net-2006-07-11-New Models

18 0.55777949 158 hunch net-2006-02-24-A Fundamentalist Organization of Machine Learning

19 0.55687195 351 hunch net-2009-05-02-Wielding a New Abstraction

20 0.55659568 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
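The standard Bayesian recipe mentioned in the loss-function entry above, predict the conditional probability of each class and then choose the action minimizing expected loss, can be sketched as follows (a minimal illustration with an invented asymmetric loss matrix, not code from the post):

```python
def bayes_decision(class_probs, loss_matrix):
    """The standard Bayesian recipe: predict conditional class
    probabilities, then pick the action minimizing expected loss.
    loss_matrix[a][y] is the loss of action a when the true class is y."""
    expected = [sum(p * row[y] for y, p in enumerate(class_probs))
                for row in loss_matrix]
    return min(range(len(loss_matrix)), key=lambda a: expected[a])

# Asymmetric loss: missing class 1 costs 5, a false alarm costs 1.
# Even at P(y=1) = 0.3, the expected-loss-minimizing action is 1.
loss = [[0, 5],   # action 0: predict class 0
        [1, 0]]   # action 1: predict class 1
print(bayes_decision([0.7, 0.3], loss))  # 1
```

The alternative the post discusses, optimizing the loss directly at training time, would instead bake the loss matrix into the training objective rather than applying it at decision time.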