hunch_net hunch_net-2009 hunch_net-2009-361 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes for a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thomas Walsh, István Szita, Carlos Diuk, and Michael Littman, Exploring compact reinforcement-learning representations with linear regression, at UAI.
sentIndex sentText sentNum sentScore
1 Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. [sent-2, score-0.221]
2 This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. [sent-3, score-0.939]
3 The definition of regret they deal with seems particularly useful in many situations. [sent-4, score-0.101]
4 This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes for a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. [sent-6, score-1.072]
5 There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . [sent-7, score-0.214]
6 This paper talks about how to keep track of the moments in a datastream with very little space and computation (see the sketch after this list). [sent-11, score-0.324]
7 This paper points out that via the integral characterization of proper losses, proper scoring rules can be reduced to binary classification (see the worked check after this list). [sent-14, score-0.941]
8 The results unify and generalize the Probing and Quanting reductions we worked on previously. [sent-15, score-0.082]
9 This paper is also related to Nicolas Lambert’s work, which is quite thought-provoking in terms of specifying what is learnable. [sent-16, score-0.311]
10 This paper shows that a subset of HMMs can be learned using an SVD-based algorithm (see the sketch after this list). [sent-20, score-0.438]
11 Samory Kpotufe, Escaping the curse of dimensionality with a tree-based regressor at COLT. [sent-21, score-0.082]
12 This paper shows how to directly apply regression in high-dimensional vector spaces and have it succeed anyway because the data is naturally low-dimensional. [sent-22, score-0.695]
13 This paper characterizes the ability to learn when an adversary is choosing features in the online setting as the “Littlestone dimension”. [sent-25, score-0.324]
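A minimal sketch of the streaming-moments idea in sentence 6 above: moments of a data stream can be tracked in one pass with constant space by updating running statistics per element. This is a generic illustration (a Welford-style update plus raw higher-moment sums), not the algorithm from the paper, and the class name and interface are made up here.

class StreamMoments:
    # One-pass, O(1)-space tracking of stream moments (illustrative only).
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0    # sum of squared deviations from the running mean
        self.raw3 = 0.0  # running sums for raw third and fourth moments
        self.raw4 = 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n          # Welford update for the mean
        self.m2 += delta * (x - self.mean)   # numerically stable second moment
        self.raw3 += x ** 3
        self.raw4 += x ** 4

    def variance(self):
        return self.m2 / self.n if self.n else 0.0

stream = [1.0, 2.0, 2.0, 3.0, 10.0]
tracker = StreamMoments()
for x in stream:
    tracker.update(x)
print(tracker.mean, tracker.variance())  # matches the batch mean and variance of the stream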
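For sentence 7 above, a small numeric check of the integral characterization of proper losses, in the standard form where a proper loss is a weighted integral of cost-weighted misclassification losses. The weight function w(c) = 2 used below is the one corresponding to squared loss; the notation and function names are illustrative assumptions, not the paper’s.

import numpy as np

def cost_weighted_loss(y, p, c):
    # Cost-weighted misclassification at threshold c: predict positive when p > c;
    # a false negative costs (1 - c), a false positive costs c.
    pred = 1 if p > c else 0
    if y == 1 and pred == 0:
        return 1.0 - c
    if y == 0 and pred == 1:
        return c
    return 0.0

def integral_loss(y, p, w, grid=10000):
    # Numerically integrate w(c) * cost_weighted_loss(y, p, c) over c in (0, 1).
    cs = (np.arange(grid) + 0.5) / grid
    return float(np.mean([w(c) * cost_weighted_loss(y, p, c) for c in cs]))

p = 0.3
w_squared = lambda c: 2.0                              # weight function for squared loss
print(integral_loss(1, p, w_squared), (1 - p) ** 2)    # both about 0.49
print(integral_loss(0, p, w_squared), p ** 2)          # both about 0.09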
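For sentence 10 above, a compact sketch of the SVD-based (spectral) route to learning an HMM in its observable-operator form: estimate low-order observation statistics, take a truncated SVD, and form operators whose products give sequence probabilities. This is written from the general shape of that line of work; the function name, data format, and number of states m are illustrative assumptions, not the paper’s reference implementation.

import numpy as np

def spectral_hmm(triples, n_obs, m):
    # triples: (x1, x2, x3) observation triples sampled from the HMM;
    # n_obs: observation alphabet size; m: assumed number of hidden states.
    P1 = np.zeros(n_obs)                      # P1[i] ~ Pr[x1 = i]
    P21 = np.zeros((n_obs, n_obs))            # P21[j, i] ~ Pr[x2 = j, x1 = i]
    P3x1 = np.zeros((n_obs, n_obs, n_obs))    # P3x1[x][k, i] ~ Pr[x3 = k, x2 = x, x1 = i]
    for x1, x2, x3 in triples:
        P1[x1] += 1
        P21[x2, x1] += 1
        P3x1[x2][x3, x1] += 1
    P1 /= len(triples); P21 /= len(triples); P3x1 /= len(triples)

    U = np.linalg.svd(P21)[0][:, :m]          # top-m left singular vectors of P21
    pinv = np.linalg.pinv(U.T @ P21)
    b1 = U.T @ P1                             # initial vector
    binf = np.linalg.pinv(P21.T @ U) @ P1     # normalization vector
    B = [U.T @ P3x1[x] @ pinv for x in range(n_obs)]   # observable operators B_x

    def sequence_prob(xs):
        # Pr[x_1, ..., x_t] is approximated by binf^T B_{x_t} ... B_{x_1} b1.
        v = b1
        for x in xs:
            v = B[x] @ v
        return float(binf @ v)
    return sequence_prob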
wordName wordTfidf (topN-words)
[('proper', 0.231), ('paper', 0.221), ('shows', 0.217), ('unsupervised', 0.213), ('regression', 0.167), ('shai', 0.154), ('compete', 0.142), ('losses', 0.139), ('changing', 0.131), ('reducing', 0.126), ('uai', 0.111), ('moments', 0.103), ('kwik', 0.103), ('characterizes', 0.103), ('covers', 0.103), ('diuk', 0.103), ('francesco', 0.103), ('littlestone', 0.103), ('orabona', 0.103), ('ping', 0.103), ('szita', 0.103), ('regret', 0.101), ('nicolas', 0.095), ('surrogate', 0.095), ('walsh', 0.095), ('gentile', 0.095), ('lambert', 0.095), ('bounds', 0.092), ('probing', 0.09), ('anyways', 0.09), ('characterization', 0.09), ('carlos', 0.09), ('environments', 0.09), ('nicolo', 0.09), ('williamson', 0.09), ('exploring', 0.09), ('littman', 0.09), ('provoking', 0.09), ('linear', 0.088), ('selective', 0.086), ('compact', 0.086), ('adapt', 0.086), ('claudio', 0.086), ('scoring', 0.086), ('integral', 0.082), ('generalize', 0.082), ('semisupervised', 0.082), ('daume', 0.082), ('regressor', 0.082), ('compressed', 0.082)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999988 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes for a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thomas Walsh, István Szita, Carlos Diuk, and Michael Littman, Exploring compact reinforcement-learning representations with linear regression, at UAI.
2 0.18006451 439 hunch net-2011-08-01-Interesting papers at COLT 2011
Introduction: Since John did not attend COLT this year, I have been volunteered to report back on the hot stuff at this year’s meeting. The conference seemed to have pretty high quality stuff this year, and I found plenty of interesting papers on all three days. I’m gonna pick some of my favorites going through the program in chronological order. The first session on matrices seemed interesting for two reasons. First, the papers were quite nice. But more interestingly, this is a topic that has had a lot of presence in Statistics and Compressed sensing literature recently. So it was good to see high-dimensional matrices finally make their entry at COLT. The paper of Ohad and Shai on Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing provides non-trivial guarantees on trace norm regularization in an agnostic setup, while Rina and Nati show how Rademacher averages can be used to get sharper results for matrix completion problems in their paper Concentr
3 0.17310824 309 hunch net-2008-07-10-Interesting papers, ICML 2008
Introduction: Here are some papers from ICML 2008 that I found interesting. Risi Kondor and Karsten Borgwardt, The Skew Spectrum of Graphs. This paper is about a new family of functions on graphs which is invariant under node label permutation. They show that these quantities appear to yield good features for learning. Sanjoy Dasgupta and Daniel Hsu, Hierarchical sampling for active learning. This is the first published practical consistent active learning algorithm. The abstract is also pretty impressive. Lihong Li, Michael Littman, and Thomas Walsh, Knows What It Knows: A Framework For Self-Aware Learning. This is an attempt to create learning algorithms that know when they err (other work includes Vovk). It’s not yet clear to me what the right model for feature-dependent confidence intervals is. Novi Quadrianto, Alex Smola, Tiberio Caetano, and Quoc Viet Le, Estimating Labels from Label Proportions. This is an example of learning in a speciali
4 0.17130174 403 hunch net-2010-07-18-ICML & COLT 2010
Introduction: The papers which interested me most at ICML and COLT 2010 were: Thomas Walsh, Kaushik Subramanian, Michael Littman, and Carlos Diuk, Generalizing Apprenticeship Learning across Hypothesis Classes. This paper formalizes and provides algorithms with guarantees for mixed-mode apprenticeship and traditional reinforcement learning algorithms, allowing RL algorithms that perform better than for either setting alone. István Szita and Csaba Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds. This paper and another represent the frontier of the best-known algorithms for Reinforcement Learning in a Markov Decision Process. James Martens, Deep learning via Hessian-free optimization. About a new not-quite-online second-order gradient algorithm for learning deep functional structures. Potentially this is very powerful because while people have often talked about end-to-end learning, it has rarely worked in practice. Chrisoph
5 0.16355248 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
Introduction: This post is by Daniel Hsu and John Langford. In selective sampling style active learning, a learning algorithm chooses which examples to label. We now have an active learning algorithm that is: Efficient in label complexity, unlabeled complexity, and computational complexity. Competitive with supervised learning anywhere that supervised learning works. Compatible with online learning, with any optimization-based learning algorithm, with any loss function, with offline testing, and even with changing learning algorithms. Empirically effective. The basic idea is to combine disagreement region-based sampling with importance weighting: an example is selected to be labeled with probability proportional to how useful it is for distinguishing among near-optimal classifiers, and labeled examples are importance-weighted by the inverse of these probabilities (see the sketch at the end of this list). The combination of these simple ideas removes the sampling bias problem that has plagued many previous he
6 0.15966251 196 hunch net-2006-07-13-Regression vs. Classification as a Primitive
7 0.14902316 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
8 0.12496643 199 hunch net-2006-07-26-Two more UAI papers of interest
9 0.12432893 251 hunch net-2007-06-24-Interesting Papers at ICML 2007
10 0.12327796 406 hunch net-2010-08-22-KDD 2010
11 0.12327261 385 hunch net-2009-12-27-Interesting things at NIPS 2009
12 0.11801328 14 hunch net-2005-02-07-The State of the Reduction
13 0.11473976 384 hunch net-2009-12-24-Top graduates this season
14 0.11245746 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)
15 0.11150099 8 hunch net-2005-02-01-NIPS: Online Bayes
16 0.11149186 388 hunch net-2010-01-24-Specializations of the Master Problem
17 0.10550538 23 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models
18 0.10273871 254 hunch net-2007-07-12-ICML Trends
19 0.10148847 452 hunch net-2012-01-04-Why ICML? and the summer conferences
20 0.1001787 334 hunch net-2009-01-07-Interesting Papers at SODA 2009
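The active learning description in entry 5 above suggests a short sketch: query a label with a probability that reflects how much plausible classifiers disagree on the point, and weight each queried label by the inverse of that probability so error estimates remain unbiased. The disagreement measure, names, and hypothesis class below are made-up simplifications for illustration, not the algorithm from the post.

import numpy as np

rng = np.random.default_rng(0)

def importance_weighted_active_learning(stream, hypotheses, label_oracle, p_min=0.1):
    # stream: unlabeled points; hypotheses: candidate classifiers h(x) -> {0, 1};
    # label_oracle(x): returns the true label, called only when we pay for a label.
    weighted_err = np.zeros(len(hypotheses))
    total = 0.0
    for x in stream:
        preds = np.array([h(x) for h in hypotheses])
        # Query probability grows with disagreement among the hypotheses; a real
        # algorithm restricts attention to near-optimal hypotheses only.
        p = max(p_min, min(1.0, 2.0 * float(preds.std())))
        total += 1.0
        if rng.random() < p:
            y = label_oracle(x)
            weighted_err += (1.0 / p) * (preds != y)   # inverse-probability weight
    return weighted_err / total                         # unbiased error estimates

# Hypothetical usage with threshold classifiers on 1-d data:
hyps = [lambda x, t=t: int(x > t) for t in (0.3, 0.5, 0.7)]
data = rng.random(1000)
print(importance_weighted_active_learning(data, hyps, lambda x: int(x > 0.5)))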
topicId topicWeight
[(0, 0.234), (1, 0.052), (2, 0.092), (3, -0.133), (4, 0.158), (5, -0.021), (6, 0.004), (7, -0.143), (8, -0.094), (9, 0.009), (10, 0.14), (11, -0.012), (12, -0.232), (13, 0.006), (14, 0.009), (15, 0.04), (16, -0.064), (17, 0.077), (18, 0.069), (19, -0.051), (20, 0.079), (21, 0.01), (22, 0.005), (23, 0.042), (24, 0.045), (25, -0.053), (26, 0.001), (27, -0.025), (28, -0.013), (29, 0.033), (30, -0.009), (31, -0.001), (32, 0.022), (33, 0.044), (34, -0.008), (35, -0.015), (36, -0.042), (37, 0.01), (38, 0.028), (39, 0.045), (40, 0.028), (41, 0.008), (42, 0.033), (43, 0.054), (44, -0.039), (45, 0.003), (46, -0.083), (47, -0.014), (48, 0.011), (49, -0.149)]
simIndex simValue blogId blogTitle
same-blog 1 0.96376377 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes for a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thomas Walsh, István Szita, Carlos Diuk, and Michael Littman, Exploring compact reinforcement-learning representations with linear regression, at UAI.
2 0.75838232 439 hunch net-2011-08-01-Interesting papers at COLT 2011
Introduction: Since John did not attend COLT this year, I have been volunteered to report back on the hot stuff at this year’s meeting. The conference seemed to have pretty high quality stuff this year, and I found plenty of interesting papers on all three days. I’m gonna pick some of my favorites going through the program in chronological order. The first session on matrices seemed interesting for two reasons. First, the papers were quite nice. But more interestingly, this is a topic that has had a lot of presence in Statistics and Compressed sensing literature recently. So it was good to see high-dimensional matrices finally make their entry at COLT. The paper of Ohad and Shai on Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing provides non-trivial guarantees on trace norm regularization in an agnostic setup, while Rina and Nati show how Rademacher averages can be used to get sharper results for matrix completion problems in their paper Concentr
3 0.67095399 309 hunch net-2008-07-10-Interesting papers, ICML 2008
Introduction: Here are some papers from ICML 2008 that I found interesting. Risi Kondor and Karsten Borgwardt, The Skew Spectrum of Graphs. This paper is about a new family of functions on graphs which is invariant under node label permutation. They show that these quantities appear to yield good features for learning. Sanjoy Dasgupta and Daniel Hsu, Hierarchical sampling for active learning. This is the first published practical consistent active learning algorithm. The abstract is also pretty impressive. Lihong Li, Michael Littman, and Thomas Walsh, Knows What It Knows: A Framework For Self-Aware Learning. This is an attempt to create learning algorithms that know when they err (other work includes Vovk). It’s not yet clear to me what the right model for feature-dependent confidence intervals is. Novi Quadrianto, Alex Smola, Tiberio Caetano, and Quoc Viet Le, Estimating Labels from Label Proportions. This is an example of learning in a speciali
4 0.6684348 403 hunch net-2010-07-18-ICML & COLT 2010
Introduction: The papers which interested me most at ICML and COLT 2010 were: Thomas Walsh, Kaushik Subramanian, Michael Littman, and Carlos Diuk, Generalizing Apprenticeship Learning across Hypothesis Classes. This paper formalizes and provides algorithms with guarantees for mixed-mode apprenticeship and traditional reinforcement learning algorithms, allowing RL algorithms that perform better than for either setting alone. István Szita and Csaba Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds. This paper and another represent the frontier of the best-known algorithms for Reinforcement Learning in a Markov Decision Process. James Martens, Deep learning via Hessian-free optimization. About a new not-quite-online second-order gradient algorithm for learning deep functional structures. Potentially this is very powerful because while people have often talked about end-to-end learning, it has rarely worked in practice. Chrisoph
5 0.65081137 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)
Introduction: Here are a few papers from COLT 2008 that I found interesting. Maria-Florina Balcan, Steve Hanneke, and Jenn Wortman, The True Sample Complexity of Active Learning. This paper shows that in an asymptotic setting, active learning is always better than supervised learning (although the gap may be small). This is evidence that the only thing in the way of universal active learning is us knowing how to do it properly. Nir Ailon and Mehryar Mohri, An Efficient Reduction of Ranking to Classification. This paper shows how to robustly rank n objects with n log(n) classifications using a quicksort-based algorithm (see the sketch at the end of this list). The result is applicable to many ranking loss functions and has implications for others. Michael Kearns and Jennifer Wortman, Learning from Collective Behavior. This is about learning in a new model, where the goal is to predict how a collection of interacting agents behave. One claim is that learning in this setting can be reduced to IID lear
6 0.64752215 385 hunch net-2009-12-27-Interesting things at NIPS 2009
7 0.64424455 199 hunch net-2006-07-26-Two more UAI papers of interest
8 0.63053268 188 hunch net-2006-06-30-ICML papers
9 0.60336417 406 hunch net-2010-08-22-KDD 2010
10 0.60240602 251 hunch net-2007-06-24-Interesting Papers at ICML 2007
11 0.57142884 139 hunch net-2005-12-11-More NIPS Papers
12 0.5541063 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three
13 0.55017424 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
14 0.54227406 140 hunch net-2005-12-14-More NIPS Papers II
15 0.54080415 189 hunch net-2006-07-05-more icml papers
16 0.53391331 248 hunch net-2007-06-19-How is Compressed Sensing going to change Machine Learning ?
17 0.51544374 384 hunch net-2009-12-24-Top graduates this season
18 0.51092839 334 hunch net-2009-01-07-Interesting Papers at SODA 2009
19 0.50882429 247 hunch net-2007-06-14-Interesting Papers at COLT 2007
20 0.50502872 192 hunch net-2006-07-08-Some recent papers
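The quicksort-based ranking reduction mentioned in entry 5 above (Ailon and Mohri) can be sketched in a few lines: use a learned pairwise preference classifier as the comparator in randomized quicksort, so ranking n items takes roughly n log(n) classifier calls. The noisy preference function below is a toy stand-in, not the paper’s algorithm or guarantees.

import random

def rank_by_quicksort(items, prefer):
    # prefer(a, b): learned binary classifier returning True when a should be
    # ranked above b; randomized quicksort makes about n log(n) such calls.
    if len(items) <= 1:
        return list(items)
    pivot_index = random.randrange(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    above = [x for x in rest if prefer(x, pivot)]
    below = [x for x in rest if not prefer(x, pivot)]
    return rank_by_quicksort(above, prefer) + [pivot] + rank_by_quicksort(below, prefer)

def noisy_prefer(a, b):
    # Toy preference: usually prefers the larger number, wrong 10% of the time.
    return (a > b) if random.random() > 0.1 else (a < b)

print(rank_by_quicksort([3, 1, 4, 1, 5, 9, 2, 6], noisy_prefer))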
topicId topicWeight
[(3, 0.013), (9, 0.388), (27, 0.207), (38, 0.079), (53, 0.041), (55, 0.094), (94, 0.041), (95, 0.05)]
simIndex simValue blogId blogTitle
1 0.96399325 399 hunch net-2010-05-20-Google Predict
Introduction: Slashdot points out Google Predict. I’m not privy to the details, but this has the potential to be extremely useful, as in many applications simply having an easy mechanism to apply existing learning algorithms can be extremely helpful. This differs goalwise from MLcomp — instead of public comparisons for research purposes, it’s about private utilization of good existing algorithms. It also differs infrastructurally, since a system designed to do this is much less awkward than using Amazon’s cloud computing. The latter implies that datasets several orders of magnitude larger can be handled up to limits imposed by network and storage.
same-blog 2 0.9080615 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes for a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thomas Walsh, István Szita, Carlos Diuk, and Michael Littman, Exploring compact reinforcement-learning representations with linear regression, at UAI.
3 0.85632449 490 hunch net-2013-11-09-Graduates and Postdocs
Introduction: Several strong graduates are on the job market this year. Alekh Agarwal made the most scalable public learning algorithm as an intern two years ago. He has a deep and broad understanding of optimization and learning as well as the ability and will to make things happen programming-wise. I’ve been privileged to have Alekh visiting me in NY where he will be sorely missed. John Duchi created Adagrad, which is a commonly helpful improvement over online gradient descent that is seeing wide adoption, including in Vowpal Wabbit. He has a similarly deep and broad understanding of optimization and learning with significant industry experience at Google. Alekh and John have often coauthored together. Stephane Ross visited me a year ago over the summer, implementing many new algorithms and working out the first scale-free online update rule which is now the default in Vowpal Wabbit. Stephane is not on the market—Google robbed the cradle successfully I’m sure that
4 0.83725804 330 hunch net-2008-12-07-A NIPS paper
Introduction: I’m skipping NIPS this year in favor of Ada, but I wanted to point out this paper by Andriy Mnih and Geoff Hinton. The basic claim of the paper is that by carefully but automatically constructing a binary tree over words, it’s possible to predict words well with huge computational resource savings over unstructured approaches. I’m interested in this beyond the application to word prediction because it is relevant to the general normalization problem: If you want to predict the probability of one of a large number of events, often you must compute a predicted score for all the events and then normalize, a computationally inefficient operation (see the sketch at the end of this list). The problem comes up in many places using probabilistic models, but I’ve run into it with high-dimensional regression. There are a couple of workarounds for this computational bug: Approximate. There are many ways. Often the approximations are uncontrolled (i.e. can be arbitrarily bad), and hence finicky in application. Avoid. Y
5 0.83586711 174 hunch net-2006-04-27-Conferences, Workshops, and Tutorials
Introduction: This is a reminder that many deadlines for summer conference registration are coming up, and attendance is a very good idea. It’s entirely reasonable for anyone to visit a conference once, even when they don’t have a paper. For students, visiting a conference is almost a ‘must’—there is nowhere else that a broad cross-section of research is on display. Workshops are also a very good idea. ICML has 11, KDD has 9, and AAAI has 19. Workshops provide an opportunity to get a good understanding of some current area of research. They are probably the forum most conducive to starting new lines of research because they are so interactive. Tutorials are a good way to gain some understanding of a long-standing direction of research. They are generally more coherent than workshops. ICML has 7 and AAAI has 15.
6 0.83437067 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous
7 0.63643223 403 hunch net-2010-07-18-ICML & COLT 2010
8 0.58608693 160 hunch net-2006-03-02-Why do people count for learning?
9 0.58556849 8 hunch net-2005-02-01-NIPS: Online Bayes
10 0.582183 309 hunch net-2008-07-10-Interesting papers, ICML 2008
11 0.5787012 325 hunch net-2008-11-10-ICML Reviewing Criteria
12 0.5733282 360 hunch net-2009-06-15-In Active Learning, the question changes
13 0.56706882 406 hunch net-2010-08-22-KDD 2010
14 0.5598917 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
15 0.55859011 259 hunch net-2007-08-19-Choice of Metrics
16 0.54918677 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
17 0.54799139 220 hunch net-2006-11-27-Continuizing Solutions
18 0.54366422 258 hunch net-2007-08-12-Exponentiated Gradient
19 0.54321617 36 hunch net-2005-03-05-Funding Research
20 0.54105538 341 hunch net-2009-02-04-Optimal Proxy Loss for Classification
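The binary-tree-over-words idea in entry 4 above can be illustrated with a tiny hierarchical prediction sketch: each word’s probability is a product of binary decisions along its root-to-leaf path, so computing (and normalizing) it touches O(log vocabulary) parameters instead of all of them. The tree construction and random parameters here are arbitrary placeholders, not the Mnih and Hinton method.

import numpy as np

def build_paths(vocab):
    # Assign each word a root-to-leaf path in a balanced binary tree by recursively
    # splitting the vocabulary (the paper learns a careful tree; this one is arbitrary).
    paths = {}
    def split(words, path):
        if len(words) == 1:
            paths[words[0]] = path
            return
        mid = len(words) // 2
        split(words[:mid], path + [0])
        split(words[mid:], path + [1])
    split(list(vocab), [])
    return paths

def word_prob(word, context_vec, node_vecs, paths):
    # Probability of `word` = product of sigmoid decisions along its path,
    # so no sum over the whole vocabulary is ever computed.
    prob, node = 1.0, ()
    for bit in paths[word]:
        s = 1.0 / (1.0 + np.exp(-context_vec @ node_vecs[node]))  # P(go right)
        prob *= s if bit == 1 else (1.0 - s)
        node = node + (bit,)
    return prob

# Hypothetical usage with random parameters; leaf probabilities sum to 1 by construction.
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran", "far"]
paths = build_paths(vocab)
rng = np.random.default_rng(0)
node_vecs = {}
for path in paths.values():
    for i in range(len(path)):
        node_vecs.setdefault(tuple(path[:i]), rng.normal(size=5))
context = rng.normal(size=5)
print(sum(word_prob(w, context, node_vecs, paths) for w in vocab))  # prints ~1.0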