hunch_net-2006-192: Some recent papers (hunch.net, 2006-07-08). Knowledge graph by maker-knowledge-mining.
Source: html
Introduction: It was a fine time for learning in Pittsburgh. John and Sam mentioned some of my favorites. Here’s a few more worth checking out:

Online Multitask Learning (Ofer Dekel, Phil Long, Yoram Singer). This is on my reading list. Definitely an area I’m interested in.

Maximum Entropy Distribution Estimation with Generalized Regularization (Miroslav Dudík, Robert E. Schapire).

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (András Antos, Csaba Szepesvári, Rémi Munos). Again, on the list to read. I saw Csaba and Rémi talk about this and related work at an ICML Workshop on Kernel Reinforcement Learning. The big question in my head is how this compares/contrasts with existing work in reductions to reinforcement learning. Are there advantages/disadvantages?

Higher Order Learning On Graphs, by Sameer Agarwal, Kristin Branson, and Serge Belongie, looks to be interesting. They seem to poo-poo “tensorization” of existing graph algorithms.

Cover Trees for Nearest Neighbor (Alina Beygelzimer, Sham Kakade, John Langford) finally seems to have gotten published. It’s an embarrassment to the community that it took this long, and a reminder of how diligent one has to be in ensuring good work gets published.
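Since nearest neighbor search comes up with the cover trees paper, here is a minimal sketch of the triangle-inequality pruning that metric trees for nearest neighbor exploit. To stay on safe ground this is a simple vantage-point tree, not a cover tree; cover trees maintain explicit scale levels on top of the same idea to get provable query-time bounds. All names and structure below are my own illustration, not the paper's algorithm.

```python
import math

# Minimal vantage-point tree: illustrates the triangle-inequality pruning
# behind metric nearest-neighbor structures.

class Node:
    def __init__(self, point, radius, inside, outside):
        self.point, self.radius = point, radius
        self.inside, self.outside = inside, outside

def build(points):
    if not points:
        return None
    vp, rest = points[0], points[1:]
    if not rest:
        return Node(vp, 0.0, None, None)
    dists = [math.dist(vp, p) for p in rest]
    radius = sorted(dists)[len(dists) // 2]          # median split
    inside = [p for p, d in zip(rest, dists) if d <= radius]
    outside = [p for p, d in zip(rest, dists) if d > radius]
    return Node(vp, radius, build(inside), build(outside))

def nearest(node, q, best=None, best_d=float("inf")):
    if node is None:
        return best, best_d
    d = math.dist(node.point, q)
    if d < best_d:
        best, best_d = node.point, d
    # Triangle inequality: skip a whole subtree when it provably
    # cannot contain anything closer than the current best.
    if d - best_d <= node.radius:
        best, best_d = nearest(node.inside, q, best, best_d)
    if d + best_d >= node.radius:
        best, best_d = nearest(node.outside, q, best, best_d)
    return best, best_d

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
tree = build(points)
print(nearest(tree, (5.2, 5.1)))   # -> ((5.0, 5.0), ~0.22)
```

How effective the pruning is depends on the data; the cover tree's scale-level invariants are precisely what turns this heuristic pruning into provable query bounds.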
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 192 hunch net-2006-07-08-Some recent papers
2 0.13429154 185 hunch net-2006-06-16-Regularization = Robustness
Introduction: The Gibbs-Jaynes theorem is a classical result that tells us that the highest entropy distribution (most uncertain, least committed, etc.) subject to expectation constraints on a set of features is an exponential family distribution with the features as sufficient statistics. In math, argmax_p H(p) s.t. E_p[f_i] = c_i is given by e^{\sum \lambda_i f_i}/Z. (Z here is the necessary normalization constraint, and the lambdas are free parameters we set to meet the expectation constraints). A great deal of statistical mechanics flows from this result, and it has proven very fruitful in learning as well. (Motivating work in models in text learning and Conditional Random Fields, for instance.) The result has been demonstrated a number of ways. One of the most elegant is the “geometric” version here. In the case when the expectation constraints come from data, this tells us that the maximum entropy distribution is exactly the maximum likelihood distribution in the exponential family.
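In display form, the optimization just quoted and its Gibbs-Jaynes solution (the same content as the inline math above, in cleaner notation):

```latex
\begin{aligned}
&\max_{p}\; H(p) \quad \text{s.t.}\quad \mathbb{E}_p[f_i] = c_i,\;\; i = 1,\dots,k,\\[2pt]
&\text{is solved by}\quad
p_\lambda(x) = \frac{\exp\big(\sum_i \lambda_i f_i(x)\big)}{Z(\lambda)},
\qquad
Z(\lambda) = \sum_x \exp\Big(\sum_i \lambda_i f_i(x)\Big),
\end{aligned}
```

with the \lambda_i chosen so that the expectation constraints hold.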
3 0.12989897 403 hunch net-2010-07-18-ICML & COLT 2010
Introduction: The papers which interested me most at ICML and COLT 2010 were:

Thomas Walsh, Kaushik Subramanian, Michael Littman and Carlos Diuk, Generalizing Apprenticeship Learning across Hypothesis Classes. This paper formalizes and provides algorithms with guarantees for mixed-mode apprenticeship and traditional reinforcement learning algorithms, allowing RL algorithms that perform better than for either setting alone.

István Szita and Csaba Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds. This paper and another represent the frontier of best-known algorithms for Reinforcement Learning in a Markov Decision Process.

James Martens, Deep learning via Hessian-free optimization. About a new not-quite-online second order gradient algorithm for learning deep functional structures. Potentially this is very powerful because while people have often talked about end-to-end learning, it has rarely worked in practice.

Chrisoph
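On Hessian-free optimization: the key primitive is multiplying the Hessian by a vector without ever forming the Hessian, which is what conjugate gradient consumes. A minimal sketch of that primitive via finite differences follows; this is my illustration only, not Martens' algorithm, which uses exact Hessian- (or Gauss-Newton-) vector products plus damping.

```python
import numpy as np

def hessian_vector_product(grad_f, w, v, eps=1e-6):
    """Finite-difference Hessian-vector product: Hv ~ (g(w + eps*v) - g(w)) / eps.

    Hessian-free methods never build H explicitly; conjugate gradient only
    needs products Hv, which this approximates with two gradient calls.
    grad_f: callable returning the gradient of the objective at a point.
    """
    return (grad_f(w + eps * v) - grad_f(w)) / eps

# Toy check on f(w) = 0.5 * w' A w, whose Hessian is exactly A:
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda w: A @ w
w0 = np.array([1.0, -1.0])
v = np.array([0.5, 2.0])
print(hessian_vector_product(grad, w0, v))   # ~ A @ v = [3.5, 4.5]
```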
4 0.091470882 220 hunch net-2006-11-27-Continuizing Solutions
Introduction: This post is about a general technique for problem solving which I’ve never seen taught (in full generality), but which I’ve found very useful. Many problems in computer science turn out to be discretely difficult. The best known version of such problems are NP-hard problems, but I mean ‘discretely difficult’ in a much more general way, which I only know how to capture by examples.

ERM: In empirical risk minimization, you choose a minimum error rate classifier from a set of classifiers. This is NP-hard for common sets, but it can be much harder, depending on the set.

Experts: In the online learning with experts setting, you try to predict well so as to compete with a set of (adversarial) experts. Here the alternating quantifiers of you and an adversary playing out a game can yield a dynamic programming problem that grows exponentially.

Policy Iteration: The problem with policy iteration is that you learn a new policy with respect to an old policy, which implies that sim
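As one concrete instance of the continuizing theme (my example, not taken from the post): empirical risk minimization with 0/1 loss over linear classifiers is discretely hard, and the standard move is to a convex surrogate:

```latex
\underbrace{\;\min_{w}\; \sum_{i=1}^{n} \mathbf{1}\big[\, y_i \, w \cdot x_i \le 0 \,\big]\;}_{\text{0/1 loss: discrete, NP-hard in general}}
\quad\leadsto\quad
\underbrace{\;\min_{w}\; \sum_{i=1}^{n} \log\big(1 + e^{-y_i \, w \cdot x_i}\big)\;}_{\text{logistic surrogate: continuous, convex}}
```

The left problem is intractable in general; the right is solvable by gradient methods, at the price of optimizing a surrogate rather than the error rate itself.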
5 0.090209223 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification
Introduction: At an intuitive level, the question here is “Can reinforcement learning be solved with classification?”

Problem: Construct a reinforcement learning algorithm with near-optimal expected sum of rewards in the direct experience model given access to a classifier learning algorithm which has a small error rate or regret on all posed classification problems. The definition of “posed” here is slightly murky. I consider a problem “posed” if there is an algorithm for constructing labeled classification examples.

Past Work: There exists a reduction of reinforcement learning to classification given a generative model. A generative model is an inherently stronger assumption than the direct experience model. Other work on learning reductions may be important. Several algorithms for solving reinforcement learning in the direct experience model exist. Most, such as E^3, Factored-E^3, metric-E^3, and Rmax, require that the observation be the state. Recent work
6 0.086497784 404 hunch net-2010-08-20-The Workshop on Cores, Clusters, and Clouds
7 0.083321214 186 hunch net-2006-06-24-Online convex optimization at COLT
8 0.081756912 386 hunch net-2010-01-13-Sam Roweis died
9 0.080711633 8 hunch net-2005-02-01-NIPS: Online Bayes
10 0.078137167 164 hunch net-2006-03-17-Multitask learning is Black-Boxable
11 0.076760747 101 hunch net-2005-08-08-Apprenticeship Reinforcement Learning for Control
12 0.076651506 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class
13 0.074961051 77 hunch net-2005-05-29-Maximum Margin Mismatch?
14 0.073893204 420 hunch net-2010-12-26-NIPS 2010
15 0.072290726 183 hunch net-2006-06-14-Explorations of Exploration
16 0.07051082 309 hunch net-2008-07-10-Interesting papers, ICML 2008
17 0.068706848 188 hunch net-2006-06-30-ICML papers
18 0.066294037 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
19 0.065670274 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011
20 0.062859096 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
simIndex simValue blogId blogTitle
same-blog 1 0.94517028 192 hunch net-2006-07-08-Some recent papers
2 0.6352402 101 hunch net-2005-08-08-Apprenticeship Reinforcement Learning for Control
Introduction: Pieter Abbeel presented a paper with Andrew Ng at ICML on Exploration and Apprenticeship Learning in Reinforcement Learning. The basic idea of this algorithm is:

1. Collect data from a human controlling a machine.
2. Build a transition model based upon the experience.
3. Build a policy which optimizes the transition model.
4. Evaluate the policy.
5. If it works well, halt, otherwise add the experience into the pool and go to (2).

The paper proves that this technique will converge to some policy with expected performance near human expected performance assuming the world fits certain assumptions (MDP or linear dynamics). This general idea of apprenticeship learning (i.e. incorporating data from an expert) seems very compelling because (a) humans often learn this way and (b) much harder problems can be solved. For (a), the notion of teaching is about transferring knowledge from an expert to novices, often via demonstration. To see (b), note that we can create intricate rei
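A schematic of the five-step loop above, with the model-fitting, planning, and evaluation routines passed in as hypothetical callables. This is a sketch of the loop structure only, not Abbeel and Ng's actual algorithm.

```python
def apprenticeship_rl(demos, fit_model, plan, evaluate, target_reward,
                      max_iters=50):
    """Apprenticeship-learning loop (illustration only).

    demos:         transitions (state, action, next_state) from a human (step 1)
    fit_model:     callable(experience) -> transition model            (step 2)
    plan:          callable(model) -> policy optimizing that model     (step 3)
    evaluate:      callable(policy) -> (avg_reward, new_transitions)   (step 4)
    target_reward: halt once the policy performs this well             (step 5)
    """
    experience = list(demos)
    policy = None
    for _ in range(max_iters):
        model = fit_model(experience)            # learn dynamics from all data
        policy = plan(model)                     # optimize w.r.t. learned model
        reward, transitions = evaluate(policy)   # try it on the real system
        if reward >= target_reward:              # works well: halt
            return policy
        experience.extend(transitions)           # otherwise pool and go to (2)
    return policy
```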
3 0.58157116 77 hunch net-2005-05-29-Maximum Margin Mismatch?
Introduction: John makes a fascinating point about structured classification (and slightly scooped my post!). Maximum Margin Markov Networks (M3N) are an interesting example of the second class of structured classifiers (where the classification of one label depends on the others), and one of my favorite papers. I’m not alone: the paper won the best student paper award at NIPS in 2003. There are some things I find odd about the paper. For instance, it says of probabilistic models “cannot handle high dimensional feature spaces and lack strong theoretical guarrantees.” I’m aware of no such limitations. Also: “Unfortunately, even probabilistic graphical models that are trained discriminatively do not achieve the same level of performance as SVMs, especially when kernel features are used.” This is quite interesting and contradicts my own experience as well as that of a number of people I greatly respect. I wonder what the root cause is: perhaps there is something different abo
4 0.54633111 185 hunch net-2006-06-16-Regularization = Robustness
5 0.50551492 189 hunch net-2006-07-05-more icml papers
Introduction: Here are a few other papers I enjoyed from ICML06. Topic Models: Dynamic Topic Models David Blei, John Lafferty A nice model for how topics in LDA type models can evolve over time, using a linear dynamical system on the natural parameters and a very clever structured variational approximation (in which the mean field parameters are pseudo-observations of a virtual LDS). Like all Blei papers, he makes it look easy, but it is extremely impressive. Pachinko Allocation Wei Li, Andrew McCallum A very elegant (but computationally challenging) model which induces correlation amongst topics using a multi-level DAG whose interior nodes are “super-topics” and “sub-topics” and whose leaves are the vocabulary words. Makes the slumbering monster of structure learning stir. Sequence Analysis (I missed these talks since I was chairing another session) Online Decoding of Markov Models with Latency Constraints Mukund Narasimhan, Paul Viola, Michael Shilman An “a
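For the dynamic topic model mentioned first, the dynamics are roughly the following (my notation, a sketch rather than the paper's full specification): each topic's natural parameters follow a Gaussian random walk across time slices, with words drawn through a softmax.

```latex
\beta_{t,k} \mid \beta_{t-1,k} \;\sim\; \mathcal{N}\big(\beta_{t-1,k},\, \sigma^2 I\big),
\qquad
p(w \mid \beta_{t,k}) \;=\; \frac{\exp(\beta_{t,k,w})}{\sum_{w'} \exp(\beta_{t,k,w'})}.
```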
6 0.49551049 403 hunch net-2010-07-18-ICML & COLT 2010
7 0.48281154 188 hunch net-2006-06-30-ICML papers
8 0.46256679 144 hunch net-2005-12-28-Yet more nips thoughts
9 0.46239778 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
10 0.45402041 139 hunch net-2005-12-11-More NIPS Papers
11 0.45005882 309 hunch net-2008-07-10-Interesting papers, ICML 2008
12 0.44737846 276 hunch net-2007-12-10-Learning Track of International Planning Competition
13 0.44504222 439 hunch net-2011-08-01-Interesting papers at COLT 2011
14 0.44042155 385 hunch net-2009-12-27-Interesting things at NIPS 2009
15 0.42759708 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification
16 0.41686475 58 hunch net-2005-04-21-Dynamic Programming Generalizations and Their Use
17 0.41598186 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three
18 0.41501969 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS
19 0.41125736 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design
20 0.40867007 114 hunch net-2005-09-20-Workshop Proposal: Atomic Learning
simIndex simValue blogId blogTitle
same-blog 1 0.88102734 192 hunch net-2006-07-08-Some recent papers
2 0.72943938 296 hunch net-2008-04-21-The Science 2.0 article
Introduction: I found the article about science using modern tools interesting, especially the part about ‘blogophobia’, which in my experience is often a substantial issue: many potential guest posters aren’t quite ready, because of the fear of a permanent public mistake, because it is particularly hard to write about the unknown (the essence of research), and because the system for public credit doesn’t yet really handle blog posts. So far, science has been relatively resistant to discussing research on blogs. Some things need to change to get there. Public tolerance of the occasional mistake is essential, as is a willingness to cite (and credit) blogs as freely as papers. I’ve often run into another reason for holding back myself: I don’t want to overtalk my own research. Nevertheless, I’m slowly changing to the opinion that I’m holding back too much: the real power of a blog in research is that it can be used to confer with many people, and that just makes research work better.
3 0.70688289 180 hunch net-2006-05-21-NIPS paper evaluation criteria
Introduction: John Platt, who is PC-chair for NIPS 2006, has organized a NIPS paper evaluation criteria document with input from the program committee and others. The document contains specific advice about what is appropriate for the various subareas within NIPS. It may be very helpful, because the standards of evaluation for papers varies significantly. This is a bit of an experiment: the hope is that by carefully thinking about and stating what is important, authors can better understand whether and where their work fits. Update: The general submission page and Author instructions including how to submit an appendix.
4 0.54427552 463 hunch net-2012-05-02-ICML: Behind the Scenes
Introduction: This is a rather long post, detailing the ICML 2012 review process. The goal is to make the process more transparent, help authors understand how we came to a decision, and discuss the strengths and weaknesses of this process for future conference organizers.

Microsoft’s Conference Management Toolkit (CMT): We chose to use CMT over other conference management software mainly because of its rich toolkit. The interface is sub-optimal (to say the least!) but it has extensive capabilities (to handle bids, author response, resubmissions, etc.), good import/export mechanisms (to process the data elsewhere), excellent technical support (to answer late night emails, add new functionalities). Overall, it was the right choice, although we hope a designer will look at that interface sometime soon!

Toronto Matching System (TMS): TMS is now being used by many major conferences in our field (including NIPS and UAI). It is an automated system (developed by Laurent Charlin and Rich Ze
5 0.46089193 252 hunch net-2007-07-01-Watchword: Online Learning
Introduction: It turns out that many different people use the term “Online Learning”, and often they don’t have the same definition in mind. Here’s a list of the possibilities I know of.

Online Information Setting: Online learning refers to a problem in which unlabeled data comes, a prediction is made, and then feedback is acquired.

Online Adversarial Setting: Online learning refers to algorithms in the Online Information Setting which satisfy guarantees of the form: “For all possible sequences of observations, the algorithm has regret at most log(number of strategies) with respect to the best strategy in a set.” This is sometimes called online learning with experts.

Online Optimization Constraint: Online learning refers to optimizing a predictor via a learning algorithm which tunes parameters on a per-example basis. This may or may not be applied in the Online Information Setting, and the strategy may or may not satisfy Adversarial setting theory.

Online Computational Constra
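For the adversarial setting above, the canonical algorithm is exponential weights (Hedge). Below is a minimal sketch assuming per-round losses in [0, 1]; note the regret is O(sqrt(T log N)) for general bounded losses with a tuned learning rate, while the log(number of strategies) bound quoted holds in special cases such as log loss.

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Exponential weights over N fixed strategies (minimal sketch).

    loss_rounds: iterable of per-round loss vectors, one loss in [0, 1]
    per strategy. Returns the learner's total expected loss, which exceeds
    the best single strategy's total by O(sqrt(T log N)) for a suitably
    tuned eta.
    """
    weights = None
    total = 0.0
    for losses in loss_rounds:
        if weights is None:
            weights = [1.0] * len(losses)        # uniform prior over strategies
        z = sum(weights)
        probs = [w / z for w in weights]         # play the normalized weights
        total += sum(p * l for p, l in zip(probs, losses))
        # multiplicative update: downweight strategies that did badly
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return total

# Example: two strategies, three rounds; strategy 0 is better overall.
print(hedge([[0.1, 0.9], [0.2, 0.8], [0.0, 1.0]]))
```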
6 0.41037065 41 hunch net-2005-03-15-The State of Tight Bounds
7 0.40123907 403 hunch net-2010-07-18-ICML & COLT 2010
8 0.37266579 77 hunch net-2005-05-29-Maximum Margin Mismatch?
9 0.35659432 134 hunch net-2005-12-01-The Webscience Future
10 0.35621729 478 hunch net-2013-01-07-NYU Large Scale Machine Learning Class
11 0.35277271 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
12 0.34925506 151 hunch net-2006-01-25-1 year
13 0.34544057 309 hunch net-2008-07-10-Interesting papers, ICML 2008
14 0.3437283 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
15 0.34371412 203 hunch net-2006-08-18-Report of MLSS 2006 Taipei
16 0.341968 225 hunch net-2007-01-02-Retrospective
17 0.34053344 174 hunch net-2006-04-27-Conferences, Workshops, and Tutorials
18 0.33930168 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous
19 0.33866367 330 hunch net-2008-12-07-A NIPS paper
20 0.33741403 149 hunch net-2006-01-18-Is Multitask Learning Black-Boxable?