hunch_net hunch_net-2009 hunch_net-2009-385 knowledge-graph by maker-knowledge-mining

385 hunch net-2009-12-27-Interesting things at NIPS 2009


meta info for this blog

Source: html

Introduction: Several papers at NIPS caught my attention. Elad Hazan and Satyen Kale, Online Submodular Optimization: They define an algorithm for online optimization of submodular functions with regret guarantees. This places submodular optimization roughly on par with online convex optimization as tractable settings for online learning. Elad Hazan and Satyen Kale, On Stochastic and Worst-Case Models of Investing: At its core, this is yet another example of modifying worst-case online learning to deal with variance, but the application to financial models is particularly cool and it seems plausibly superior to other common approaches for financial modeling. Mark Palatucci, Dean Pomerleau, Tom Mitchell, and Geoff Hinton, Zero Shot Learning with Semantic Output Codes: The goal here is predicting a label in a multiclass supervised setting where the label never occurs in the training data. They have some basic analysis and also a nice application to fMRI brain reading. Sh
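
The zero-shot result above rests on a simple mechanism: map inputs to a semantic code vector, then label by the nearest known class code, including codes for classes absent from training. Below is a minimal sketch of that general idea (not the paper's exact construction); the class codes, the ridge regressor, and the synthetic data are hypothetical placeholders.

```python
# Minimal sketch of the semantic-output-code idea (not the paper's exact method):
# learn a map from inputs to a semantic feature vector, then label an example
# by the nearest known class code -- including codes for classes never seen
# in training. Class codes and data here are hypothetical placeholders.
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical semantic codes (e.g., binary attributes) for five classes,
# only three of which appear in the training data.
class_codes = {
    "dog":   np.array([1, 1, 0, 0]),
    "cat":   np.array([1, 0, 0, 1]),
    "horse": np.array([1, 1, 1, 0]),
    "truck": np.array([0, 0, 1, 1]),   # unseen at training time
    "boat":  np.array([0, 0, 0, 1]),   # unseen at training time
}

def fit_semantic_regressor(X_train, y_train):
    # Regress from raw features to the semantic code of each training label.
    codes = np.stack([class_codes[y] for y in y_train])
    model = Ridge(alpha=1.0)
    model.fit(X_train, codes)
    return model

def predict_label(model, x):
    # Predict a semantic code, then return the nearest class code overall,
    # which may belong to a class with zero training examples.
    code_hat = model.predict(x.reshape(1, -1))[0]
    return min(class_codes, key=lambda c: np.linalg.norm(class_codes[c] - code_hat))

# Tiny synthetic usage: 30 random examples drawn from the three seen classes.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 10))
y_train = [("dog", "cat", "horse")[i % 3] for i in range(30)]
model = fit_semantic_regressor(X_train, y_train)
print(predict_label(model, rng.normal(size=10)))
```

Because the class codes are defined independently of the training set, predict_label can return "truck" or "boat" even though no training example carried those labels.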


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Elad Hazan and Satyen Kale, Online Submodular Optimization: They define an algorithm for online optimization of submodular functions with regret guarantees. [sent-2, score-0.75]

2 This places submodular optimization roughly on par with online convex optimization as tractable settings for online learning. [sent-3, score-0.972]

3 At its core, this is yet another example of modifying worst-case online learning to deal with variance, but the application to financial models is particularly cool and it seems plausibly superior to other common approaches for financial modeling. [sent-5, score-0.724]

4 Mark Palatucci, Dean Pomerleau, Tom Mitchell, and Geoff Hinton, Zero Shot Learning with Semantic Output Codes: The goal here is predicting a label in a multiclass supervised setting where the label never occurs in the training data. [sent-6, score-0.164]

5 They have some basic analysis and also a nice application to fMRI brain reading. [sent-7, score-0.076]

6 Kamalika Chaudhuri, Daniel Hsu, and Yoav Freund, A Parameter-Free Hedging Algorithm: This paper is about eliminating the learning rate parameter from online learning algorithms. [sent-11, score-0.291]

7 While that’s certainly useful, the approach taken involves a double-exponential rather than a single-exponential potential, which is strange and potentially useful in many other places. (A minimal sketch of the standard exponential-weights update, for contrast, appears at the end of this list.) [sent-12, score-0.212]

8 Bing Bai, Jason Weston, David Grangier, Ronan Collobert, Kunihiko Sadamasa, Yanjun Qi, and Corinna Cortes, Polynomial Semantic Indexing: This is about an empirically improved algorithm for learning ranking functions based on (query, document) content. [sent-13, score-0.205]

9 The sexy Semantic name is justified because it is not based on syntactic matching of query to document. [sent-14, score-0.331]

10 I also found the future publication models discussion interesting. [sent-15, score-0.102]

11 At the workshops, I was deeply confronted with the problem of too many interesting workshops to attend in the given amount of time. [sent-17, score-0.166]

12 Two talks stood out for me: Carlos Guestrin gave a talk in the interactive machine learning workshop on Turning Down the Noise in the Blogosphere by Khalid El-Arini, Gaurav Veda, Dafna Shahaf, and Carlos Guestrin, which I missed at KDD this year. [sent-18, score-0.215]

13 The paper discusses the use of exponential weight online learning algorithms to rerank blog posts based on user-specific interests. [sent-19, score-0.369]

14 It comes with a demonstration website where you can test it out. [sent-20, score-0.155]

15 Leslie Valiant gave a talk on representations and operations on concepts in a brain-like fashion. [sent-21, score-0.321]

16 The style of representation and algorithm involves distributed representations on sparse graphs, an approach which is relatively unfamiliar. [sent-22, score-0.216]

17 Bloom filters and experience in machine learning with learning through hashing functions have sharpened my intuition a bit. (A toy Bloom filter sketch appears at the end of this list.) [sent-23, score-0.123]

18 The talk seemed to cover Memorization and Association on a Realistic Neural Model at Neural Computation as well as A First Experimental Demonstration of Massive Knowledge Infusion at KR. [sent-24, score-0.109]
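
Sentences 6, 7, and 13 above all revolve around exponential-weight online learning, and the Hedging paper's contribution is removing the learning-rate parameter from it. As a reference point, here is a minimal sketch of the standard exponential-weights (Hedge) update with an explicit learning rate eta; this is the classical textbook algorithm, not the double-exponential-potential update from the paper, and the synthetic losses are made up for illustration.

```python
# Classical exponential-weights (Hedge) update over K experts, shown to make
# concrete the learning-rate parameter eta that the parameter-free Hedging
# paper eliminates. This is the standard algorithm, not the paper's method.
import numpy as np

def hedge(loss_rounds, eta=0.5):
    """loss_rounds: list of length-K arrays of expert losses in [0, 1]."""
    K = len(loss_rounds[0])
    log_w = np.zeros(K)                 # log-weights for numerical stability
    total_loss = 0.0
    for losses in loss_rounds:
        losses = np.asarray(losses, dtype=float)
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                    # current distribution over experts
        total_loss += p @ losses        # expected loss suffered this round
        log_w -= eta * losses           # exponential-weight update
    return total_loss, p

# Usage on synthetic losses: expert 0 is slightly better on average.
rng = np.random.default_rng(0)
rounds = [rng.random(3) * np.array([0.8, 1.0, 1.0]) for _ in range(500)]
loss, p = hedge(rounds)
print(loss, p)   # the last-round distribution should concentrate on expert 0
```

The point of the parameter-free approach is that eta here normally has to be tuned against the horizon and the number of experts; the paper's potential function removes that knob.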
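
Sentence 17 leans on Bloom filters and hashing as intuition for sparse distributed representations. A toy Bloom filter (parameters chosen arbitrarily for illustration) shows the basic mechanism: each item is spread across a few hashed bit positions, and membership queries can produce false positives but never false negatives.

```python
# Minimal Bloom filter, included only to illustrate how hashing spreads an
# item across a sparse distributed bit pattern; parameters are arbitrary.
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        # No false negatives; occasional false positives.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("submodular")
print("submodular" in bf, "convex" in bf)   # True, almost certainly False
```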


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('submodular', 0.282), ('semantic', 0.219), ('guestrin', 0.188), ('online', 0.185), ('optimization', 0.16), ('demonstration', 0.155), ('kale', 0.155), ('satyen', 0.155), ('carlos', 0.146), ('financial', 0.139), ('elad', 0.129), ('hazan', 0.129), ('functions', 0.123), ('involves', 0.11), ('talk', 0.109), ('parameter', 0.106), ('representations', 0.106), ('gave', 0.106), ('models', 0.102), ('exponential', 0.102), ('neural', 0.087), ('justified', 0.083), ('sexy', 0.083), ('fmri', 0.083), ('weston', 0.083), ('modifying', 0.083), ('syntactic', 0.083), ('corinna', 0.083), ('cortes', 0.083), ('chaudhuri', 0.083), ('kamalika', 0.083), ('confronted', 0.083), ('activity', 0.083), ('bing', 0.083), ('hedging', 0.083), ('oliver', 0.083), ('workshops', 0.083), ('based', 0.082), ('label', 0.082), ('jason', 0.077), ('association', 0.077), ('indexing', 0.077), ('collobert', 0.077), ('ronan', 0.077), ('mitchell', 0.077), ('realistic', 0.077), ('investing', 0.077), ('valiant', 0.077), ('bloom', 0.077), ('application', 0.076)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999964 385 hunch net-2009-12-27-Interesting things at NIPS 2009


2 0.19300878 186 hunch net-2006-06-24-Online convex optimization at COLT

Introduction: At ICML 2003, Marty Zinkevich proposed the online convex optimization setting and showed that a particular gradient descent algorithm has regret O(T^0.5) with respect to the best predictor, where T is the number of rounds. This seems to be a nice model for online learning, and there has been some significant follow-up work. At COLT 2006 Elad Hazan, Adam Kalai, Satyen Kale, and Amit Agarwal presented a modification which takes a Newton step, guaranteeing O(log T) regret when the first and second derivatives are bounded. Then they applied these algorithms to portfolio management at ICML 2006 (with Robert Schapire), yielding some very fun graphs. (A minimal sketch of the basic gradient descent update appears at the end of this list.)

3 0.1462139 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

Introduction: I learned a number of things at NIPS. The financial people were there in greater force than previously. Two Sigma sponsored NIPS while DRW Trading had a booth. The adversarial machine learning workshop had a number of talks about interesting applications where an adversary really is out to try and mess up your learning algorithm. This is very different from the situation we often think of, where the world is oblivious to our learning. This may present new and convincing applications for the learning-against-an-adversary work common at COLT. There were several interesting papers. Sanjoy Dasgupta, Daniel Hsu, and Claire Monteleoni had a paper on General Agnostic Active Learning. The basic idea is that active learning can be done via reduction to a form of supervised learning problem. This is great, because we have many supervised learning algorithms from which the benefits of active learning may be derived. Joseph Bradley and Robert Schapire had a P

4 0.13716364 403 hunch net-2010-07-18-ICML & COLT 2010

Introduction: The papers which interested me most at ICML and COLT 2010 were: Thomas Walsh, Kaushik Subramanian, Michael Littman and Carlos Diuk, Generalizing Apprenticeship Learning across Hypothesis Classes. This paper formalizes and provides algorithms with guarantees for mixed-mode apprenticeship and traditional reinforcement learning algorithms, allowing RL algorithms that perform better than either setting alone. István Szita and Csaba Szepesvári, Model-based reinforcement learning with nearly tight exploration complexity bounds. This paper and another represent the frontier of best-known algorithms for Reinforcement Learning in a Markov Decision Process. James Martens, Deep learning via Hessian-free optimization. About a new not-quite-online second-order gradient algorithm for learning deep functional structures. Potentially this is very powerful because while people have often talked about end-to-end learning, it has rarely worked in practice. Chrisoph

5 0.13576025 334 hunch net-2009-01-07-Interesting Papers at SODA 2009

Introduction: Several talks seem potentially interesting to ML folks at this year’s SODA. Maria-Florina Balcan, Avrim Blum, and Anupam Gupta, Approximate Clustering without the Approximation. This paper gives reasonable algorithms with provable approximation guarantees for k-median and other notions of clustering. It’s conceptually interesting, because it’s the second example I’ve seen where NP hardness is subverted by changing the problem definition in a subtle but reasonable way. Essentially, they show that if any near-approximation to an optimal solution is good, then it’s computationally easy to find a near-optimal solution. This subtle shift bears serious thought. A similar one occurred in our ranking paper with respect to minimum feedback arc set. With two known examples, it suggests that many more NP-complete problems might be finessed into irrelevance in this style. Yury Lifshits and Shengyu Zhang, Combinatorial Algorithms for Nearest Neighbors, Near-Duplicates, and Smal

6 0.1294793 252 hunch net-2007-07-01-Watchword: Online Learning

7 0.12327261 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

8 0.11756741 267 hunch net-2007-10-17-Online as the new adjective

9 0.11611968 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

10 0.11295186 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

11 0.10392161 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

12 0.10389647 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011

13 0.10229196 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

14 0.10149378 448 hunch net-2011-10-24-2011 ML symposium and the bears

15 0.1000608 388 hunch net-2010-01-24-Specializations of the Master Problem

16 0.098983444 293 hunch net-2008-03-23-Interactive Machine Learning

17 0.096732065 375 hunch net-2009-10-26-NIPS workshops

18 0.096415795 444 hunch net-2011-09-07-KDD and MUCMD 2011

19 0.095374085 309 hunch net-2008-07-10-Interesting papers, ICML 2008

20 0.092677757 308 hunch net-2008-07-06-To Dual or Not
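
The online convex optimization entry above (item 2) describes Zinkevich's projected gradient descent with O(T^0.5) regret. A minimal sketch of that update follows, using squared loss and a unit-ball constraint purely as illustrative choices; the Newton-step variant with O(log T) regret is not shown.

```python
# Minimal sketch of Zinkevich-style online gradient descent for online convex
# optimization: at round t, play w_t, observe a convex loss, and take a
# projected gradient step with rate ~ 1/sqrt(t). Squared loss and the unit
# ball constraint are illustrative choices, not part of the general setting.
import numpy as np

def online_gradient_descent(stream, dim, radius=1.0):
    """stream: iterable of (x_t, y_t) pairs; loss_t(w) = 0.5 * (w @ x_t - y_t)**2."""
    w = np.zeros(dim)
    cumulative_loss = 0.0
    for t, (x, y) in enumerate(stream, start=1):
        pred = w @ x
        cumulative_loss += 0.5 * (pred - y) ** 2
        grad = (pred - y) * x               # gradient of the round-t loss at w
        w = w - grad / np.sqrt(t)           # step size eta_t = 1 / sqrt(t)
        norm = np.linalg.norm(w)
        if norm > radius:                   # project back onto the ball
            w = w * (radius / norm)
    return w, cumulative_loss
```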


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.227), (1, 0.051), (2, -0.048), (3, -0.098), (4, 0.146), (5, 0.057), (6, -0.014), (7, -0.123), (8, -0.001), (9, 0.078), (10, 0.072), (11, 0.03), (12, -0.082), (13, -0.075), (14, 0.045), (15, 0.084), (16, -0.036), (17, 0.008), (18, 0.077), (19, 0.021), (20, -0.019), (21, -0.069), (22, -0.004), (23, 0.031), (24, 0.09), (25, 0.033), (26, 0.052), (27, 0.015), (28, 0.0), (29, 0.038), (30, -0.079), (31, 0.049), (32, -0.007), (33, 0.02), (34, -0.041), (35, -0.017), (36, -0.015), (37, -0.035), (38, -0.065), (39, -0.086), (40, 0.045), (41, -0.013), (42, 0.02), (43, 0.068), (44, -0.043), (45, -0.007), (46, -0.069), (47, -0.019), (48, -0.007), (49, -0.079)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95257366 385 hunch net-2009-12-27-Interesting things at NIPS 2009


2 0.70762134 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009

Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009. Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations. Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective. There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML and the other was Thoma

3 0.70580584 186 hunch net-2006-06-24-Online convex optimization at COLT

Introduction: At ICML 2003, Marty Zinkevich proposed the online convex optimization setting and showed that a particular gradient descent algorithm has regret O(T^0.5) with respect to the best predictor, where T is the number of rounds. This seems to be a nice model for online learning, and there has been some significant follow-up work. At COLT 2006 Elad Hazan, Adam Kalai, Satyen Kale, and Amit Agarwal presented a modification which takes a Newton step, guaranteeing O(log T) regret when the first and second derivatives are bounded. Then they applied these algorithms to portfolio management at ICML 2006 (with Robert Schapire), yielding some very fun graphs.

4 0.67170346 252 hunch net-2007-07-01-Watchword: Online Learning

Introduction: It turns out that many different people use the term “Online Learning”, and often they don’t have the same definition in mind. Here’s a list of the possibilities I know of. Online Information Setting: Online learning refers to a problem in which unlabeled data comes, a prediction is made, and then feedback is acquired. Online Adversarial Setting: Online learning refers to algorithms in the Online Information Setting which satisfy guarantees of the form: “For all possible sequences of observations, the algorithm has regret at most log(number of strategies) with respect to the best strategy in a set.” This is sometimes called online learning with experts. Online Optimization Constraint: Online learning refers to optimizing a predictor via a learning algorithm that tunes parameters on a per-example basis. This may or may not be applied in the Online Information Setting, and the strategy may or may not satisfy Adversarial setting theory. Online Computational Constra

5 0.66879678 308 hunch net-2008-07-06-To Dual or Not

Introduction: Yoram and Shai’s online learning tutorial at ICML brings up a question for me, “Why use the dual?” The basic setting is learning a weight vector w_i so that the function f(x) = sum_i w_i x_i optimizes some convex loss function. The functional view of the dual is that instead of (or in addition to) keeping track of w_i over the feature space, you keep track of a vector a_j over the examples and define w_i = sum_j a_j x_ji. (A small numeric check of this identity appears at the end of this list.) The above view of duality makes operating in the dual appear unnecessary, because in the end a weight vector is always used. The tutorial suggests that thinking about the dual gives a unified algorithmic font for deriving online learning algorithms. I haven’t worked with the dual representation much myself, but I have seen a few examples where it appears helpful. Noise: When doing online optimization (i.e. online learning where you are allowed to look at individual examples multiple times), the dual representation may be helpfu

6 0.64866143 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

7 0.63424194 279 hunch net-2007-12-19-Cool and interesting things seen at NIPS

8 0.62439358 267 hunch net-2007-10-17-Online as the new adjective

9 0.61541647 309 hunch net-2008-07-10-Interesting papers, ICML 2008

10 0.60963786 403 hunch net-2010-07-18-ICML & COLT 2010

11 0.59952158 439 hunch net-2011-08-01-Interesting papers at COLT 2011

12 0.59945786 188 hunch net-2006-06-30-ICML papers

13 0.57564288 378 hunch net-2009-11-15-The Other Online Learning

14 0.56837064 8 hunch net-2005-02-01-NIPS: Online Bayes

15 0.5618102 334 hunch net-2009-01-07-Interesting Papers at SODA 2009

16 0.55485618 310 hunch net-2008-07-15-Interesting papers at COLT (and a bit of UAI & workshops)

17 0.54903322 192 hunch net-2006-07-08-Some recent papers

18 0.53765643 258 hunch net-2007-08-12-Exponentiated Gradient

19 0.53338575 398 hunch net-2010-05-10-Aggregation of estimators, sparsity in high dimension and computational feasibility

20 0.53060675 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability
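
The "To Dual or Not" entry above (item 5) states the functional view of the dual: per-example coefficients a_j induce the weights w_i = sum_j a_j x_ji. The small numeric check below, on made-up data, confirms that predictions computed from w and predictions computed from inner products with the training examples agree.

```python
# Numeric illustration of the primal/dual views described in the
# "To Dual or Not" entry above: a per-example coefficient vector a induces
# the weight vector w_i = sum_j a_j * x_ji, so predictions can be computed
# either from w or from inner products with the training examples.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))      # 5 hypothetical examples, 3 features
a = rng.normal(size=5)           # dual coefficients, one per example

w = X.T @ a                      # primal weights: w_i = sum_j a_j x_ji

x_new = rng.normal(size=3)
primal_pred = w @ x_new                     # f(x) = sum_i w_i x_i
dual_pred = a @ (X @ x_new)                 # f(x) = sum_j a_j <x_j, x>
print(np.isclose(primal_pred, dual_pred))   # True
```

The dual form touches the data only through inner products with training examples, which is the property that lets kernels replace those inner products.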


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(3, 0.022), (10, 0.023), (27, 0.211), (30, 0.014), (38, 0.059), (48, 0.011), (49, 0.012), (53, 0.039), (55, 0.091), (64, 0.015), (66, 0.332), (77, 0.015), (83, 0.013), (94, 0.041), (95, 0.031)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.8539905 374 hunch net-2009-10-10-ALT 2009

Introduction: I attended ALT (“Algorithmic Learning Theory”) for the first time this year. My impression is ALT = 0.5 COLT, by attendance and also by some more intangible “what do I get from it?” measure. There are many differences which can’t quite be described this way though. The program for ALT seems to be substantially more diverse than COLT, which is both a weakness and a strength. One paper that might interest people generally is: Alexey Chernov and Vladimir Vovk, Prediction with Expert Evaluators’ Advice. The basic observation here is that in the online learning with experts setting you can compete with several compatible loss functions simultaneously. Restated, debating between competing with log loss and squared loss is a waste of breath, because it’s almost free to compete with them both simultaneously. This might interest anyone who has run into “which loss function?” debates that come up periodically.

same-blog 2 0.8511315 385 hunch net-2009-12-27-Interesting things at NIPS 2009


3 0.76388383 191 hunch net-2006-07-08-MaxEnt contradicts Bayes Rule?

Introduction: A few weeks ago I read this. David Blei and I spent some time thinking hard about this a few years back (thanks to Kary Myers for pointing us to it): In short, I was thinking that “bayesian belief updating” and “maximum entropy” were two orthogonal principles. But it appears that they are not, and that they can even be in conflict! Example (from Kass 1996): consider a die (6 sides), and consider prior knowledge E[X] = 3.5. Maximum entropy leads to P(X) = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6). Now consider a new piece of evidence A = “X is an odd number”. The Bayesian posterior P(X|A) ∝ P(A|X) P(X), giving (1/3, 0, 1/3, 0, 1/3, 0). But MaxEnt with the constraints E[X] = 3.5 and E[indicator function of A] = 1 leads to (.22, 0, .32, 0, .47, 0)!! (Note that E[indicator function of A] = P(A); a short script reproducing these numbers appears at the end of this list.) Indeed, for MaxEnt, because there is no more ‘6’, big numbers must be more probable to ensure an average of 3.5. For Bayesian updating, P(X|A) doesn’t have to have a 3.5

4 0.59015518 343 hunch net-2009-02-18-Decision by Vetocracy

Introduction: Few would mistake the process of academic paper review for a fair process, but sometimes the unfairness seems particularly striking. This is most easily seen by comparing the Banditron and Offset Tree papers. Problem scope: the Banditron handles multiclass problems where only the loss of one choice can be probed; the Offset Tree handles a strictly greater class, cost-sensitive multiclass problems where only the loss of one choice can be probed. Often generalizations don’t matter, but that’s not the case here, since every plausible application I’ve thought of involves loss functions substantially different from 0/1. What’s new: the Banditron paper contributes analysis and experiments; the Offset Tree paper contributes the algorithm, analysis, and experiments. As far as I know, the essence of the more general problem was first stated and analyzed with the EXP4 algorithm (page 16) (1998). It’s also the time horizon 1 simplification of the Reinforcement Learning setting for the random trajectory method (page 15) (2002). The Banditron algorithm itself is functionally identi

5 0.58589244 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc…) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod

6 0.58031714 230 hunch net-2007-02-02-Thoughts regarding “Is machine learning different from statistics?”

7 0.57809931 320 hunch net-2008-10-14-Who is Responsible for a Bad Review?

8 0.57804847 406 hunch net-2010-08-22-KDD 2010

9 0.57744044 225 hunch net-2007-01-02-Retrospective

10 0.57678455 259 hunch net-2007-08-19-Choice of Metrics

11 0.57654732 360 hunch net-2009-06-15-In Active Learning, the question changes

12 0.57636714 309 hunch net-2008-07-10-Interesting papers, ICML 2008

13 0.57568818 325 hunch net-2008-11-10-ICML Reviewing Criteria

14 0.57539463 51 hunch net-2005-04-01-The Producer-Consumer Model of Research

15 0.57510328 235 hunch net-2007-03-03-All Models of Learning have Flaws

16 0.57481891 220 hunch net-2006-11-27-Continuizing Solutions

17 0.57432818 98 hunch net-2005-07-27-Not goal metrics

18 0.57360005 378 hunch net-2009-11-15-The Other Online Learning

19 0.57299435 351 hunch net-2009-05-02-Wielding a New Abstraction

20 0.57286024 95 hunch net-2005-07-14-What Learning Theory might do
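
The MaxEnt-versus-Bayes entry above (item 3) gives concrete numbers for the die example. The short script below is my own check of that arithmetic: it reproduces the Bayesian posterior (1/3, 0, 1/3, 0, 1/3, 0) and recovers the MaxEnt solution of roughly (.22, 0, .32, 0, .47, 0) by solving for the exponential tilting parameter with bisection.

```python
# Reproduces the numbers in the MaxEnt-vs-Bayes die example above.
# Bayes: start from the uniform prior and condition on "X is odd".
# MaxEnt: the max-entropy distribution with support {1,3,5} and mean 3.5
# has the form p(x) proportional to exp(lam * x); solve for lam by bisection.
import numpy as np

# Bayesian update of the uniform prior on {1,...,6} given A = "X is odd".
prior = np.full(6, 1 / 6)
likelihood = np.array([1, 0, 1, 0, 1, 0], dtype=float)
posterior = prior * likelihood
posterior /= posterior.sum()
print(posterior)          # [1/3, 0, 1/3, 0, 1/3, 0]

# MaxEnt with constraints P(odd) = 1 and E[X] = 3.5.
support = np.array([1, 3, 5], dtype=float)

def mean_for(lam):
    p = np.exp(lam * support)
    p /= p.sum()
    return p @ support, p

lo, hi = -10.0, 10.0
for _ in range(100):                  # bisection on the tilting parameter
    mid = (lo + hi) / 2
    m, p = mean_for(mid)
    if m < 3.5:
        lo = mid
    else:
        hi = mid
print(np.round(p, 2))     # approximately [0.22, 0.32, 0.47] on {1, 3, 5}
```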