hunch_net hunch_net-2005 hunch_net-2005-82 knowledge-graph by maker-knowledge-mining

82 hunch net-2005-06-17-Reopening RL->Classification


meta info for this blog

Source: html

Introduction: In research, it’s often the case that solving a problem helps you realize that it wasn’t the right problem to solve. This is the case for the “reduce RL to classification” problem, with the solution hinted at here and turned into a paper here. The essential difficulty is that the method of stating and analyzing reductions ends up being nonalgorithmic (unlike previous reductions) unless you work with learning from teleoperated robots, as Greg Grudic does. The difficulty here is that the reduction depends on the optimal policy (which a human teleoperator might simulate, but which is otherwise unavailable). So, this problem is “open” again, with the caveat that this time we want a more algorithmic solution. Whether or not this is feasible at all is still unclear, and evidence in either direction would greatly interest me. A positive answer might have many practical implications in the long run.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 In research, it’s often the case that solving a problem helps you realize that it wasn’t the right problem to solve. [sent-1, score-0.861]

2 This is the case for the “reduce RL to classification” problem with the solution hinted at here and turned into a paper here. [sent-2, score-0.83]

3 The essential difficulty is that the method of stating and analyzing reductions ends up being nonalgorithmic (unlike previous reductions) unless you work with learning from teleoperated robots as Greg Grudic does. [sent-3, score-1.63]

4 The difficulty here is due to the reduction being dependent on the optimal policy (which a human teleoperator might simulate, but which is otherwise unavailable). [sent-4, score-1.096]

5 So, this problem is “open” again with the caveat that this time we want a more algorithmic solution. [sent-5, score-0.485]

6 Whether or not this is feasible at all is still unclear and evidence in either direction would greatly interest me. [sent-6, score-0.92]

7 A positive answer might have many practical implications in the long run. [sent-7, score-0.665]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('robots', 0.231), ('reductions', 0.215), ('greg', 0.214), ('simulate', 0.214), ('difficulty', 0.211), ('turned', 0.193), ('ends', 0.185), ('stating', 0.179), ('caveat', 0.173), ('feasible', 0.168), ('problem', 0.168), ('case', 0.164), ('analyzing', 0.16), ('unless', 0.156), ('implications', 0.153), ('realize', 0.15), ('wasn', 0.15), ('rl', 0.15), ('unlike', 0.144), ('algorithmic', 0.144), ('dependent', 0.142), ('positive', 0.133), ('unclear', 0.133), ('direction', 0.127), ('helps', 0.125), ('reduce', 0.124), ('policy', 0.121), ('otherwise', 0.112), ('practical', 0.111), ('human', 0.111), ('optimal', 0.111), ('reduction', 0.111), ('greatly', 0.111), ('essential', 0.105), ('evidence', 0.104), ('might', 0.101), ('previous', 0.098), ('open', 0.098), ('whether', 0.098), ('run', 0.097), ('still', 0.095), ('either', 0.095), ('classification', 0.095), ('method', 0.09), ('interest', 0.087), ('solving', 0.086), ('solution', 0.086), ('answer', 0.085), ('long', 0.082), ('due', 0.076)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0 82 hunch net-2005-06-17-Reopening RL->Classification


2 0.15867627 83 hunch net-2005-06-18-Lower Bounds for Learning Reductions

Introduction: Learning reductions transform a solver of one type of learning problem into a solver of another type of learning problem. When we analyze these for robustness, we can make statements of the form “Reduction R has the property that regret r (or loss) on subproblems of type A implies regret at most f(r) on the original problem of type B”. A lower bound for a learning reduction would have the form “for all reductions R, there exists a learning problem of type B and a learning algorithm for problems of type A where regret r on induced problems implies at least regret f(r) for B”. The pursuit of lower bounds is often questionable because, unlike upper bounds, they do not yield practical algorithms. Nevertheless, they may be helpful as a tool for thinking about what is learnable and how learnable it is. This has already come up here and here. At the moment, there is no coherent theory of lower bounds for learning reductions, and we have little understa

3 0.158136 14 hunch net-2005-02-07-The State of the Reduction

Introduction: What? Reductions are machines which turn solvers for one problem into solvers for another problem. Why? Reductions are useful for several reasons. Laziness. Reducing a problem to classification makes at least 10 learning algorithms available to solve the problem. Inventing 10 learning algorithms is quite a bit of work. Similarly, programming a reduction is often trivial, while programming a learning algorithm is a great deal of work. Crystallization. The problems we often want to solve in learning are worst-case-impossible but average-case feasible. By reducing all problems onto one or a few primitives, we can fine-tune these primitives to perform well on real-world problems with greater precision, due to the greater number of problems to validate on. Theoretical Organization. By studying which reductions are easy vs. hard vs. impossible, we can learn which problems are roughly equivalent in difficulty and which are much harder. What we know now. Typesafe r

4 0.13153504 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

Introduction: A type of prediction problem is specified by the type of samples produced by a data source (Example: X x {0,1}, X x [0,1], X x {1,2,3,4,5}, etc…) and a loss function (0/1 loss, squared error loss, cost sensitive losses, etc…). For simplicity, we’ll assume that all losses have a minimum of zero. For this post, we can think of a learning reduction as: a mapping R from samples of one type T (like multiclass classification) to another type T’; and a mapping Q from predictors for type T’ to predictors for type T. The simplest sort of learning reduction is a “loss reduction”. The idea in a loss reduction is to prove a statement of the form: Theorem. For all base predictors b, for all distributions D over examples of type T: E_{(x,y)~D} L_T(y, Q(b,x)) <= f( E_{(x’,y’)~R(D)} L_{T’}(y’, b(x’)) ). Here L_T is the loss for the type T problem and L_{T’} is the loss for the type T’ problem. Also, R(D) is the distribution ov
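The loss-reduction guarantee quoted in the excerpt above, using the excerpt's own symbols (R maps samples of type T to type T’, Q maps predictors back, and L_T, L_{T’} are the two losses), can be typeset as:

```latex
\mathbb{E}_{(x,y)\sim D}\, L_T\!\left(y,\, Q(b,x)\right)
\;\le\;
f\!\left( \mathbb{E}_{(x',y')\sim R(D)}\, L_{T'}\!\left(y',\, b(x')\right) \right)
```

That is, bounded loss of the base predictor b on the induced (type T’) problem implies bounded loss of the composed predictor Q(b, ·) on the original (type T) problem.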

5 0.11600762 100 hunch net-2005-08-04-Why Reinforcement Learning is Important

Introduction: One prescription for solving a problem well is: State the problem, in the simplest way possible. In particular, this statement should involve no contamination with or anticipation of the solution. Think about solutions to the stated problem. Stating a problem in a succinct and crisp manner tends to invite a simple elegant solution. When a problem cannot be stated succinctly, we wonder if the problem is even understood. (And when a problem is not understood, we wonder if a solution can be meaningful.) Reinforcement learning does step (1) well. It provides a clean simple language to state general AI problems. In reinforcement learning there is a set of actions A, a set of observations O, and a reward r. The reinforcement learning problem, in general, is defined by a conditional measure D(o, r | (o,r,a)*) which produces an observation o and a reward r given a history (o,r,a)*. The goal in reinforcement learning is to find a policy pi: (o,r,a)* -> a

6 0.11277928 351 hunch net-2009-05-02-Wielding a New Abstraction

7 0.10696791 103 hunch net-2005-08-18-SVM Adaptability

8 0.099687263 235 hunch net-2007-03-03-All Models of Learning have Flaws

9 0.096859984 181 hunch net-2006-05-23-What is the best regret transform reduction from multiclass to binary?

10 0.094653949 358 hunch net-2009-06-01-Multitask Poisoning

11 0.093783505 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

12 0.090478964 78 hunch net-2005-06-06-Exact Online Learning for Classification

13 0.089552902 454 hunch net-2012-01-30-ICML Posters and Scope

14 0.085101701 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

15 0.083845749 332 hunch net-2008-12-23-Use of Learning Theory

16 0.083768837 343 hunch net-2009-02-18-Decision by Vetocracy

17 0.083322287 220 hunch net-2006-11-27-Continuizing Solutions

18 0.081779703 269 hunch net-2007-10-24-Contextual Bandits

19 0.081015088 31 hunch net-2005-02-26-Problem: Reductions and Relative Ranking Metrics

20 0.079853214 91 hunch net-2005-07-10-Thinking the Unthought


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.186), (1, 0.05), (2, -0.0), (3, 0.016), (4, -0.061), (5, -0.024), (6, 0.17), (7, 0.006), (8, -0.06), (9, -0.007), (10, -0.076), (11, -0.067), (12, -0.056), (13, 0.117), (14, 0.027), (15, -0.009), (16, 0.091), (17, -0.063), (18, -0.019), (19, -0.02), (20, -0.062), (21, 0.006), (22, -0.057), (23, -0.073), (24, 0.027), (25, 0.094), (26, 0.013), (27, 0.093), (28, 0.03), (29, 0.071), (30, 0.093), (31, -0.007), (32, -0.03), (33, -0.031), (34, -0.013), (35, -0.005), (36, -0.046), (37, 0.001), (38, 0.038), (39, 0.005), (40, 0.04), (41, 0.073), (42, 0.069), (43, 0.064), (44, 0.003), (45, -0.017), (46, -0.019), (47, -0.008), (48, -0.02), (49, -0.015)]
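LSI topic weights like the (topicId, topicWeight) pairs above can be derived by compressing TF-IDF vectors with a truncated SVD (a standard LSI implementation) and comparing documents by cosine similarity. A hypothetical sketch, assuming scikit-learn; the corpus and component count are illustrative stand-ins for the real setup:

```python
# Sketch: LSI-style document vectors via TF-IDF + truncated SVD,
# then similarity ranking by cosine. Example corpus, not real data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "reduce reinforcement learning to classification",
    "reinforcement learning with classification reductions",
    "conference posters workshops and tutorials",
]

tfidf = TfidfVectorizer().fit_transform(corpus)
lsi = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Similarity of the first post to every post, including itself.
sims = cosine_similarity(lsi[0:1], lsi).ravel()
print(sims.round(3))
```

Each row of `lsi` is a dense topic-weight vector for one post; the “same-blog” entry scoring near 1.0 in the lists above is just a document compared against itself.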

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97505915 82 hunch net-2005-06-17-Reopening RL->Classification


2 0.69254494 100 hunch net-2005-08-04-Why Reinforcement Learning is Important


3 0.66811687 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

Introduction: At an intuitive level, the question here is “Can reinforcement learning be solved with classification?” Problem: Construct a reinforcement learning algorithm with near-optimal expected sum of rewards in the direct experience model, given access to a classifier learning algorithm which has a small error rate or regret on all posed classification problems. The definition of “posed” here is slightly murky. I consider a problem “posed” if there is an algorithm for constructing labeled classification examples. Past Work: There exists a reduction of reinforcement learning to classification given a generative model. A generative model is an inherently stronger assumption than the direct experience model. Other work on learning reductions may be important. Several algorithms for solving reinforcement learning in the direct experience model exist. Most, such as E^3, Factored-E^3, metric-E^3, and Rmax, require that the observation be the state. Recent work

4 0.66247576 68 hunch net-2005-05-10-Learning Reductions are Reductionist

Introduction: This is about a fundamental motivation for the investigation of reductions in learning. It applies to many pieces of work other than my own. The reductionist approach to problem solving is characterized by taking a problem, decomposing it into as-small-as-possible subproblems, discovering how to solve the subproblems, and then discovering how to use the solutions to the subproblems to solve larger problems. The reductionist approach to solving problems has often paid off very well. Computer science related examples of the reductionist approach include: Reducing computation to the transistor. All of our CPUs are built from transistors. Reducing rendering of images to rendering a triangle (or other simple polygons). Computers can now render near-realistic scenes in real time. The big breakthrough came from learning how to render many triangles quickly. This approach to problem solving extends well beyond computer science. Many fields of science focus on theories mak

5 0.66119534 83 hunch net-2005-06-18-Lower Bounds for Learning Reductions


6 0.66062683 292 hunch net-2008-03-15-COLT Open Problems

7 0.64273351 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

8 0.63606924 14 hunch net-2005-02-07-The State of the Reduction

9 0.62441123 458 hunch net-2012-03-06-COLT-ICML Open Questions and ICML Instructions

10 0.59911478 103 hunch net-2005-08-18-SVM Adaptability

11 0.58669519 29 hunch net-2005-02-25-Solution: Reinforcement Learning with Classification

12 0.55219692 391 hunch net-2010-03-15-The Efficient Robust Conditional Probability Estimation Problem

13 0.54720384 276 hunch net-2007-12-10-Learning Track of International Planning Competition

14 0.54192859 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

15 0.53932232 358 hunch net-2009-06-01-Multitask Poisoning

16 0.53065073 58 hunch net-2005-04-21-Dynamic Programming Generalizations and Their Use

17 0.52833599 351 hunch net-2009-05-02-Wielding a New Abstraction

18 0.52289081 370 hunch net-2009-09-18-Necessary and Sufficient Research

19 0.519023 429 hunch net-2011-04-06-COLT open questions

20 0.51525289 181 hunch net-2006-05-23-What is the best regret transform reduction from multiclass to binary?


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.243), (34, 0.262), (38, 0.147), (53, 0.099), (55, 0.115), (94, 0.019)]
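The LDA (topicId, topicWeight) pairs above are a document's topic distribution under a latent Dirichlet allocation model. A hypothetical sketch, assuming scikit-learn: fit LDA on raw term counts and read off each post's topic mixture. Corpus size and topic count are illustrative, not the real configuration:

```python
# Sketch: per-document topic distributions from LDA over term counts.
# Example corpus and 2 topics stand in for the real archive and model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

corpus = [
    "reinforcement learning reduction to classification",
    "classification reduction regret bounds for learning",
    "conference posters registration and workshop schedule",
]

counts = CountVectorizer().fit_transform(corpus)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_weights = lda.fit_transform(counts)  # one row per post; rows sum to 1
print(topic_weights.round(3))
```

Unlike LSI, LDA weights are nonnegative and sum to 1 per document, which is why only a handful of topics carry noticeable weight in the list above.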

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.94435954 82 hunch net-2005-06-17-Reopening RL->Classification


2 0.92245096 153 hunch net-2006-02-02-Introspectionism as a Disease

Introduction: In the AI-related parts of machine learning, it is often tempting to examine how you do things in order to imagine how a machine should do things. This is introspection, and it can easily go awry. I will call introspection gone awry introspectionism. Introspectionism is almost unique to AI (and the AI-related parts of machine learning) and it can lead to huge wasted effort in research. It’s easiest to show how introspectionism arises by an example. Suppose we want to solve the problem of navigating a robot from point A to point B given a camera. Then, the following research action plan might seem natural when you examine your own capabilities: Build an edge detector for still images. Build an object recognition system given the edge detector. Build a system to predict distance and orientation to objects given the object recognition system. Build a system to plan a path through the scene you construct from {object identification, distance, orientation} predictions.

3 0.90510434 415 hunch net-2010-10-28-NY ML Symposium 2010

Introduction: About 200 people attended the 2010 NYAS ML Symposium this year. (It was about 170 last year.) I particularly enjoyed several talks. Yann has a new live demo of (limited) real-time object recognition learning. Sanjoy gave a fairly convincing and comprehensible explanation of why a modified form of single-linkage clustering is consistent in higher dimensions, and why consistency is a critical feature for clustering algorithms. I’m curious how well this algorithm works in practice. Matt Hoffman’s poster covering online LDA seemed pretty convincing to me as an algorithmic improvement. This year, we allocated more time towards posters & poster spotlights. For next year, we are considering some further changes. The format has traditionally been 4 invited Professor speakers, with posters and poster spotlight for students. Demand from other parties to participate is growing, for example from postdocs and startups in the area. Another growing concern is the fa

4 0.88296264 188 hunch net-2006-06-30-ICML papers

Introduction: Here are some ICML papers which interested me. Arindam Banerjee had a paper which notes that PAC-Bayes bounds, a core theorem in online learning, and the optimality of Bayesian learning statements share a core inequality in their proof. Pieter Abbeel, Morgan Quigley and Andrew Y. Ng have a paper discussing RL techniques for learning given a bad (but not too bad) model of the world. Nina Balcan and Avrim Blum have a paper which discusses how to learn given a similarity function rather than a kernel. A similarity function requires less structure than a kernel, implying that a learning algorithm using a similarity function might be applied in situations where no effective kernel is evident. Nathan Ratliff, Drew Bagnell, and Marty Zinkevich have a paper describing an algorithm which attempts to fuse A* path planning with learning of transition costs based on human demonstration. Papers (2), (3), and (4) all seem like an initial pass at solving in

5 0.75567597 457 hunch net-2012-02-29-Key Scientific Challenges and the Franklin Symposium

Introduction: For graduate students, the Yahoo! Key Scientific Challenges program, including machine learning, is on again, due March 9. The application is easy and the $5K award is high quality “no strings attached” funding. Consider submitting. Those in Washington DC, Philadelphia, and New York may consider attending the Franklin Institute Symposium April 25, which has several speakers and an award for V . Attendance is free with an RSVP.

6 0.74115628 353 hunch net-2009-05-08-Computability in Artificial Intelligence

7 0.73643589 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

8 0.7354539 131 hunch net-2005-11-16-The Everything Ensemble Edge

9 0.729693 233 hunch net-2007-02-16-The Forgetting

10 0.72250336 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem

11 0.72036099 19 hunch net-2005-02-14-Clever Methods of Overfitting

12 0.71508193 251 hunch net-2007-06-24-Interesting Papers at ICML 2007

13 0.71423835 437 hunch net-2011-07-10-ICML 2011 and the future

14 0.71262383 72 hunch net-2005-05-16-Regret minimizing vs error limiting reductions

15 0.70771867 230 hunch net-2007-02-02-Thoughts regarding “Is machine learning different from statistics?”

16 0.70716226 14 hunch net-2005-02-07-The State of the Reduction

17 0.70430446 194 hunch net-2006-07-11-New Models

18 0.70263213 12 hunch net-2005-02-03-Learning Theory, by assumption

19 0.70148301 44 hunch net-2005-03-21-Research Styles in Machine Learning

20 0.70049101 309 hunch net-2008-07-10-Interesting papers, ICML 2008