hunch_net hunch_net-2005 hunch_net-2005-68 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: This is about a fundamental motivation for the investigation of reductions in learning. It applies to many pieces of work other than my own. The reductionist approach to problem solving is characterized by taking a problem, decomposing it into as-small-as-possible subproblems, discovering how to solve the subproblems, and then discovering how to use the solutions to the subproblems to solve larger problems. The reductionist approach to solving problems has often paid off very well. Computer-science-related examples of the reductionist approach include: Reducing computation to the transistor. All of our CPUs are built from transistors. Reducing rendering of images to rendering a triangle (or other simple polygons). Computers can now render near-realistic scenes in real time. The big breakthrough came from learning how to render many triangles quickly. This approach to problem solving extends well beyond computer science. Many fields of science focus on theories making predictions about very simple systems.
sentIndex sentText sentNum sentScore
1 This is about a fundamental motivation for the investigation of reductions in learning. [sent-1, score-0.261]
2 It applies to many pieces of work other than my own. [sent-2, score-0.139]
3 The reductionist approach to problem solving is characterized by taking a problem, decomposing it into as-small-as-possible subproblems, discovering how to solve the subproblems, and then discovering how to use the solutions to the subproblems to solve larger problems. [sent-3, score-1.772]
4 The reductionist approach to solving problems has often paid off very well. [sent-4, score-0.942]
5 Computer-science-related examples of the reductionist approach include: Reducing computation to the transistor. [sent-5, score-0.683]
6 Reducing rendering of images to rendering a triangle (or other simple polygons). [sent-7, score-0.789]
7 Computers can now render near-realistic scenes in real time. [sent-8, score-0.354]
8 The big breakthrough came from learning how to render many triangles quickly. [sent-9, score-0.423]
9 This approach to problem solving extends well beyond computer science. [sent-10, score-0.635]
10 Many fields of science focus on theories making predictions about very simple systems. [sent-11, score-0.516]
11 These predictions are then composed to make predictions about where spacecraft go, how large a cannonball needs to be, etc. Obviously this approach has been quite successful. [sent-12, score-0.651]
12 It is an open question whether or not this approach can really succeed at learning. [sent-13, score-0.354]
13 Against: We know that successful learning requires the incorporation of prior knowledge in fairly arbitrary forms. [sent-14, score-0.784]
14 This suggests that we cannot easily decompose the process of learning. [sent-15, score-0.099]
15 For: We know that humans can succeed at general-purpose learning. [sent-16, score-0.376]
16 It may be that arbitrary prior knowledge is required to solve arbitrary learning problems, but perhaps there are specific learning algorithms incorporating specific prior knowledge capable of solving the specific problems we encounter. [sent-17, score-2.105]
17 Neutral: We know that learning reductions sometimes work. [sent-18, score-0.197]
18 We don’t yet have a good comparison of how well they work relative to other approaches. [sent-19, score-0.069]
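The post does not pin down a specific reduction, so a minimal sketch may help fix ideas: one of the simplest learning reductions is multiclass-to-binary classification via one-versus-all. The class name OneVsAllReduction is made up for illustration, and sklearn's LogisticRegression is just a stand-in for any base binary learner.

```python
# A minimal sketch of a learning reduction: multiclass -> binary via
# one-vs-all. Works with any base binary learner exposing fit(X, y) and
# decision_function(X); LogisticRegression is only an example.
import numpy as np
from sklearn.linear_model import LogisticRegression

class OneVsAllReduction:
    def __init__(self, make_binary_learner):
        self.make_binary_learner = make_binary_learner
        self.learners = {}

    def fit(self, X, y):
        # One binary subproblem per class: "is the label this class or not?"
        for label in np.unique(y):
            learner = self.make_binary_learner()
            learner.fit(X, (y == label).astype(int))
            self.learners[label] = learner
        return self

    def predict(self, X):
        # Compose the subproblem solutions: take the most confident score.
        labels = list(self.learners)
        scores = np.column_stack(
            [self.learners[l].decision_function(X) for l in labels])
        return np.array(labels)[scores.argmax(axis=1)]

# Usage sketch on toy data:
X = np.random.randn(200, 5)
y = X[:, :3].argmax(axis=1)            # 3-class toy labels
clf = OneVsAllReduction(LogisticRegression).fit(X, y)
print((clf.predict(X) == y).mean())    # training accuracy
```

The decomposition mirrors the post's description exactly: the hard problem (multiclass) is cut into as-small-as-possible subproblems (binary decisions), and the subproblem solutions are recombined into a solution for the larger problem.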
wordName wordTfidf (topN-words)
[('reductionist', 0.383), ('render', 0.255), ('rendering', 0.255), ('subproblems', 0.247), ('approach', 0.204), ('arbitrary', 0.195), ('specific', 0.187), ('predictions', 0.171), ('solving', 0.168), ('knowledge', 0.167), ('discovering', 0.161), ('prior', 0.154), ('succeed', 0.15), ('reducing', 0.139), ('solve', 0.119), ('cpus', 0.113), ('payed', 0.113), ('triangle', 0.113), ('reductions', 0.106), ('characterized', 0.105), ('composed', 0.105), ('decomposing', 0.105), ('incorporation', 0.105), ('computer', 0.101), ('breakthrough', 0.099), ('decompose', 0.099), ('scenes', 0.099), ('theories', 0.099), ('science', 0.096), ('extends', 0.095), ('know', 0.091), ('incorporating', 0.088), ('images', 0.085), ('investigation', 0.082), ('simple', 0.081), ('built', 0.077), ('problems', 0.074), ('motivation', 0.073), ('computers', 0.072), ('succesful', 0.072), ('applies', 0.072), ('comparison', 0.069), ('came', 0.069), ('fields', 0.069), ('humans', 0.068), ('pieces', 0.067), ('purpose', 0.067), ('beyond', 0.067), ('obviously', 0.064), ('capable', 0.063)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 68 hunch net-2005-05-10-Learning Reductions are Reductionist
2 0.14791486 14 hunch net-2005-02-07-The State of the Reduction
Introduction: What? Reductions are machines which turn solvers for one problem into solvers for another problem. Why? Reductions are useful for several reasons. Laziness. Reducing a problem to classification makes at least 10 learning algorithms available to solve a problem. Inventing 10 learning algorithms is quite a bit of work. Similarly, programming a reduction is often trivial, while programming a learning algorithm is a great deal of work. Crystallization. The problems we often want to solve in learning are worst-case-impossible, but average case feasible. By reducing all problems onto one or a few primitives, we can fine-tune these primitives to perform well on real-world problems with greater precision due to the greater number of problems to validate on. Theoretical Organization. By studying what reductions are easy vs. hard vs. impossible, we can learn which problems are roughly equivalent in difficulty and which are much harder. What we know now. Typesafe r
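As a concrete instance of "machines which turn solvers into solvers", here is a hedged sketch of a classical reduction in this spirit: importance-weighted classification reduced to ordinary classification by rejection sampling. The function name and the use of numpy are illustrative assumptions, not details from the post.

```python
# Sketch: reduce importance-weighted classification to unweighted
# classification by rejection sampling -- accept example i with
# probability w_i / max(w) -- then hand the sample to any standard
# classification algorithm.
import numpy as np

def rejection_sample(X, y, w, seed=0):
    rng = np.random.default_rng(seed)
    keep = rng.random(len(w)) < w / w.max()
    return X[keep], y[keep]

# In expectation, unweighted error on the accepted sample is
# proportional to weighted error on the original data, so the base
# learner's guarantees transfer through the reduction.
```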
3 0.14000134 161 hunch net-2006-03-05-“Structural” Learning
Introduction: Fernando Pereira pointed out Ando and Zhang’s paper on “structural” learning. Structural learning is multitask learning on subproblems created from unlabeled data. The basic idea is to take a look at the unlabeled data and create many supervised problems. On text data, which they test on, these subproblems might be of the form “Given surrounding words predict the middle word”. The hope here is that successfully predicting on these subproblems is relevant to the prediction of your core problem. In the long run, the precise mechanism used (essentially, linear predictors with parameters tied by a common matrix) and the precise problems formed may not be critical. What seems critical is that the hope is realized: the technique provides a significant edge in practice. Some basic questions about this approach are: Are there effective automated mechanisms for creating the subproblems? Is it necessary to use a shared representation?
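A tiny sketch of the subproblem-creation step described here, assuming whitespace-tokenized text; the function name make_auxiliary_problems is hypothetical.

```python
# Sketch: turn unlabeled text into supervised subproblems of the form
# "given surrounding words, predict the middle word".
def make_auxiliary_problems(tokens, window=2):
    examples = []
    for i in range(window, len(tokens) - window):
        context = tokens[i - window:i] + tokens[i + 1:i + window + 1]
        examples.append((tuple(context), tokens[i]))  # (features, label)
    return examples

text = "the reductionist approach to problem solving pays off".split()
for context, target in make_auxiliary_problems(text)[:3]:
    print(context, "->", target)
```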
4 0.13709667 347 hunch net-2009-03-26-Machine Learning is too easy
Introduction: One of the remarkable things about machine learning is how diverse it is. The viewpoints of Bayesian learning, reinforcement learning, graphical models, supervised learning, unsupervised learning, genetic programming, etc… share little enough overlap that many people can and do make their careers within one without touching, or even necessarily understanding the others. There are two fundamental reasons why this is possible. For many problems, many approaches work in the sense that they do something useful. This is true empirically, where for many problems we can observe that many different approaches yield better performance than any constant predictor. It’s also true in theory, where we know that for any set of predictors representable in a finite amount of RAM, minimizing training error over the set of predictors does something nontrivial when there are a sufficient number of examples. There is nothing like a unifying problem defining the field. In many other areas there
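The "finite amount of RAM" claim has a standard quantitative form: for a finite predictor set H and m examples, with probability at least 1 - delta every h in H has true error at most its training error plus sqrt(ln(|H|/delta) / (2m)). Below is a minimal sketch of empirical risk minimization over an explicit finite set, with that bound computed alongside; the helper names and toy data are hypothetical.

```python
# Sketch: empirical risk minimization over an explicitly finite
# predictor set, plus the Hoeffding/Occam deviation bound.
import math

def erm(predictors, data, delta=0.05):
    # data: list of (x, y); predictors: list of callables x -> y
    def train_error(h):
        return sum(h(x) != y for x, y in data) / len(data)
    best = min(predictors, key=train_error)
    m, H = len(data), len(predictors)
    bound = train_error(best) + math.sqrt(math.log(H / delta) / (2 * m))
    return best, bound  # with prob >= 1 - delta, true error <= bound

# Usage sketch: threshold predictors on a 1-d feature.
data = [(x / 10, int(x > 6)) for x in range(10)]
predictors = [lambda x, t=t: int(x > t) for t in (0.2, 0.5, 0.65, 0.9)]
print(erm(predictors, data))
```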
5 0.12580627 235 hunch net-2007-03-03-All Models of Learning have Flaws
Introduction: Attempts to abstract and study machine learning are within some given framework or mathematical model. It turns out that all of these models are significantly flawed for the purpose of studying machine learning. I’ve created a table (below) outlining the major flaws in some common models of machine learning. The point here is not simply “woe unto us”. There are several implications which seem important. The multitude of models is a point of continuing confusion. It is common for people to learn about machine learning within one framework which often becomes their “home framework” through which they attempt to filter all machine learning. (Have you met people who can only think in terms of kernels? Only via Bayes Law? Only via PAC Learning?) Explicitly understanding the existence of these other frameworks can help resolve the confusion. This is particularly important when reviewing and particularly important for students. Algorithms which conform to multiple approaches c
6 0.11089843 95 hunch net-2005-07-14-What Learning Theory might do
7 0.10871885 370 hunch net-2009-09-18-Necessary and Sufficient Research
8 0.10638119 228 hunch net-2007-01-15-The Machine Learning Department
9 0.10520246 237 hunch net-2007-04-02-Contextual Scaling
10 0.098649532 28 hunch net-2005-02-25-Problem: Online Learning
11 0.097425327 112 hunch net-2005-09-14-The Predictionist Viewpoint
12 0.09444993 168 hunch net-2006-04-02-Mad (Neuro)science
13 0.090483457 22 hunch net-2005-02-18-What it means to do research.
14 0.089942671 352 hunch net-2009-05-06-Machine Learning to AI
15 0.088082105 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
16 0.08712896 165 hunch net-2006-03-23-The Approximation Argument
17 0.08498238 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability
18 0.081624568 351 hunch net-2009-05-02-Wielding a New Abstraction
19 0.080439709 103 hunch net-2005-08-18-SVM Adaptability
20 0.080231339 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
topicId topicWeight
[(0, 0.176), (1, 0.092), (2, -0.037), (3, 0.054), (4, -0.011), (5, -0.068), (6, 0.06), (7, 0.068), (8, 0.057), (9, -0.014), (10, -0.057), (11, -0.091), (12, -0.016), (13, 0.106), (14, -0.009), (15, -0.062), (16, 0.069), (17, -0.08), (18, 0.02), (19, -0.008), (20, 0.012), (21, 0.018), (22, -0.002), (23, 0.011), (24, 0.078), (25, -0.007), (26, 0.038), (27, 0.071), (28, -0.007), (29, 0.067), (30, 0.075), (31, 0.07), (32, 0.009), (33, -0.01), (34, -0.039), (35, 0.032), (36, -0.069), (37, 0.011), (38, -0.011), (39, 0.051), (40, 0.027), (41, -0.015), (42, 0.017), (43, 0.077), (44, 0.094), (45, -0.016), (46, -0.091), (47, -0.047), (48, 0.036), (49, -0.062)]
simIndex simValue blogId blogTitle
same-blog 1 0.96983749 68 hunch net-2005-05-10-Learning Reductions are Reductionist
2 0.68518817 82 hunch net-2005-06-17-Reopening RL->Classification
Introduction: In research, it’s often the case that solving a problem helps you realize that it wasn’t the right problem to solve. This is the case for the “reduce RL to classification” problem with the solution hinted at here and turned into a paper here. The essential difficulty is that the method of stating and analyzing reductions ends up being nonalgorithmic (unlike previous reductions) unless you work with learning from teleoperated robots as Greg Grudic does. The difficulty here is due to the reduction being dependent on the optimal policy (which a human teleoperator might simulate, but which is otherwise unavailable). So, this problem is “open” again with the caveat that this time we want a more algorithmic solution. Whether or not this is feasible at all is still unclear and evidence in either direction would greatly interest me. A positive answer might have many practical implications in the long run.
3 0.65708256 90 hunch net-2005-07-07-The Limits of Learning Theory
Introduction: Suppose we had an infinitely powerful mathematician sitting in a room and proving theorems about learning. Could he solve machine learning? The answer is “no”. This answer is both obvious and sometimes underappreciated. There are several ways to conclude that some bias is necessary in order to successfully learn. For example, suppose we are trying to solve classification. At prediction time, we observe some features X and want to make a prediction of either 0 or 1. Bias is what makes us prefer one answer over the other based on past experience. In order to learn we must: Have a bias. Treating 0 as exactly as likely as 1 is useless. Have the “right” bias. Predicting 1 when the answer is 0 is also not helpful. The implication of “have a bias” is that we cannot design effective learning algorithms with “a uniform prior over all possibilities”. The implication of “have the ‘right’ bias” is that our mathematician fails since “right” is defined wi
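The "have a bias" point can be checked by brute force on a tiny domain: among all labelings consistent with the training data, exactly half predict 0 and half predict 1 on any unseen point, so an unbiased learner is at chance. A minimal sketch; the domain and training set are arbitrary illustrations.

```python
# Sketch: with a uniform prior over all boolean functions on a finite
# domain, training data says nothing about unseen points.
from itertools import product

domain = list(range(4))
train = {0: 1, 1: 0}          # observed labels on two points
test_point = 3

# All functions f: domain -> {0, 1} consistent with the training data.
consistent = [f for f in product([0, 1], repeat=len(domain))
              if all(f[x] == y for x, y in train.items())]
votes = sum(f[test_point] for f in consistent)
print(votes, "of", len(consistent), "consistent functions predict 1")
# -> exactly half: no prediction is possible without a bias.
```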
4 0.64729244 347 hunch net-2009-03-26-Machine Learning is too easy
5 0.64202547 215 hunch net-2006-10-22-Exemplar programming
Introduction: There are many different abstractions for problem definition and solution. Here are a few examples: Functional programming: a set of functions is defined. The composed execution of these functions yields the solution. Linear programming: a set of constraints and a linear objective function are defined. An LP solver finds the constrained optimum. Quadratic programming: Like linear programming, but the language is a little more flexible (and the solution slower). Convex programming: like quadratic programming, but the language is more flexible (and the solutions even slower). Dynamic programming: a recursive definition of the problem is defined and then solved efficiently via caching tricks. SAT programming: A problem is specified as the satisfiability of a conjunction of disjunctions of boolean variables. A general engine attempts to find a good satisfying assignment. For example Kautz’s blackbox planner. These abstractions have different tradeoffs betw
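Of the abstractions listed, dynamic programming is the easiest to show in a few lines: state the problem recursively and let a cache make the recursion efficient. A sketch using Python's functools.lru_cache; the edit-distance example is illustrative, not from the post.

```python
# Sketch of the dynamic-programming abstraction: a recursive problem
# definition made efficient by caching.
from functools import lru_cache

def edit_distance(a, b):
    @lru_cache(maxsize=None)
    def d(i, j):
        if i == 0: return j
        if j == 0: return i
        return min(d(i - 1, j) + 1,                           # delete
                   d(i, j - 1) + 1,                           # insert
                   d(i - 1, j - 1) + (a[i - 1] != b[j - 1]))  # substitute
    return d(len(a), len(b))

print(edit_distance("reduction", "rendering"))  # -> 6
```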
6 0.63291025 168 hunch net-2006-04-02-Mad (Neuro)science
7 0.6233685 370 hunch net-2009-09-18-Necessary and Sufficient Research
8 0.6224758 112 hunch net-2005-09-14-The Predictionist Viewpoint
9 0.60377431 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
10 0.60268873 237 hunch net-2007-04-02-Contextual Scaling
11 0.59868485 161 hunch net-2006-03-05-“Structural” Learning
12 0.59243202 114 hunch net-2005-09-20-Workshop Proposal: Atomic Learning
13 0.57669669 100 hunch net-2005-08-04-Why Reinforcement Learning is Important
14 0.57195944 95 hunch net-2005-07-14-What Learning Theory might do
15 0.56668991 153 hunch net-2006-02-02-Introspectionism as a Disease
16 0.56598461 358 hunch net-2009-06-01-Multitask Poisoning
17 0.56315923 149 hunch net-2006-01-18-Is Multitask Learning Black-Boxable?
18 0.55887896 351 hunch net-2009-05-02-Wielding a New Abstraction
19 0.55085856 228 hunch net-2007-01-15-The Machine Learning Department
20 0.55033076 235 hunch net-2007-03-03-All Models of Learning have Flaws
topicId topicWeight
[(27, 0.172), (38, 0.088), (53, 0.102), (55, 0.075), (80, 0.327), (94, 0.129)]
simIndex simValue blogId blogTitle
1 0.84796357 222 hunch net-2006-12-05-Recruitment Conferences
Introduction: One of the subsidiary roles of conferences is recruitment. NIPS is optimally placed in time for this because it falls right before the major recruitment season. I personally found job hunting embarrassing, and was relatively inept at it. I expect this is true of many people, because it is not something done often. The basic rule is: make the plausible hirers aware of your interest. Any corporate sponsor is a “plausible”, regardless of whether or not there is a booth. CRA and the acm job center are other reasonable sources. There are substantial differences between the different possibilities. Putting some effort into understanding the distinctions is a good idea, although you should always remember where the other person is coming from.
same-blog 2 0.84758162 68 hunch net-2005-05-10-Learning Reductions are Reductionist
3 0.79952407 146 hunch net-2006-01-06-MLTV
Introduction: As part of a PASCAL project, the Slovenians have been filming various machine learning events and placing them on the web here. This includes, for example, the Chicago 2005 Machine Learning Summer School as well as a number of other summer schools, workshops, and conferences. There are some significant caveats here—for example, I can’t access it from Linux. Based upon the webserver logs, I expect that is a problem for most people—Computer scientists are particularly nonstandard in their choice of computing platform. Nevertheless, the core idea here is excellent and details of compatibility can be fixed later. With modern technology toys, there is no fundamental reason why the process of announcing new work at a conference should happen only once and only for the people who could make it to that room in that conference. The problems solved include: The multitrack vs. single-track debate. (“Sometimes the single track doesn’t interest me” vs. “When it’s multitrack I mis
4 0.74801844 34 hunch net-2005-03-02-Prior, “Prior” and Bias
Introduction: Many different ways of reasoning about learning exist, and many of these suggest that some method of saying “I prefer this predictor to that predictor” is useful and necessary. Examples include Bayesian reasoning, prediction bounds, and online learning. One difficulty which arises is that the manner and meaning of saying “I prefer this predictor to that predictor” differs. Prior (Bayesian) A prior is a probability distribution over a set of distributions which expresses a belief in the probability that some distribution is the distribution generating the data. “Prior” (Prediction bounds & online learning) The “prior” is a measure over a set of classifiers which expresses the degree to which you hope the classifier will predict well. Bias (Regularization, Early termination of neural network training, etc…) The bias is some (often implicitly specified by an algorithm) way of preferring one predictor to another. This only scratches the surface—there are yet more subt
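The third notion, bias via regularization, has a compact illustration: ridge regression prefers small-norm predictors through the closed form w = (X^T X + lambda I)^{-1} X^T y. A minimal numpy sketch, with lambda and the toy data chosen arbitrarily.

```python
# Sketch: regularization as a preference among predictors.
# Ridge regression biases toward small-norm weight vectors via lam.
import numpy as np

def ridge_fit(X, y, lam=1.0):
    d = X.shape[1]
    # Closed form: w = (X^T X + lam * I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=50)
for lam in (0.0, 1.0, 100.0):
    # The learned weight norm shrinks as lam grows: a stronger bias.
    print(lam, np.linalg.norm(ridge_fit(X, y, lam)))
```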
5 0.70201337 141 hunch net-2005-12-17-Workshops as Franchise Conferences
Introduction: Founding a successful new conference is extraordinarily difficult. As a conference founder, you must manage to attract a significant number of good papers—enough to entice the participants into participating next year and (generally) to grow the conference. For someone choosing to participate in a new conference, there is a very significant decision to make: do you send a paper to some new conference with no guarantee that the conference will work out? Or do you send it to another (possibly less related) conference that you are sure will work? The conference founding problem is a joint agreement problem with a very significant barrier. Workshops are a way around this problem, and workshops attached to conferences are a particularly effective means for this. A workshop at a conference is sure to have people available to speak and attend and is sure to have a large audience available. Presenting work at a workshop is not generally exclusive: it can also be presented at a confe
6 0.5973779 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
7 0.5855794 95 hunch net-2005-07-14-What Learning Theory might do
8 0.58405828 163 hunch net-2006-03-12-Online learning or online preservation of learning?
9 0.58277667 131 hunch net-2005-11-16-The Everything Ensemble Edge
10 0.5820955 297 hunch net-2008-04-22-Taking the next step
11 0.58183569 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
12 0.58090192 19 hunch net-2005-02-14-Clever Methods of Overfitting
13 0.58014798 306 hunch net-2008-07-02-Proprietary Data in Academic Research?
14 0.57924664 237 hunch net-2007-04-02-Contextual Scaling
15 0.57784051 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making
16 0.57526141 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
17 0.57475758 253 hunch net-2007-07-06-Idempotent-capable Predictors
18 0.57458568 143 hunch net-2005-12-27-Automated Labeling
19 0.57356364 233 hunch net-2007-02-16-The Forgetting
20 0.57233149 98 hunch net-2005-07-27-Not goal metrics