knowledge-graph by maker-knowledge-mining

272 hunch net-2007-11-14-BellKor wins Netflix


meta info for this blog

Source: html

Introduction: … but only the little prize. The BellKor team focused on integrating predictions from many different methods. The base methods consist of: (1) Nearest Neighbor Methods, (2) Matrix Factorization Methods (asymmetric and symmetric), (3) Linear Regression on various feature spaces, and (4) Restricted Boltzmann Machines. The final predictor was an ensemble (as was reasonable to expect), although it’s a little bit more complicated than just a weighted average—it’s essentially a customized learning algorithm. Base approaches (1)-(3) seem like relatively well-known approaches (although I haven’t seen the asymmetric factorization variant before). RBMs are the new approach. The writeup is pretty clear for more details. The contestants are close to reaching the big prize, but the last 1.5% is probably at least as hard as what’s been done. A few new structurally different methods for making predictions may need to be discovered and added into the mixture. In other words, research may be required.
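The blending step described above can be sketched as simple stacking: learn weights for the base predictors by least squares on their predictions. This is a minimal sketch with synthetic data and made-up base models; the actual BellKor blend was a more elaborate, customized learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: three hypothetical base predictors' ratings for 100 user-movie
# pairs, plus the true ratings (data and noise levels are illustrative only).
true_ratings = rng.uniform(1, 5, size=100)
base_preds = np.column_stack([
    true_ratings + rng.normal(0, 0.9, 100),  # e.g. a nearest-neighbor model
    true_ratings + rng.normal(0, 0.8, 100),  # e.g. a matrix factorization model
    true_ratings + rng.normal(0, 1.0, 100),  # e.g. an RBM
])

# The simplest stacking scheme: solve w = argmin ||P w - y||^2, then blend.
# (Evaluated in-sample here for brevity; a real blend fits on held-out data.)
w, *_ = np.linalg.lstsq(base_preds, true_ratings, rcond=None)
blend = base_preds @ w

def rmse(pred, y):
    return float(np.sqrt(np.mean((pred - y) ** 2)))

# In-sample, the least-squares blend is at least as good as any single base
# model, since each base model is itself a feasible weight vector.
print(rmse(blend, true_ratings))
```

Because each base predictor corresponds to a feasible weight vector, the fitted blend can never do worse in-sample than the best single model, which is one reason ensembles were "reasonable to expect" here.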


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 The BellKor team focused on integrating predictions from many different methods. [sent-2, score-0.63]

2 Base approaches (1)-(3) seem like relatively well-known approaches (although I haven’t seen the asymmetric factorization variant before). [sent-4, score-1.138]

3 The writeup is pretty clear for more details. [sent-6, score-0.319]

4 The contestants are close to reaching the big prize, but the last 1.5% is probably at least as hard as what’s been done. [sent-7, score-0.393]

5 A few new structurally different methods for making predictions may need to be discovered and added into the mixture. [sent-9, score-0.992]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('asymmetric', 0.316), ('factorization', 0.298), ('methods', 0.265), ('base', 0.205), ('predictions', 0.171), ('customized', 0.17), ('structurally', 0.17), ('consist', 0.17), ('rbms', 0.17), ('symmetric', 0.17), ('writeup', 0.17), ('contestants', 0.158), ('bellkor', 0.158), ('team', 0.142), ('reaching', 0.136), ('approaches', 0.134), ('discovered', 0.132), ('complicated', 0.132), ('integrating', 0.128), ('spaces', 0.124), ('restricted', 0.124), ('little', 0.119), ('ensemble', 0.118), ('matrix', 0.118), ('although', 0.117), ('variant', 0.115), ('prize', 0.113), ('neighbor', 0.11), ('nearest', 0.104), ('focused', 0.101), ('weighted', 0.099), ('machines', 0.099), ('close', 0.099), ('added', 0.099), ('words', 0.098), ('regression', 0.092), ('final', 0.092), ('different', 0.088), ('average', 0.086), ('haven', 0.086), ('predictor', 0.081), ('probably', 0.079), ('required', 0.078), ('pretty', 0.078), ('seen', 0.073), ('feature', 0.073), ('linear', 0.073), ('clear', 0.071), ('relatively', 0.068), ('may', 0.067)]
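The word weights above come from a tf-idf model. As a rough illustration of how such weights and document similarities are computed, here is a minimal sketch over a toy three-document corpus; the tokenizer, corpus, and exact weighting scheme are simplified assumptions, not the pipeline that generated this page.

```python
import math
from collections import Counter

# Toy corpus standing in for blog-post texts (contents are illustrative).
docs = [
    "bellkor team ensemble of base methods matrix factorization",
    "netflix prize ensemble methods win",
    "icml paper submissions deadline",
]

def tfidf(doc_tokens, all_docs):
    # Weight each word by term frequency times log inverse document frequency.
    n = len(all_docs)
    tf = Counter(doc_tokens)
    return {
        w: (c / len(doc_tokens)) *
           math.log(n / sum(1 for d in all_docs if w in d))
        for w, c in tf.items()
    }

tokenized = [d.split() for d in docs]
vecs = [tfidf(t, tokenized) for t in tokenized]

def cosine(a, b):
    # Cosine similarity over sparse word-weight dicts.
    dot = sum(wt * b.get(w, 0.0) for w, wt in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Documents 0 and 1 share "ensemble" and "methods", so they score higher
# with each other than either does with document 2.
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

The "similar blogs" scores below are presumably cosine-style similarities of this kind, computed over much larger vocabularies.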

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 272 hunch net-2007-11-14-BellKor wins Netflix


2 0.14266184 362 hunch net-2009-06-26-Netflix nearly done

Introduction: A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.

3 0.12117239 456 hunch net-2012-02-24-ICML+50%

Introduction: The ICML paper deadline has passed. Joelle and I were surprised to see the number of submissions jump from last year by about 50% to around 900 submissions. A tiny portion of these are immediate rejects(*), so this is a much larger set of papers than expected. The number of workshop submissions also doubled compared to last year, so ICML may grow significantly this year, if we can manage to handle the load well. The prospect of making 900 good decisions is fundamentally daunting, and success will rely heavily on the program committee and area chairs at this point. For those who want to rubberneck a bit more, here’s a breakdown of submissions by primary topic of submitted papers: 66 Reinforcement Learning 52 Supervised Learning 51 Clustering 46 Kernel Methods 40 Optimization Algorithms 39 Feature Selection and Dimensionality Reduction 33 Learning Theory 33 Graphical Models 33 Applications 29 Probabilistic Models 29 NN & Deep Learning 26 Transfer and Multi-Ta

4 0.115925 336 hunch net-2009-01-19-Netflix prize within epsilon

Introduction: The competitors for the Netflix Prize are tantalizingly close to winning the million dollar prize. This year, BellKor and Commendo Research sent a combined solution that won the progress prize. Reading the writeups is instructive. Several aspects of solutions are taken for granted including stochastic gradient descent, ensemble prediction, and targeting residuals (a form of boosting). Relative to last year, it appears that many approaches have added parameterizations, especially for the purpose of modeling through time. The big question is: will they make the big prize? At this point, the level of complexity in entering the competition is prohibitive, so perhaps only the existing competitors will continue to try. (This equation might change drastically if the teams open source their existing solutions, including parameter settings.) One fear is that the progress is asymptoting on the wrong side of the 10% threshold. In the first year, the teams progressed through

5 0.10538612 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

Introduction: One conventional wisdom is that learning algorithms with linear representations are sufficient to solve natural learning problems. This conventional wisdom appears unsupported by empirical evidence as far as I can tell. In nearly all vision, language, robotics, and speech applications I know where machine learning is effectively applied, the approach involves either a linear representation on hand crafted features capturing substantial nonlinearities or learning directly on nonlinear representations. There are a few exceptions to this—for example, if the problem of interest to you is predicting the next word given previous words, n-gram methods have been shown effective. Viewed the right way, n-gram methods are essentially linear predictors on an enormous sparse feature space, learned from an enormous number of examples. Hal’s post here describes some of this in more detail. In contrast, if you go to a machine learning conference, a large number of the new algorithms are v

6 0.091837175 131 hunch net-2005-11-16-The Everything Ensemble Edge

7 0.09008152 466 hunch net-2012-06-05-ICML acceptance statistics

8 0.088252068 371 hunch net-2009-09-21-Netflix finishes (and starts)

9 0.07999593 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class

10 0.079157516 430 hunch net-2011-04-11-The Heritage Health Prize

11 0.077366695 235 hunch net-2007-03-03-All Models of Learning have Flaws

12 0.070147224 236 hunch net-2007-03-15-Alternative Machine Learning Reductions Definitions

13 0.068140998 41 hunch net-2005-03-15-The State of Tight Bounds

14 0.065802164 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem

15 0.064413182 95 hunch net-2005-07-14-What Learning Theory might do

16 0.062130004 348 hunch net-2009-04-02-Asymmophobia

17 0.060227495 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

18 0.059326582 183 hunch net-2006-06-14-Explorations of Exploration

19 0.058278535 22 hunch net-2005-02-18-What it means to do research.

20 0.058074109 63 hunch net-2005-04-27-DARPA project: LAGR


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.137), (1, 0.041), (2, -0.026), (3, -0.003), (4, 0.007), (5, -0.004), (6, -0.062), (7, 0.017), (8, 0.009), (9, -0.036), (10, -0.068), (11, 0.057), (12, -0.015), (13, -0.025), (14, -0.008), (15, 0.01), (16, 0.029), (17, -0.002), (18, -0.005), (19, -0.017), (20, -0.051), (21, -0.035), (22, 0.029), (23, -0.016), (24, -0.105), (25, -0.079), (26, -0.043), (27, -0.021), (28, -0.016), (29, 0.062), (30, 0.014), (31, -0.066), (32, -0.027), (33, -0.074), (34, -0.043), (35, 0.15), (36, -0.026), (37, -0.029), (38, -0.081), (39, 0.002), (40, -0.019), (41, -0.001), (42, -0.01), (43, 0.02), (44, -0.181), (45, 0.031), (46, -0.092), (47, 0.045), (48, 0.02), (49, -0.077)]
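The 50-dimensional vector above is this post's representation under an LSI (latent semantic indexing) model: documents are projected onto the leading singular directions of a term-document matrix and compared in that reduced space. Here is a minimal sketch with a random toy matrix; the dimensions and data are illustrative, not the actual model behind these weights.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy term-document count matrix: 30 terms x 6 documents.
counts = rng.poisson(1.0, size=(30, 6)).astype(float)

# Truncated SVD: keep the top-k singular directions as latent "topics".
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 3
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T  # one k-dim topic vector per document

def cos(a, b):
    # Cosine similarity between two reduced document vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cos(doc_topics[0], doc_topics[1]))
```

Comparing documents in the k-dimensional space rather than the raw word space is what lets LSI score documents as similar even when they share few exact words.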

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97094768 272 hunch net-2007-11-14-BellKor wins Netflix


2 0.57678884 430 hunch net-2011-04-11-The Heritage Health Prize

Introduction: The Heritage Health Prize is potentially the largest prediction prize yet at $3M, which is sure to get many people interested. Several elements of the competition may be worth discussing. The most straightforward way for HPN to deploy this predictor is in determining who to cover with insurance. This might easily cover the costs of running the contest itself, but the value to the health system as a whole is minimal, as people not covered still exist. While HPN itself is a provider network, they have active relationships with a number of insurance companies, and the right to resell any entrant. It’s worth keeping in mind that the research and development may nevertheless end up being useful in the longer term, especially as entrants also keep the right to their code. The judging metric is something I haven’t seen previously. If a patient has probability 0.5 of being in the hospital 0 days and probability 0.5 of being in the hospital ~53.6 days, the optimal prediction in e

3 0.57144523 336 hunch net-2009-01-19-Netflix prize within epsilon


4 0.55930525 362 hunch net-2009-06-26-Netflix nearly done


5 0.54691404 456 hunch net-2012-02-24-ICML+50%


6 0.50438994 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

7 0.49314421 466 hunch net-2012-06-05-ICML acceptance statistics

8 0.48116669 371 hunch net-2009-09-21-Netflix finishes (and starts)

9 0.46233007 314 hunch net-2008-08-24-Mass Customized Medicine in the Future?

10 0.45382622 217 hunch net-2006-11-06-Data Linkage Problems

11 0.43508524 348 hunch net-2009-04-02-Asymmophobia

12 0.40820867 270 hunch net-2007-11-02-The Machine Learning Award goes to …

13 0.40523496 2 hunch net-2005-01-24-Holy grails of machine learning?

14 0.40285066 253 hunch net-2007-07-06-Idempotent-capable Predictors

15 0.40197337 327 hunch net-2008-11-16-Observations on Linearity for Reductions to Regression

16 0.39534792 308 hunch net-2008-07-06-To Dual or Not

17 0.39001092 63 hunch net-2005-04-27-DARPA project: LAGR

18 0.38771081 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models

19 0.38117924 131 hunch net-2005-11-16-The Everything Ensemble Edge

20 0.38032812 193 hunch net-2006-07-09-The Stock Prediction Machine Learning Problem


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.162), (53, 0.09), (55, 0.025), (92, 0.408), (94, 0.129), (95, 0.072)]
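The (topicId, topicWeight) pairs above are this post's LDA topic distribution; similar posts are those whose high-weight topics overlap. A minimal sketch of comparing two such sparse distributions follows, reusing the weights listed above for the first post; the second post's weights are hypothetical.

```python
import math

# Sparse topic distributions: topicId -> topicWeight.
post_a = {27: 0.162, 53: 0.09, 55: 0.025, 92: 0.408, 94: 0.129, 95: 0.072}
post_b = {27: 0.10, 92: 0.55, 94: 0.20}  # hypothetical second post

def topic_sim(p, q):
    # Cosine similarity over sparse topic vectors; posts sharing few
    # high-weight topics score near zero.
    dot = sum(w * q.get(t, 0.0) for t, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q)

print(round(topic_sim(post_a, post_b), 3))
```

Here both posts put most of their mass on topic 92, so they would rank as similar; a post concentrated on disjoint topics would score near zero.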

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.92754364 272 hunch net-2007-11-14-BellKor wins Netflix


2 0.90972078 362 hunch net-2009-06-26-Netflix nearly done


3 0.83542961 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept

Introduction: Last year about this time, we received a conditional accept for the searn paper, which asked us to reference a paper that was not reasonable to cite because there was strictly more relevant work by the same authors that we already cited. We wrote a response explaining this, and didn’t cite it in the final draft, giving the SPC an excuse to reject the paper, leading to unhappiness for all. Later, Sanjoy Dasgupta suggested that an alternative was to talk to the PC chair instead, as soon as you see that a conditional accept is unreasonable. William Cohen and I spoke about this by email, the relevant bit of which is: If an SPC asks for a revision that is inappropriate, the correct action is to contact the chairs as soon as the decision is made, clearly explaining what the problem is, so we can decide whether or not to over-rule the SPC. As you say, this is extra work for us chairs, but that’s part of the job, and we’re willing to do that sort of work to improve the ov

4 0.75375128 203 hunch net-2006-08-18-Report of MLSS 2006 Taipei

Introduction: The 2006 Machine Learning Summer School in Taipei, Taiwan ended on August 4, 2006. It has been a very exciting two weeks for a record crowd of 245 participants (including speakers and organizers) from 18 countries. We had a lineup of speakers that is hard to match up for other similar events (see our WIKI for more information). With this lineup, it is difficult for us as organizers to screw it up too bad. Also, since we have pretty good infrastructure for international meetings and experienced staff at NTUST and Academia Sinica, plus the reputation established by previous MLSS series, it was relatively easy for us to attract registrations and simply enjoyed this two-week long party of machine learning. In the end of MLSS we distributed a survey form for participants to fill in. I will report what we found from this survey, together with the registration data and word-of-mouth from participants. The first question is designed to find out how our participants learned about MLSS

5 0.5452072 293 hunch net-2008-03-23-Interactive Machine Learning

Introduction: A new direction of research seems to be arising in machine learning: Interactive Machine Learning. This isn’t a familiar term, although it does include some familiar subjects. What is Interactive Machine Learning? The fundamental requirement is (a) learning algorithms which interact with the world and (b) learn. For our purposes, let’s define learning as efficiently competing with a large set of possible predictors. Examples include: Online learning against an adversary ( Avrim’s Notes ). The interaction is almost trivial: the learning algorithm makes a prediction and then receives feedback. The learning is choosing based upon the advice of many experts. Active Learning . In active learning, the interaction is choosing which examples to label, and the learning is choosing from amongst a large set of hypotheses. Contextual Bandits . The interaction is choosing one of several actions and learning only the value of the chosen action (weaker than active learning

6 0.52976239 437 hunch net-2011-07-10-ICML 2011 and the future

7 0.51316011 141 hunch net-2005-12-17-Workshops as Franchise Conferences

8 0.5036279 75 hunch net-2005-05-28-Running A Machine Learning Summer School

9 0.49541095 463 hunch net-2012-05-02-ICML: Behind the Scenes

10 0.48886532 461 hunch net-2012-04-09-ICML author feedback is open

11 0.48368156 370 hunch net-2009-09-18-Necessary and Sufficient Research

12 0.47976112 371 hunch net-2009-09-21-Netflix finishes (and starts)

13 0.47745726 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

14 0.47699389 207 hunch net-2006-09-12-Incentive Compatible Reviewing

15 0.47591788 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

16 0.4758471 80 hunch net-2005-06-10-Workshops are not Conferences

17 0.46986017 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

18 0.46829593 79 hunch net-2005-06-08-Question: “When is the right time to insert the loss function?”

19 0.46442604 229 hunch net-2007-01-26-Parallel Machine Learning Problems

20 0.46436593 57 hunch net-2005-04-16-Which Assumptions are Reasonable?