hunch_net hunch_net-2009 hunch_net-2009-362 knowledge-graph by maker-knowledge-mining

362 hunch net-2009-06-26-Netflix nearly done


meta info for this blog

Source: html

Introduction: A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.
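
The performance edge of ensemble methods is easy to illustrate with a minimal sketch of linear blending (stacking): fit least-squares weights over base-model predictions on a held-out set. The data below is synthetic and the procedure is only an illustration, not the blending actually used by the Netflix teams.

```python
# Minimal sketch of linear blending (stacking), assuming we already have
# per-model predictions on a held-out probe set. Synthetic data; this is
# only an illustration of the ensemble edge, not the teams' actual method.
import numpy as np

rng = np.random.default_rng(0)

true_ratings = rng.uniform(1, 5, size=10_000)   # hypothetical true ratings
base_preds = np.column_stack([                  # three imperfect base models
    true_ratings + rng.normal(0, 0.95, size=10_000),
    true_ratings + rng.normal(0, 1.00, size=10_000),
    true_ratings + rng.normal(0, 1.05, size=10_000),
])

# Fit blend weights by least squares on the probe set.
weights, *_ = np.linalg.lstsq(base_preds, true_ratings, rcond=None)
blend = base_preds @ weights

rmse = lambda p: float(np.sqrt(np.mean((p - true_ratings) ** 2)))
print("best single-model RMSE:", min(rmse(base_preds[:, i]) for i in range(3)))
print("blended RMSE:          ", rmse(blend))
```

Because the base models' errors are only partially correlated, the blend beats every individual model, which is the effect the contest leaderboard made so visible.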


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. [sent-1, score-1.592]

2 This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. [sent-2, score-0.504]

3 This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. [sent-3, score-1.75]

4 Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods. [sent-4, score-1.658]


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('team', 0.379), ('ensemble', 0.314), ('netflix', 0.294), ('final', 0.246), ('lesson', 0.227), ('qualifying', 0.227), ('formidable', 0.211), ('test', 0.191), ('yehuda', 0.19), ('performance', 0.18), ('biggest', 0.176), ('regardless', 0.176), ('contest', 0.157), ('outcome', 0.153), ('achieved', 0.15), ('days', 0.147), ('edge', 0.147), ('attended', 0.142), ('presentation', 0.125), ('together', 0.123), ('set', 0.12), ('public', 0.116), ('kdd', 0.116), ('chance', 0.106), ('ml', 0.103), ('sure', 0.1), ('result', 0.095), ('methods', 0.088), ('expect', 0.087), ('come', 0.082), ('isn', 0.081), ('small', 0.078), ('another', 0.068), ('best', 0.065), ('quite', 0.063), ('time', 0.046), ('one', 0.026)]
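
Similarity scores of this kind can be produced by representing each post as a tfidf vector and ranking other posts by cosine similarity. The actual pipeline behind this page is not specified, so the sketch below (sklearn defaults, abbreviated post text) is an assumption.

```python
# Minimal sketch of tfidf-based post similarity; tokenization and weighting
# here are assumptions, not the pipeline that generated this page.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

posts = {  # abbreviated stand-ins for the post texts
    "362 Netflix nearly done": "A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team.",
    "371 Netflix finishes (and starts)": "I attended the Netflix prize ceremony this morning.",
    "211 $1M Netflix prediction contest": "Netflix is running a contest to improve recommender prediction systems.",
}

vectors = TfidfVectorizer(stop_words="english").fit_transform(posts.values())

# Similarity of post 362 (row 0) to every post; the self-similarity is 1.0,
# matching the same-blog entry below.
scores = cosine_similarity(vectors[0], vectors).ravel()
for title, score in sorted(zip(posts, scores), key=lambda x: -x[1]):
    print(f"{score:.4f}  {title}")
```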

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 362 hunch net-2009-06-26-Netflix nearly done

Introduction: A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.

2 0.27364221 371 hunch net-2009-09-21-Netflix finishes (and starts)

Introduction: I attended the Netflix prize ceremony this morning. The press conference part is covered fine elsewhere, with the basic outcome being that BellKor’s Pragmatic Chaos won over The Ensemble by 15-20 minutes, because they were tied in performance on the ultimate holdout set. I’m sure the individual participants will have many chances to speak about the solution. One of these is Bell at the NYAS ML symposium on Nov. 6. Several additional details may interest ML people. The degree of overfitting exhibited by the difference in performance on the leaderboard test set and the ultimate holdout set was small, but determining at .02 to .03%. A tie was possible, because the rules cut off measurements below the fourth digit based on significance concerns. In actuality, of course, the scores do differ before rounding, but everyone I spoke to claimed not to know how. The complete dataset has been released on UCI, so each team could compute their own score to whatever accu

3 0.14593488 211 hunch net-2006-10-02-$1M Netflix prediction contest

Introduction: Netflix is running a contest to improve recommender prediction systems. A 10% improvement over their current system yields a $1M prize. Failing that, the best smaller improvement yields a smaller $50K prize. This contest looks quite real, and the $50K prize money is almost certainly achievable with a bit of thought. The contest also comes with a dataset which is apparently 2 orders of magnitude larger than any other public recommendation system datasets.

4 0.14518499 430 hunch net-2011-04-11-The Heritage Health Prize

Introduction: The Heritage Health Prize is potentially the largest prediction prize yet at $3M, which is sure to get many people interested. Several elements of the competition may be worth discussing. The most straightforward way for HPN to deploy this predictor is in determining who to cover with insurance. This might easily cover the costs of running the contest itself, but the value to the health system as a whole is minimal, as people not covered still exist. While HPN itself is a provider network, they have active relationships with a number of insurance companies, and the right to resell any entrant. It’s worth keeping in mind that the research and development may nevertheless end up being useful in the longer term, especially as entrants also keep the right to their code. The judging metric is something I haven’t seen previously. If a patient has probability 0.5 of being in the hospital 0 days and probability 0.5 of being in the hospital ~53.6 days, the optimal prediction in e

5 0.14397426 19 hunch net-2005-02-14-Clever Methods of Overfitting

Introduction: “Overfitting” is traditionally defined as training some flexible representation so that it memorizes the data but fails to predict well in the future. For this post, I will define overfitting more generally as over-representing the performance of systems. There are two styles of general overfitting: overrepresenting performance on particular datasets and (implicitly) overrepresenting performance of a method on future datasets. We should all be aware of these methods, avoid them where possible, and take them into account otherwise. I have used “reproblem” and “old datasets”, and may have participated in “overfitting by review”—some of these are very difficult to avoid. Name Method Explanation Remedy Traditional overfitting Train a complex predictor on too-few examples. Hold out pristine examples for testing. Use a simpler predictor. Get more training examples. Integrate over many predictors. Reject papers which do this. Parameter twe

6 0.14266184 272 hunch net-2007-11-14-BellKor wins Netflix

7 0.13622259 239 hunch net-2007-04-18-$50K Spock Challenge

8 0.1318287 131 hunch net-2005-11-16-The Everything Ensemble Edge

9 0.13113762 275 hunch net-2007-11-29-The Netflix Crack

10 0.1170435 427 hunch net-2011-03-20-KDD Cup 2011

11 0.1113243 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

12 0.10783577 364 hunch net-2009-07-11-Interesting papers at KDD

13 0.10071271 389 hunch net-2010-02-26-Yahoo! ML events

14 0.075333029 129 hunch net-2005-11-07-Prediction Competitions

15 0.071121693 314 hunch net-2008-08-24-Mass Customized Medicine in the Future?

16 0.067994818 444 hunch net-2011-09-07-KDD and MUCMD 2011

17 0.065184757 119 hunch net-2005-10-08-We have a winner

18 0.063368715 307 hunch net-2008-07-04-More Presentation Preparation

19 0.061072126 177 hunch net-2006-05-05-An ICML reject

20 0.061071321 377 hunch net-2009-11-09-NYAS ML Symposium this year.


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.111), (1, 0.007), (2, -0.038), (3, -0.028), (4, -0.031), (5, 0.051), (6, -0.104), (7, 0.019), (8, -0.026), (9, -0.081), (10, -0.126), (11, 0.343), (12, 0.041), (13, -0.009), (14, 0.016), (15, 0.022), (16, 0.06), (17, 0.029), (18, -0.007), (19, -0.017), (20, -0.134), (21, -0.078), (22, 0.002), (23, -0.075), (24, -0.009), (25, -0.155), (26, -0.102), (27, -0.134), (28, 0.016), (29, 0.102), (30, 0.008), (31, -0.085), (32, 0.026), (33, 0.03), (34, 0.045), (35, 0.055), (36, -0.031), (37, -0.011), (38, -0.037), (39, 0.005), (40, 0.047), (41, -0.057), (42, -0.056), (43, 0.056), (44, -0.085), (45, -0.018), (46, -0.045), (47, 0.015), (48, 0.015), (49, 0.018)]
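
The lsi topic weights above are the kind of low-dimensional representation obtained by factoring a tfidf matrix. A common recipe (an assumption here, not necessarily the pipeline behind this page) is truncated SVD, with similarity then computed in the reduced space.

```python
# Minimal LSI sketch: truncated SVD of the tfidf matrix gives each post a
# vector of topic weights; similarity is cosine similarity in that space.
# Corpus and component count are placeholders, not this page's actual setup.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team.",
    "I attended the Netflix prize ceremony this morning.",
    "Netflix is running a contest to improve recommender prediction systems.",
    "Overfitting is training a flexible representation so it memorizes the data.",
]

tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
topic_weights = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

print(topic_weights[0])                                             # topicWeight vector for the first post
print(cosine_similarity(topic_weights[:1], topic_weights).ravel())  # similarities to all posts
```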

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99034864 362 hunch net-2009-06-26-Netflix nearly done

Introduction: A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.

2 0.74501175 211 hunch net-2006-10-02-$1M Netflix prediction contest

Introduction: Netflix is running a contest to improve recommender prediction systems. A 10% improvement over their current system yields a $1M prize. Failing that, the best smaller improvement yields a smaller $50K prize. This contest looks quite real, and the $50K prize money is almost certainly achievable with a bit of thought. The contest also comes with a dataset which is apparently 2 orders of magnitude larger than any other public recommendation system datasets.

3 0.707811 371 hunch net-2009-09-21-Netflix finishes (and starts)

Introduction: I attended the Netflix prize ceremony this morning. The press conference part is covered fine elsewhere, with the basic outcome being that BellKor’s Pragmatic Chaos won over The Ensemble by 15-20 minutes, because they were tied in performance on the ultimate holdout set. I’m sure the individual participants will have many chances to speak about the solution. One of these is Bell at the NYAS ML symposium on Nov. 6. Several additional details may interest ML people. The degree of overfitting exhibited by the difference in performance on the leaderboard test set and the ultimate holdout set was small, but determining at .02 to .03%. A tie was possible, because the rules cut off measurements below the fourth digit based on significance concerns. In actuality, of course, the scores do differ before rounding, but everyone I spoke to claimed not to know how. The complete dataset has been released on UCI, so each team could compute their own score to whatever accu

4 0.62245923 430 hunch net-2011-04-11-The Heritage Health Prize

Introduction: The Heritage Health Prize is potentially the largest prediction prize yet at $3M, which is sure to get many people interested. Several elements of the competition may be worth discussing. The most straightforward way for HPN to deploy this predictor is in determining who to cover with insurance. This might easily cover the costs of running the contest itself, but the value to the health system as a whole is minimal, as people not covered still exist. While HPN itself is a provider network, they have active relationships with a number of insurance companies, and the right to resell any entrant. It’s worth keeping in mind that the research and development may nevertheless end up being useful in the longer term, especially as entrants also keep the right to their code. The judging metric is something I haven’t seen previously. If a patient has probability 0.5 of being in the hospital 0 days and probability 0.5 of being in the hospital ~53.6 days, the optimal prediction in e

5 0.59368956 275 hunch net-2007-11-29-The Netflix Crack

Introduction: A couple security researchers claim to have cracked the netflix dataset. The claims of success appear somewhat overstated to me, but the method of attack is valid and could plausibly be substantially improved so as to reveal the movie preferences of a small fraction of Netflix users. The basic idea is to use a heuristic similarity function between ratings in a public database (from IMDB) and an anonymized database (Netflix) to link ratings in the private database to public identities (in IMDB). They claim to have linked two of a few dozen IMDB users to anonymized netflix users. The claims seem a bit inflated to me, because (a) knowing the IMDB identity isn’t equivalent to knowing the person and (b) the claims of statistical significance are with respect to a model of the world they created (rather than one they created). Overall, this is another example showing that complete privacy is hard. It may be worth remembering that there are some substantial benefits from the Netf

6 0.54200494 239 hunch net-2007-04-18-$50K Spock Challenge

7 0.51193023 427 hunch net-2011-03-20-KDD Cup 2011

8 0.49844 272 hunch net-2007-11-14-BellKor wins Netflix

9 0.46290639 336 hunch net-2009-01-19-Netflix prize within epsilon

10 0.45045406 19 hunch net-2005-02-14-Clever Methods of Overfitting

11 0.43547085 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

12 0.41979223 284 hunch net-2008-01-18-Datasets

13 0.41088042 131 hunch net-2005-11-16-The Everything Ensemble Edge

14 0.38778627 129 hunch net-2005-11-07-Prediction Competitions

15 0.3796632 377 hunch net-2009-11-09-NYAS ML Symposium this year.

16 0.37830403 26 hunch net-2005-02-21-Problem: Cross Validation

17 0.34775919 56 hunch net-2005-04-14-Families of Learning Theory Statements

18 0.34047067 119 hunch net-2005-10-08-We have a winner

19 0.31043369 364 hunch net-2009-07-11-Interesting papers at KDD

20 0.30441615 63 hunch net-2005-04-27-DARPA project: LAGR


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(27, 0.023), (38, 0.095), (53, 0.098), (55, 0.06), (92, 0.576)]
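
Topic weights like '(92, 0.576)' are per-document topic proportions from a fitted LDA model. The sketch below shows how such a (topicId, topicWeight) vector could be produced; the toy corpus and topic count are assumptions, not this page's actual configuration.

```python
# Minimal LDA sketch: fit a topic model on word counts and read off each
# document's (topicId, topicWeight) pairs. Toy corpus and n_components are
# placeholders for illustration only.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team.",
    "I attended the Netflix prize ceremony this morning.",
    "Netflix is running a contest to improve recommender prediction systems.",
    "Overfitting is training a flexible representation so it memorizes the data.",
]

counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
doc_topics = lda.fit_transform(counts)  # each row is a topic distribution summing to 1

# (topicId, topicWeight) pairs for the first document, largest weight first.
weights = sorted(enumerate(doc_topics[0]), key=lambda x: -x[1])
print([(t, round(w, 3)) for t, w in weights])
```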

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95046097 362 hunch net-2009-06-26-Netflix nearly done

Introduction: A $1M qualifying result was achieved on the public Netflix test set by a 3-way ensemble team. This is just in time for Yehuda’s presentation at KDD, which I’m sure will be one of the best attended ever. This isn’t quite over—there are a few days for another super-conglomerate team to come together and there is some small chance that the performance is nonrepresentative of the final test set, but I expect not. Regardless of the final outcome, the biggest lesson for ML from the Netflix contest has been the formidable performance edge of ensemble methods.

2 0.70214945 272 hunch net-2007-11-14-BellKor wins Netflix

Introduction: … but only the little prize. The BellKor team focused on integrating predictions from many different methods. The base methods consist of: (1) Nearest Neighbor Methods, (2) Matrix Factorization Methods (asymmetric and symmetric), (3) Linear Regression on various feature spaces, and (4) Restricted Boltzmann Machines. The final predictor was an ensemble (as was reasonable to expect), although it’s a little bit more complicated than just a weighted average—it’s essentially a customized learning algorithm. Base approaches (1)-(3) seem like relatively well-known approaches (although I haven’t seen the asymmetric factorization variant before). RBMs are the new approach. The writeup is pretty clear for more details. The contestants are close to reaching the big prize, but the last 1.5% is probably at least as hard as what’s been done. A few new structurally different methods for making predictions may need to be discovered and added into the mixture. In other words, research may be require

3 0.62891245 238 hunch net-2007-04-13-What to do with an unreasonable conditional accept

Introduction: Last year about this time, we received a conditional accept for the searn paper, which asked us to reference a paper that was not reasonable to cite because there was strictly more relevant work by the same authors that we already cited. We wrote a response explaining this, and didn’t cite it in the final draft, giving the SPC an excuse to reject the paper, leading to unhappiness for all. Later, Sanjoy Dasgupta suggested that an alternative was to talk to the PC chair instead, as soon as you see that a conditional accept is unreasonable. William Cohen and I spoke about this by email, the relevant bit of which is: If an SPC asks for a revision that is inappropriate, the correct action is to contact the chairs as soon as the decision is made, clearly explaining what the problem is, so we can decide whether or not to over-rule the SPC. As you say, this is extra work for us chairs, but that’s part of the job, and we’re willing to do that sort of work to improve the ov

4 0.54295945 203 hunch net-2006-08-18-Report of MLSS 2006 Taipei

Introduction: The 2006 Machine Learning Summer School in Taipei, Taiwan ended on August 4, 2006. It has been a very exciting two weeks for a record crowd of 245 participants (including speakers and organizers) from 18 countries. We had a lineup of speakers that is hard to match up for other similar events (see our WIKI for more information). With this lineup, it is difficult for us as organizers to screw it up too bad. Also, since we have pretty good infrastructure for international meetings and experienced staff at NTUST and Academia Sinica, plus the reputation established by previous MLSS series, it was relatively easy for us to attract registrations and simply enjoyed this two-week long party of machine learning. In the end of MLSS we distributed a survey form for participants to fill in. I will report what we found from this survey, together with the registration data and word-of-mouth from participants. The first question is designed to find out how our participants learned about MLSS

5 0.27425656 437 hunch net-2011-07-10-ICML 2011 and the future

Introduction: Unfortunately, I ended up sick for much of this ICML. I did manage to catch one interesting paper: Richard Socher, Cliff Lin, Andrew Y. Ng, and Christopher D. Manning, Parsing Natural Scenes and Natural Language with Recursive Neural Networks. I invited Richard to share his list of interesting papers, so hopefully we’ll hear from him soon. In the meantime, Paul and Hal have posted some lists. the future Joelle and I are program chairs for ICML 2012 in Edinburgh, which I previously enjoyed visiting in 2005. This is a huge responsibility, that we hope to accomplish well. A part of this (perhaps the most fun part), is imagining how we can make ICML better. A key and critical constraint is choosing things that can be accomplished. So far we have: Colocation. The first thing we looked into was potential colocations. We quickly discovered that many other conferences precommitted their location. For the future, getting a colocation with ACL or SIGI

6 0.23695248 293 hunch net-2008-03-23-Interactive Machine Learning

7 0.22411627 80 hunch net-2005-06-10-Workshops are not Conferences

8 0.20776117 463 hunch net-2012-05-02-ICML: Behind the Scenes

9 0.20435509 461 hunch net-2012-04-09-ICML author feedback is open

10 0.20310101 141 hunch net-2005-12-17-Workshops as Franchise Conferences

11 0.2003171 21 hunch net-2005-02-17-Learning Research Programs

12 0.19533679 488 hunch net-2013-08-31-Extreme Classification workshop at NIPS

13 0.19338807 145 hunch net-2005-12-29-Deadline Season

14 0.19167377 410 hunch net-2010-09-17-New York Area Machine Learning Events

15 0.19145118 75 hunch net-2005-05-28-Running A Machine Learning Summer School

16 0.18611683 233 hunch net-2007-02-16-The Forgetting

17 0.1859367 292 hunch net-2008-03-15-COLT Open Problems

18 0.17840002 83 hunch net-2005-06-18-Lower Bounds for Learning Reductions

19 0.17771477 16 hunch net-2005-02-09-Intuitions from applied learning

20 0.17727204 367 hunch net-2009-08-16-Centmail comments