hunch_net hunch_net-2008 hunch_net-2008-301 knowledge-graph by maker-knowledge-mining

301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize


meta info for this blog

Source: html

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver a 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large-scale, real-life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher…
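The 10% target above is measured as relative RMSE improvement over Cinematch. A minimal sketch of the arithmetic, assuming the widely reported Cinematch quiz-set RMSE of 0.9514 (the exact baseline is Netflix’s and is not stated in this excerpt):

```python
# Percent RMSE improvement over a baseline, as scored in the Netflix Prize.
# ASSUMPTION: 0.9514 is the commonly cited Cinematch quiz-set RMSE.
CINEMATCH_RMSE = 0.9514

def improvement(rmse: float, baseline: float = CINEMATCH_RMSE) -> float:
    """Percent improvement in RMSE relative to the baseline."""
    return 100.0 * (baseline - rmse) / baseline

print(round(improvement(0.8563), 2))  # ~10.0, the grand prize threshold
print(round(improvement(0.8646), 2))  # ~9.12, the "best current results" quoted above
```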


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 I will concentrate here on high-level lessons that will hopefully help other practitioners in coming up with developments of true practical value. [sent-17, score-0.508]

2 Do we want to model the numerical value of ratings, or maybe which movies people rate (regardless of rating value)? [sent-22, score-0.707]

3 Some will want to model certain pieces of metadata associated with the movies, such as interactions with actors, directors, etc. [sent-24, score-0.442]

4 Will we model ratings through a neighborhood model or through a latent factor model? [sent-28, score-1.169]

5 Within a neighborhood model, should we look at relationships between users, between movies, or maybe both? [sent-29, score-0.341]

6 Within latent factor models, we also have plenty of further choices – should we stick with the good old SVD, or move to fancier probabilistic models (e. [sent-30, score-0.609]

7 Finally, the last level answers the “how? [sent-34, score-0.356]

8 For example, nearest neighbor models can vary from quite simplistic correlation-based models to more sophisticated models that try to derive parameters directly from the data. [sent-37, score-0.471]

9 Most papers appear to concentrate on the third level, designing the best techniques for optimizing a single model or a particular cost function on which they are fixated. [sent-43, score-0.547]

10 This is not very surprising, because the third level is the most technical one and offers the most flexibility. [sent-44, score-0.356]

11 Well, no doubt, that’s wonderful… However, the practical value of these developments is quite limited, especially when using an ensemble of various models, where the gains from squeezing the best out of a single model are not really delivered to the bottom line. [sent-47, score-0.793]

12 Concentrating efforts on the second level is more fruitful. [sent-48, score-0.334]

13 For example, user-based neighborhood models were found to be vastly inferior to item (movie)-based ones. [sent-50, score-0.462]

14 Moreover, latent factor models were proven to be more accurate than the neighborhood ones (considering that you use the right latent factor model, which happens to be SVD). [sent-51, score-0.924]

15 So this level is certainly important and receives quite a bit of attention in the literature, but it is not nearly as important as the first level. [sent-55, score-0.488]

16 For example, going beyond the numerical values of the ratings to analyzing which movies are chosen to be rated has a tremendous effect on prediction accuracy. [sent-58, score-0.599]

17 This first level receives less attention in the literature. [sent-61, score-0.412]

18   In practice, the borders between the three levels that I describe may be quite fuzzy. [sent-64, score-0.339]

19 Moreover, these three levels can sometimes be strongly interlaced with each other since, in the end, a single implementation should fulfill all three levels. [sent-65, score-0.48]

20 The more it relates to the first level, the more interested I become, whereas I tend to almost completely ignore improvements related to the third level (well, that’s after exploring that level enough in the past). [sent-67, score-0.626]
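Sentences 4, 6, and 14 above contrast neighborhood models with latent factor models and single out SVD-style factorization. As a concrete reference point, here is a minimal sketch of such a latent factor model fit by stochastic gradient descent; the rank, learning rate, and regularization values are illustrative placeholders, not anything prescribed by the post:

```python
import numpy as np

def fit_latent_factors(ratings, n_users, n_items, rank=20,
                       lr=0.01, reg=0.05, epochs=10, seed=0):
    """SVD-style factorization: rating(u, i) ~ P[u] . Q[i]."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, rank))  # movie factors
    for _ in range(epochs):
        for u, i, r in ratings:                      # (user, movie, rating) triples
            pu, qi = P[u].copy(), Q[i].copy()
            err = r - pu @ qi                        # prediction error
            P[u] += lr * (err * qi - reg * pu)       # regularized SGD step
            Q[i] += lr * (err * pu - reg * qi)
    return P, Q

# Toy usage: two users, two movies, four observed ratings.
P, Q = fit_latent_factors([(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)],
                          n_users=2, n_items=2, rank=2, epochs=50)
print(P[0] @ Q[0])  # predicted rating for user 0 on movie 0
```

The neighborhood alternative mentioned in sentences 4, 5, and 8 would instead predict from a weighted average of ratings by similar users or for similar movies.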


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('level', 0.27), ('model', 0.262), ('neighborhood', 0.213), ('movies', 0.213), ('netflix', 0.208), ('svd', 0.184), ('latent', 0.17), ('models', 0.169), ('ratings', 0.161), ('three', 0.156), ('maybe', 0.128), ('movie', 0.128), ('users', 0.124), ('prize', 0.122), ('recommender', 0.111), ('levels', 0.107), ('metadata', 0.104), ('numerical', 0.104), ('squeezing', 0.104), ('factor', 0.101), ('moreover', 0.092), ('developments', 0.092), ('answers', 0.086), ('third', 0.086), ('actors', 0.085), ('receives', 0.085), ('million', 0.085), ('concentrate', 0.08), ('vastly', 0.08), ('deliver', 0.08), ('quite', 0.076), ('associated', 0.076), ('outperform', 0.074), ('dynamics', 0.071), ('ideas', 0.071), ('really', 0.068), ('practical', 0.066), ('behavior', 0.064), ('efforts', 0.064), ('modeling', 0.064), ('ensemble', 0.064), ('chosen', 0.062), ('life', 0.061), ('single', 0.061), ('days', 0.06), ('going', 0.059), ('designing', 0.058), ('question', 0.058), ('attention', 0.057), ('try', 0.057)]
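The (wordName, wordTfidf) pairs above are the post’s strongest tfidf terms. Below is a sketch of how such a list could be reproduced, assuming scikit-learn and a `posts` list holding the raw text of every blog post in the corpus; the aggregator’s exact tokenization and weighting are unknown:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# ASSUMPTION: posts[0] is this post's text; the rest form the corpus.
posts = ["full text of this post ...", "full text of another post ..."]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(posts)                    # one tfidf row per post
weights = X[0].toarray().ravel()                # this post's weights
vocab = vec.get_feature_names_out()
top50 = sorted(zip(vocab, weights), key=lambda t: -t[1])[:50]
print([(w, round(s, 3)) for w, s in top50])     # (wordName, wordTfidf) pairs
```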

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver a 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large-scale, real-life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher…

2 0.18395722 371 hunch net-2009-09-21-Netflix finishes (and starts)

Introduction: I attended the Netflix prize ceremony this morning. The press conference part is covered fine elsewhere, with the basic outcome being that BellKor’s Pragmatic Chaos won over The Ensemble by 15-20 minutes, because they were tied in performance on the ultimate holdout set. I’m sure the individual participants will have many chances to speak about the solution. One of these is Bell at the NYAS ML symposium on Nov. 6. Several additional details may interest ML people. The degree of overfitting exhibited by the difference in performance on the leaderboard test set and the ultimate holdout set was small, but decisive at .02 to .03%. A tie was possible, because the rules cut off measurements below the fourth digit based on significance concerns. In actuality, of course, the scores do differ before rounding, but everyone I spoke to claimed not to know how. The complete dataset has been released on UCI, so each team could compute their own score to whatever accu…

3 0.18078065 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc.) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod…

4 0.16493101 275 hunch net-2007-11-29-The Netflix Crack

Introduction: A couple of security researchers claim to have cracked the Netflix dataset. The claims of success appear somewhat overstated to me, but the method of attack is valid and could plausibly be substantially improved so as to reveal the movie preferences of a small fraction of Netflix users. The basic idea is to use a heuristic similarity function between ratings in a public database (from IMDB) and an anonymized database (Netflix) to link ratings in the private database to public identities (in IMDB). They claim to have linked two of a few dozen IMDB users to anonymized Netflix users. The claims seem a bit inflated to me, because (a) knowing the IMDB identity isn’t equivalent to knowing the person and (b) the claims of statistical significance are with respect to a model of the world they created (rather than the real world). Overall, this is another example showing that complete privacy is hard. It may be worth remembering that there are some substantial benefits from the Netf…

5 0.14966567 336 hunch net-2009-01-19-Netflix prize within epsilon

Introduction: The competitors for the Netflix Prize are tantalizingly close to winning the million dollar prize. This year, BellKor and Commendo Research sent a combined solution that won the progress prize. Reading the writeups is instructive. Several aspects of the solutions are taken for granted, including stochastic gradient descent, ensemble prediction, and targeting residuals (a form of boosting). Relative to last year, it appears that many approaches have added parameterizations, especially for the purpose of modeling through time. The big question is: will they make the big prize? At this point, the level of complexity in entering the competition is prohibitive, so perhaps only the existing competitors will continue to try. (This equation might change drastically if the teams open source their existing solutions, including parameter settings.) One fear is that the progress is asymptoting on the wrong side of the 10% threshold. In the first year, the teams progressed through…

6 0.14629947 135 hunch net-2005-12-04-Watchword: model

7 0.12722728 97 hunch net-2005-07-23-Interesting papers at ACL

8 0.12600394 95 hunch net-2005-07-14-What Learning Theory might do

9 0.11533953 211 hunch net-2006-10-02-$1M Netflix prediction contest

10 0.1113243 362 hunch net-2009-06-26-Netflix nearly done

11 0.11056939 389 hunch net-2010-02-26-Yahoo! ML events

12 0.10677458 454 hunch net-2012-01-30-ICML Posters and Scope

13 0.099834569 235 hunch net-2007-03-03-All Models of Learning have Flaws

14 0.090682365 332 hunch net-2008-12-23-Use of Learning Theory

15 0.08982119 27 hunch net-2005-02-23-Problem: Reinforcement Learning with Classification

16 0.089520894 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

17 0.089196831 440 hunch net-2011-08-06-Interesting thing at UAI 2011

18 0.088305146 444 hunch net-2011-09-07-KDD and MUCMD 2011

19 0.087928966 364 hunch net-2009-07-11-Interesting papers at KDD

20 0.087521471 51 hunch net-2005-04-01-The Producer-Consumer Model of Research
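A sketch of how the simValue rankings above could be produced from the tfidf matrix `X` of the earlier sketch: cosine similarity between this post’s row and every other post, sorted descending (which is why the same-blog entry tops each list with a simValue near 1.0):

```python
from sklearn.metrics.pairwise import cosine_similarity

sims = cosine_similarity(X[0], X).ravel()   # similarity of post 0 to all posts
order = sims.argsort()[::-1]                # most similar first; order[0] == 0
for idx in order[:20]:
    print(idx, round(float(sims[idx]), 8))  # simIndex-style (blog, simValue) rows
```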


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.236), (1, 0.02), (2, -0.03), (3, 0.039), (4, 0.014), (5, 0.027), (6, -0.072), (7, 0.019), (8, 0.006), (9, -0.094), (10, -0.066), (11, 0.169), (12, -0.121), (13, -0.022), (14, 0.043), (15, 0.004), (16, 0.043), (17, 0.081), (18, 0.138), (19, -0.147), (20, -0.165), (21, -0.051), (22, -0.037), (23, -0.066), (24, 0.04), (25, -0.008), (26, -0.078), (27, -0.065), (28, -0.064), (29, 0.008), (30, -0.002), (31, -0.016), (32, -0.042), (33, 0.002), (34, -0.078), (35, 0.034), (36, 0.066), (37, -0.144), (38, -0.056), (39, -0.114), (40, -0.12), (41, -0.03), (42, -0.033), (43, 0.011), (44, -0.039), (45, 0.004), (46, 0.117), (47, -0.011), (48, 0.018), (49, 0.045)]
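The 50 (topicId, topicWeight) pairs above are the post’s coordinates in a latent semantic space. A sketch, assuming the lsi representation is a 50-component truncated SVD of the tfidf matrix `X` from earlier (components can be negative, matching the signed weights listed):

```python
from sklearn.decomposition import TruncatedSVD

lsi = TruncatedSVD(n_components=50, random_state=0)
Z = lsi.fit_transform(X)                       # one 50-dim vector per post
print(list(enumerate(Z[0].round(3))))          # (topicId, topicWeight) for this post
```

Similarity under the lsi model would then be cosine similarity between rows of `Z` rather than rows of `X`.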

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.98160881 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver a 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large-scale, real-life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher…

2 0.71600586 275 hunch net-2007-11-29-The Netflix Crack

Introduction: A couple of security researchers claim to have cracked the Netflix dataset. The claims of success appear somewhat overstated to me, but the method of attack is valid and could plausibly be substantially improved so as to reveal the movie preferences of a small fraction of Netflix users. The basic idea is to use a heuristic similarity function between ratings in a public database (from IMDB) and an anonymized database (Netflix) to link ratings in the private database to public identities (in IMDB). They claim to have linked two of a few dozen IMDB users to anonymized Netflix users. The claims seem a bit inflated to me, because (a) knowing the IMDB identity isn’t equivalent to knowing the person and (b) the claims of statistical significance are with respect to a model of the world they created (rather than the real world). Overall, this is another example showing that complete privacy is hard. It may be worth remembering that there are some substantial benefits from the Netf…

3 0.70186478 135 hunch net-2005-12-04-Watchword: model

Introduction: In everyday use, a model is a system which explains the behavior of some system, hopefully at the level where some alteration of the model predicts some alteration of the real-world system. In machine learning, “model” has several variant definitions. Everyday. The common definition is sometimes used. Parameterized. Sometimes model is shorthand for “parameterized model”. Here, it refers to a model with unspecified free parameters. In the Bayesian learning approach, you typically have a prior over (everyday) models. Predictive. Even further from everyday use is the predictive model. Examples of this are “my model is a decision tree” or “my model is a support vector machine”. Here, there is no real sense in which an SVM explains the underlying process. For example, an SVM tells us nothing in particular about how alterations to the real-world system would create a change. Which definition is being used at any particular time is important information. For examp…

4 0.64500082 194 hunch net-2006-07-11-New Models

Introduction: How should we, as researchers in machine learning, organize ourselves? The most immediate measurable objective of computer science research is publishing a paper. The most difficult aspect of publishing a paper is having reviewers accept and recommend it for publication. The simplest mechanism for doing this is to show theoretical progress on some standard, well-known, easily understood problem. In doing this, we often fall into a local minimum of the research process. The basic problem in machine learning is that it is very unclear that the mathematical model is the right one for the (or some) real problem. A good mathematical model in machine learning should have one fundamental trait: it should aid the design of effective learning algorithms. To date, our ability to solve interesting learning problems (speech recognition, machine translation, object recognition, etc.) remains limited (although improving), so the “rightness” of our models is in doubt. If our mathematical mod…

5 0.61414373 336 hunch net-2009-01-19-Netflix prize within epsilon

Introduction: The competitors for the Netflix Prize are tantalizingly close to winning the million dollar prize. This year, BellKor and Commendo Research sent a combined solution that won the progress prize. Reading the writeups is instructive. Several aspects of the solutions are taken for granted, including stochastic gradient descent, ensemble prediction, and targeting residuals (a form of boosting). Relative to last year, it appears that many approaches have added parameterizations, especially for the purpose of modeling through time. The big question is: will they make the big prize? At this point, the level of complexity in entering the competition is prohibitive, so perhaps only the existing competitors will continue to try. (This equation might change drastically if the teams open source their existing solutions, including parameter settings.) One fear is that the progress is asymptoting on the wrong side of the 10% threshold. In the first year, the teams progressed through…

6 0.60541081 97 hunch net-2005-07-23-Interesting papers at ACL

7 0.53190589 371 hunch net-2009-09-21-Netflix finishes (and starts)

8 0.53034198 95 hunch net-2005-07-14-What Learning Theory might do

9 0.52580577 430 hunch net-2011-04-11-The Heritage Health Prize

10 0.52402502 440 hunch net-2011-08-06-Interesting thing at UAI 2011

11 0.509556 189 hunch net-2006-07-05-more icml papers

12 0.4908137 362 hunch net-2009-06-26-Netflix nearly done

13 0.48405945 77 hunch net-2005-05-29-Maximum Margin Mismatch?

14 0.46344513 270 hunch net-2007-11-02-The Machine Learning Award goes to …

15 0.45667529 211 hunch net-2006-10-02-$1M Netflix prediction contest

16 0.44277883 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three

17 0.43916872 139 hunch net-2005-12-11-More NIPS Papers

18 0.43738958 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006

19 0.42375991 424 hunch net-2011-02-17-What does Watson mean?

20 0.42047724 23 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(0, 0.01), (3, 0.012), (10, 0.028), (27, 0.173), (30, 0.026), (38, 0.045), (39, 0.239), (53, 0.036), (55, 0.081), (64, 0.024), (77, 0.012), (92, 0.018), (94, 0.107), (95, 0.091)]
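Unlike the lsi vector, the lda weights above are sparse and non-negative: LDA infers a probability distribution over topics, so only a handful of topics carry noticeable mass. A sketch, assuming bag-of-words counts over the same `posts` corpus and a 100-topic model (a guess based on the topic ids appearing above):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

counts = CountVectorizer(stop_words="english").fit_transform(posts)
lda = LatentDirichletAllocation(n_components=100, random_state=0)
theta = lda.fit_transform(counts)              # rows are topic distributions (sum to 1)
sparse_view = [(t, round(float(w), 3)) for t, w in enumerate(theta[0]) if w > 0.01]
print(sparse_view)                             # (topicId, topicWeight) pairs like those above
```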

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.89454406 301 hunch net-2008-05-23-Three levels of addressing the Netflix Prize

Introduction: In October 2006, the online movie renter, Netflix, announced the Netflix Prize contest. They published a comprehensive dataset including more than 100 million movie ratings, which were performed by about 480,000 real customers on 17,770 movies. Competitors in the challenge are required to estimate a few million ratings. To win the “grand prize,” they need to deliver a 10% improvement in the prediction error compared with the results of Cinematch, Netflix’s proprietary recommender system. Best current results deliver a 9.12% improvement, which is quite close to the 10% goal, yet painfully distant. The Netflix Prize breathed new life and excitement into recommender systems research. The competition allowed the wide research community to access a large-scale, real-life dataset. Beyond this, the competition changed the rules of the game. Claiming that your nice idea could outperform some mediocre algorithms on some toy dataset is no longer acceptable. Now researcher…

2 0.84047204 427 hunch net-2011-03-20-KDD Cup 2011

Introduction: Yehuda points out KDD-Cup 2011, which Markus and Gideon helped set up. This is a prediction and recommendation contest for music. In addition to being a fun chance to show your expertise, there are cash prizes of $5K/$2K/$1K.

3 0.83214974 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

Introduction: The New York ML symposium was last Friday. There were 303 registrations, up a bit from last year. I particularly enjoyed talks by Bill Freeman on vision and ML, Jon Lenchner on strategy in Jeopardy, and Tara N. Sainath and Brian Kingsbury on deep learning for speech recognition. If anyone has suggestions or thoughts for next year, please speak up. I also attended Strata + Hadoop World for the first time. This is primarily a trade conference rather than an academic conference, but I found it pretty interesting as a first-time attendee. This is ground zero for the Big Data buzzword, and I now see why. It’s about data, and the word “big” is so ambiguous that everyone can lay claim to it. There were essentially zero academic talks. Instead, the focus was on war stories, product announcements, and education. The general level of education is much lower: explaining Machine Learning to the SQL-educated is the primary operating point. Nevertheless that’s happening, a…

4 0.76078165 71 hunch net-2005-05-14-NIPS

Introduction: NIPS is the big winter conference of learning. Paper due date: June 3rd. (Tweaked thanks to Fei Sha.) Location: Vancouver (main program) Dec. 5-8 and Whistler (workshops) Dec. 9-10, BC, Canada. NIPS is larger than all of the other learning conferences, partly because it’s the only one at that time of year. I recommend the workshops, which are often quite interesting and energetic.

5 0.69843996 371 hunch net-2009-09-21-Netflix finishes (and starts)

Introduction: I attended the Netflix prize ceremony this morning. The press conference part is covered fine elsewhere, with the basic outcome being that BellKor’s Pragmatic Chaos won over The Ensemble by 15-20 minutes, because they were tied in performance on the ultimate holdout set. I’m sure the individual participants will have many chances to speak about the solution. One of these is Bell at the NYAS ML symposium on Nov. 6. Several additional details may interest ML people. The degree of overfitting exhibited by the difference in performance on the leaderboard test set and the ultimate holdout set was small, but decisive at .02 to .03%. A tie was possible, because the rules cut off measurements below the fourth digit based on significance concerns. In actuality, of course, the scores do differ before rounding, but everyone I spoke to claimed not to know how. The complete dataset has been released on UCI, so each team could compute their own score to whatever accu…

6 0.69448912 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

7 0.6791085 360 hunch net-2009-06-15-In Active Learning, the question changes

8 0.67899615 343 hunch net-2009-02-18-Decision by Vetocracy

9 0.67771077 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

10 0.67602146 36 hunch net-2005-03-05-Funding Research

11 0.67528462 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

12 0.67314959 95 hunch net-2005-07-14-What Learning Theory might do

13 0.66926491 423 hunch net-2011-02-02-User preferences for search engines

14 0.66776377 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

15 0.66738874 235 hunch net-2007-03-03-All Models of Learning have Flaws

16 0.66675752 464 hunch net-2012-05-03-Microsoft Research, New York City

17 0.66586614 344 hunch net-2009-02-22-Effective Research Funding

18 0.66484457 105 hunch net-2005-08-23-(Dis)similarities between academia and open source programmers

19 0.66482031 109 hunch net-2005-09-08-Online Learning as the Mathematics of Accountability

20 0.66426653 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models