hunch_net hunch_net-2010 hunch_net-2010-399 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Slashdot points out Google Predict. I’m not privy to the details, but this has the potential to be extremely useful, since in many applications simply having an easy mechanism to apply existing learning algorithms helps a great deal. This differs in goal from MLcomp: instead of public comparisons for research purposes, it’s about private use of good existing algorithms. It also differs in infrastructure, since a system designed for this purpose is much less awkward than using Amazon’s cloud computing directly. The latter approach, however, implies that datasets several orders of magnitude larger can be handled, up to the limits imposed by network and storage.
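To make the appeal concrete, here is a minimal sketch of what calling such a hosted prediction service could look like from client code. Everything below (the endpoint URL, the payload fields, and the response format) is a hypothetical placeholder for illustration, not the actual Google Predict API.

```python
# Hypothetical sketch of a client for a hosted prediction service.
# The endpoint, payload fields, and response format are invented for
# illustration; they are NOT the actual Google Predict API.
import json
import urllib.request

ENDPOINT = "https://example.com/v1/models/my-model:predict"  # placeholder URL

def predict(features):
    """Send one feature vector to the hosted model, return its prediction."""
    payload = json.dumps({"input": features}).encode("utf-8")
    request = urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["prediction"]

# Usage (commented out since the endpoint is fictional):
# print(predict([0.3, 1.2, 0.0, 5.1]))
```

The point is the shape of the workflow: no algorithm selection and no cluster setup, just an HTTP call against an already-trained model.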
simIndex simValue blogId blogTitle
same-blog 1 1.0 399 hunch net-2010-05-20-Google Predict
2 0.15650511 393 hunch net-2010-04-14-MLcomp: a website for objectively comparing ML algorithms
Introduction: Much of the success and popularity of machine learning has been driven by its practical impact. Of course, the evaluation of empirical work is an integral part of the field. But are the existing mechanisms for evaluating algorithms and comparing results good enough? We (Percy and Jake) believe there are currently a number of shortcomings:
Incomplete Disclosure: You read a paper that proposes Algorithm A, which is shown to outperform SVMs on two datasets. Great. But what about on other datasets? How sensitive is this result? What about compute time – does the algorithm take two seconds on a laptop or two weeks on a 100-node cluster?
Lack of Standardization: Algorithm A beats Algorithm B on one version of a dataset. Algorithm B beats Algorithm A on another version yet uses slightly different preprocessing. Though doing a head-on comparison would be ideal (a minimal sketch of one appears after this list), it would be tedious since the programs probably use different dataset formats and have a large array of options
3 0.10323177 271 hunch net-2007-11-05-CMU wins DARPA Urban Challenge
Introduction: The results have been posted, with CMU first, Stanford second, and Virginia Tech third. Considering that this was an open event (at least for people in the US), this was a very strong showing for research at universities (instead of defense contractors, for example). Some details should become public at the NIPS workshops. Slashdot has a post with many comments.
4 0.091941655 418 hunch net-2010-12-02-Traffic Prediction Problem
Introduction: Slashdot points out the Traffic Prediction Challenge, which looks pretty fun. The temporal aspect seems to be very common in many real-world problems and somewhat understudied.
5 0.081956923 80 hunch net-2005-06-10-Workshops are not Conferences
Introduction: … and you should use that fact. A workshop differs from a conference in that it is about a focused group of people worrying about a focused topic. It also differs in that a workshop is typically a “one-time affair” rather than a series. (The Snowbird learning workshop counts as a conference in this respect.) A common failure mode of both organizers and speakers at a workshop is to treat it as a conference. This is “ok”, but it is not really taking advantage of the situation. Here are some things I’ve learned:
For speakers: A smaller audience means it can be more interactive. Interactive means a better chance to avoid losing your audience and a more interesting presentation (because you can adapt to your audience). Greater focus amongst the participants means you can get to the heart of the matter more easily, and discuss tradeoffs more carefully. Unlike conferences, relevance is more valued than newness.
For organizers: Not everything needs to be in a conference st
6 0.078074567 211 hunch net-2006-10-02-$1M Netflix prediction contest
7 0.076756343 345 hunch net-2009-03-08-Prediction Science
8 0.072251432 208 hunch net-2006-09-18-What is missing for online collaborative research?
9 0.06620878 19 hunch net-2005-02-14-Clever Methods of Overfitting
10 0.063911393 454 hunch net-2012-01-30-ICML Posters and Scope
11 0.063629776 400 hunch net-2010-06-13-The Good News on Exploration and Learning
12 0.06087561 357 hunch net-2009-05-30-Many ways to Learn this summer
13 0.059023429 132 hunch net-2005-11-26-The Design of an Optimal Research Environment
14 0.058661591 296 hunch net-2008-04-21-The Science 2.0 article
15 0.05774264 349 hunch net-2009-04-21-Interesting Presentations at Snowbird
16 0.056629185 344 hunch net-2009-02-22-Effective Research Funding
17 0.053559091 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning
18 0.052646045 134 hunch net-2005-12-01-The Webscience Future
19 0.052187569 229 hunch net-2007-01-26-Parallel Machine Learning Problems
20 0.051939603 1 hunch net-2005-01-19-Why I decided to run a weblog.
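The comparison sketch referenced in the MLcomp entry above: a minimal head-on comparison (my own illustration, not MLcomp’s actual code) where both algorithms receive the identical preprocessed split, and wall-clock training time is reported alongside accuracy. The dataset, the two algorithms, and all sizes below are arbitrary stand-ins.

```python
# A minimal sketch (not MLcomp's actual code) of the head-on comparison
# the post argues for: both algorithms get the identical preprocessed
# split, and accuracy is reported alongside wall-clock training time.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One shared preprocessing step, fit on the training split only, so
# neither algorithm benefits from "slightly different preprocessing".
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

for name, clf in [("A: LinearSVC", LinearSVC()),
                  ("B: LogisticRegression", LogisticRegression(max_iter=1000))]:
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy={acc:.3f}, train_time={elapsed:.2f}s")
```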
simIndex simValue blogId blogTitle
same-blog 1 0.96080583 399 hunch net-2010-05-20-Google Predict
2 0.55856061 446 hunch net-2011-10-03-Monday announcements
Introduction: Various people want to use hunch.net to announce things. I’ve generally resisted this because I feared hunch becoming a pure announcement zone, while I am personally much more interested in contentful posts and discussion. Nevertheless, there is clearly some value and announcements are easy, so I’m planning to summarize announcements on Mondays. D. Sculley points out an interesting Semisupervised feature learning competition, with a deadline of October 17. Lihong Li points out the webscope user interaction dataset, which is the first high quality exploration dataset I’m aware of that is publicly available. Seth Rogers points out CrossValidated, which looks similar in conception to metaoptimize, but directly using the stackoverflow interface and with a bit more of a statistics twist.
3 0.54210037 393 hunch net-2010-04-14-MLcomp: a website for objectively comparing ML algorithms
4 0.5082643 128 hunch net-2005-11-05-The design of a computing cluster
Introduction: This is about the design of a computing cluster from the viewpoint of applied machine learning using current technology. We just built a small one at TTI, so this is some evidence of what is feasible and thoughts about the design choices.
Architecture: There are several architectural choices.
(1) AMD Athlon64 based system. This seems to have the cheapest bang/buck. Maximum RAM is typically 2-3GB.
(2) AMD Opteron based system. Opterons provide the additional capability to buy an SMP motherboard with two chips, and the motherboards often support 16GB of RAM. The RAM is also the more expensive error-correcting type.
(3) Intel PIV or Xeon based system. The PIV and Xeon based systems are the Intel analog of the above two. Due to architectural design reasons, these chips tend to run a bit hotter and be a bit more expensive.
(4) Dual core chips. Both Intel and AMD have chips that actually have 2 processors embedded in them.
In the end, we decided to go with option (2). Roughly speaking,
5 0.4732461 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge
Introduction: The large scale learning challenge for ICML interests me a great deal, although I have concerns about the way it is structured. From the instructions page, several issues come up:
Large Definition: My personal definition of dataset size is:
small: A dataset is small if a human could look at the dataset and plausibly find a good solution.
medium: A dataset is medium-sized if it fits in the RAM of a reasonably priced computer.
large: A large dataset does not fit in the RAM of a reasonably priced computer.
By this definition, all of the datasets are medium sized (a sketch of this taxonomy appears after this list). This might sound like a pissing match over dataset size, but I believe it is more than that. The fundamental reason for these definitions is that they correspond to transitions in the sorts of approaches which are feasible. From small to medium, the ability to use a human as the learning algorithm degrades. From medium to large, it becomes essential to have learning algorithms that don’t require ran
6 0.45782524 208 hunch net-2006-09-18-What is missing for online collaborative research?
7 0.45473412 223 hunch net-2006-12-06-The Spam Problem
8 0.45357373 349 hunch net-2009-04-21-Interesting Presentations at Snowbird
9 0.4513824 297 hunch net-2008-04-22-Taking the next step
10 0.44994357 1 hunch net-2005-01-19-Why I decided to run a weblog.
11 0.44944373 134 hunch net-2005-12-01-The Webscience Future
12 0.44153661 19 hunch net-2005-02-14-Clever Methods of Overfitting
13 0.44078559 367 hunch net-2009-08-16-Centmail comments
14 0.43950561 306 hunch net-2008-07-02-Proprietary Data in Academic Research?
15 0.43276966 241 hunch net-2007-04-28-The Coming Patent Apocalypse
16 0.41532353 418 hunch net-2010-12-02-Traffic Prediction Problem
17 0.41064924 423 hunch net-2011-02-02-User preferences for search engines
18 0.40562356 345 hunch net-2009-03-08-Prediction Science
19 0.40508658 132 hunch net-2005-11-26-The Design of an Optimal Research Environment
20 0.40395185 76 hunch net-2005-05-29-Bad ideas
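The taxonomy sketch referenced in the Large Scale Learning Challenge entry above. Both thresholds are illustrative stand-ins I chose for the sketch: the post defines the boundaries in terms of feasible approaches, not exact byte counts.

```python
# Sketch of the small/medium/large taxonomy from the post above:
# "medium" means the dataset fits in RAM, "large" means it does not.
# Both thresholds below are illustrative stand-ins, not from the post.
import os

RAM_BYTES = 16 * 2**30      # assume a reasonably priced 16GB machine
EYEBALL_BYTES = 10**6       # stand-in for "a human could look at it"

def dataset_size_class(path):
    """Classify a dataset file per the post's small/medium/large definitions."""
    n = os.path.getsize(path)
    if n <= EYEBALL_BYTES:
        return "small"   # a human could plausibly find a good solution by hand
    if n <= RAM_BYTES:
        return "medium"  # fits in RAM; in-core learning algorithms apply
    return "large"       # requires out-of-core / streaming algorithms
```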
simIndex simValue blogId blogTitle
same-blog 1 0.88470697 399 hunch net-2010-05-20-Google Predict
2 0.78122741 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
Introduction: Here’s a list of papers that I found interesting at ICML / COLT / UAI in 2009.
Elad Hazan and Comandur Seshadhri, Efficient learning algorithms for changing environments, at ICML. This paper shows how to adapt learning algorithms that compete with fixed predictors to compete with changing policies. The definition of regret they deal with seems particularly useful in many situations (a hedged statement of it appears after this list).
Hal Daume, Unsupervised Search-based Structured Prediction, at ICML. This paper shows a technique for reducing unsupervised learning to supervised learning which (a) makes a fast unsupervised learning algorithm and (b) makes semisupervised learning both easy and highly effective.
There were two papers with similar results on active learning in the KWIK framework for linear regression, both reducing the sample complexity to . One was Nicolo Cesa-Bianchi, Claudio Gentile, and Francesco Orabona, Robust Bounds for Classification via Selective Sampling, at ICML, and the other was Thoma
3 0.74363399 490 hunch net-2013-11-09-Graduates and Postdocs
Introduction: Several strong graduates are on the job market this year. Alekh Agarwal made the most scalable public learning algorithm as an intern two years ago. He has a deep and broad understanding of optimization and learning, as well as the ability and will to make things happen programming-wise. I’ve been privileged to have Alekh visiting me in NY, where he will be sorely missed. John Duchi created Adagrad, which is a commonly helpful improvement over online gradient descent that is seeing wide adoption, including in Vowpal Wabbit (the update rule is sketched after this list). He has a similarly deep and broad understanding of optimization and learning, with significant industry experience at Google. Alekh and John have often coauthored together. Stephane Ross visited me a year ago over the summer, implementing many new algorithms and working out the first scale-free online update rule, which is now the default in Vowpal Wabbit. Stephane is not on the market—Google robbed the cradle successfully I’m sure that
4 0.73737669 174 hunch net-2006-04-27-Conferences, Workshops, and Tutorials
Introduction: This is a reminder that many deadlines for summer conference registration are coming up, and attendance is a very good idea. It’s entirely reasonable for anyone to visit a conference once, even when they don’t have a paper. For students, visiting a conference is almost a ‘must’—there is nowhere else that a broad cross-section of research is on display. Workshops are also a very good idea. ICML has 11, KDD has 9, and AAAI has 19. Workshops provide an opportunity to get a good understanding of some current area of research. They are probably the forum most conducive to starting new lines of research because they are so interactive. Tutorials are a good way to gain some understanding of a long-standing direction of research. They are generally more coherent than workshops. ICML has 7 and AAAI has 15.
5 0.70119452 330 hunch net-2008-12-07-A NIPS paper
Introduction: I’m skipping NIPS this year in favor of Ada, but I wanted to point out this paper by Andriy Mnih and Geoff Hinton. The basic claim of the paper is that by carefully but automatically constructing a binary tree over words, it’s possible to predict words well with huge computational resource savings over unstructured approaches. I’m interested in this beyond the application to word prediction because it is relevant to the general normalization problem: If you want to predict the probability of one of a large number of events, often you must compute a predicted score for all the events and then normalize, a computationally inefficient operation. The problem comes up in many places using probabilistic models, but I’ve run into it with high-dimensional regression. There are a couple workarounds for this computational bug (the tree-based one is sketched after this list):
Approximate. There are many ways. Often the approximations are uncontrolled (i.e. can be arbitrarily bad), and hence finicky in application.
Avoid. Y
6 0.69190425 157 hunch net-2006-02-18-Multiplication of Learned Probabilities is Dangerous
7 0.44004992 403 hunch net-2010-07-18-ICML & COLT 2010
8 0.39016277 160 hunch net-2006-03-02-Why do people count for learning?
9 0.38693556 8 hunch net-2005-02-01-NIPS: Online Bayes
10 0.38520014 325 hunch net-2008-11-10-ICML Reviewing Criteria
11 0.37772489 309 hunch net-2008-07-10-Interesting papers, ICML 2008
12 0.35283938 406 hunch net-2010-08-22-KDD 2010
13 0.35275376 360 hunch net-2009-06-15-In Active Learning, the question changes
14 0.34657884 259 hunch net-2007-08-19-Choice of Metrics
15 0.3465178 341 hunch net-2009-02-04-Optimal Proxy Loss for Classification
16 0.3448925 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
17 0.34089923 463 hunch net-2012-05-02-ICML: Behind the Scenes
18 0.33743909 220 hunch net-2006-11-27-Continuizing Solutions
19 0.33548862 258 hunch net-2007-08-12-Exponentiated Gradient
20 0.33360556 204 hunch net-2006-08-28-Learning Theory standards for NIPS 2006
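The “Interesting papers at UAICMOLT 2009” entry above highlights a regret definition for changing environments. As a hedged aside, one common formalization of adaptive regret (my reading of the notion, in my own notation, not a quotation from Hazan and Seshadhri’s paper; f_t are the losses, x_t the learner’s plays, K the decision set):

```latex
% Adaptive regret: the worst-case regret over any contiguous interval,
% so the algorithm must track the best fixed comparator on every window.
% (One common formalization; details may differ from the cited paper.)
\[
\mathrm{AdaptiveRegret}_T \;=\; \max_{1 \le r \le s \le T}
\left( \sum_{t=r}^{s} f_t(x_t) \;-\; \min_{x \in \mathcal{K}} \sum_{t=r}^{s} f_t(x) \right)
\]
```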
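The “Graduates and Postdocs” entry above mentions Adagrad as an improvement over online gradient descent. For reference, here is a standard statement of the widely used diagonal variant (not quoted from Duchi’s paper); g_{t,i} is the gradient on coordinate i at step t, and the learning rate \eta and stabilizer \epsilon are the usual free parameters:

```latex
% Diagonal Adagrad: each coordinate i gets its own step size, shrinking
% with the accumulated squared gradients observed on that coordinate.
\[
w_{t+1,i} \;=\; w_{t,i} \;-\; \frac{\eta}{\sqrt{\epsilon + \sum_{s=1}^{t} g_{s,i}^2}}\, g_{t,i}
\]
```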
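The “A NIPS paper” entry above describes the normalization problem: scoring every event and normalizing costs time linear in the number of events. Below is a sketch of the tree-based workaround in the spirit of (but not necessarily identical to) the Mnih–Hinton construction, in my own notation: s(w,h) is a learned score for word w given context h, s(n,h) a score for internal tree node n, and d_j(w) the left/right direction at step j on the root-to-leaf path for w.

```latex
% Flat softmax: normalizing requires scoring every word, O(|V|) per query.
\[
p(w \mid h) \;=\; \frac{\exp(s(w, h))}{\sum_{w' \in V} \exp(s(w', h))}
\]
% Tree-structured alternative: each word is a leaf of a balanced binary
% tree; its probability is a product of binary left/right decisions along
% the root-to-leaf path, each a logistic sigmoid. Only O(log |V|) scores
% are needed, and since sigma(x) + sigma(-x) = 1 the distribution is
% normalized by construction.
\[
p(w \mid h) \;=\; \prod_{j=1}^{\mathrm{depth}(w)} \sigma\bigl(d_j(w)\, s(n_j(w), h)\bigr),
\qquad d_j(w) \in \{-1, +1\}
\]
```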