hunch_net hunch_net-2011 hunch_net-2011-436 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Shravan and Alex’s LDA code is released. On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting.
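As a point of reference for what “online LDA” involves, below is a minimal sketch of a single online variational-Bayes update for LDA, the general style of streaming inference associated with online LDA. It is purely illustrative: the function name, the fixed 50-iteration E-step, and all constants are assumptions, and the code is not taken from either Shravan and Alex’s release or VW.

import numpy as np
from scipy.special import digamma

def online_lda_update(lam, word_ids, word_counts, alpha, eta, rho, D):
    """One stochastic update of the topic-word parameters lam (K x V)
    from a single document.  word_ids / word_counts are numpy arrays
    over the document's distinct words; rho is the step size and D an
    assumed corpus size.  Illustrative sketch only."""
    K, V = lam.shape
    Elogbeta = digamma(lam) - digamma(lam.sum(axis=1, keepdims=True))
    expElogbeta = np.exp(Elogbeta[:, word_ids])            # K x Nd
    gamma = np.ones(K)
    for _ in range(50):  # E-step: fit this document's topic mixture
        Elogtheta = digamma(gamma) - digamma(gamma.sum())
        phi = np.exp(Elogtheta)[:, None] * expElogbeta     # K x Nd
        phi /= phi.sum(axis=0, keepdims=True) + 1e-100
        gamma = alpha + phi @ word_counts
    # M-step: blend a noisy full-corpus estimate into lam
    lam_hat = np.full_like(lam, eta)
    lam_hat[:, word_ids] += D * phi * word_counts
    return (1 - rho) * lam + rho * lam_hat

A full implementation would add convergence checks, minibatching, and a decaying step-size schedule.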
sentIndex sentText sentNum sentScore
1 On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting. [sent-2, score-2.024]
wordName wordTfidf (topN-words)
[('lda', 0.54), ('compares', 0.337), ('shravan', 0.337), ('alex', 0.239), ('released', 0.228), ('vw', 0.228), ('surely', 0.21), ('machines', 0.196), ('currently', 0.188), ('across', 0.18), ('effectively', 0.176), ('code', 0.164), ('scale', 0.154), ('single', 0.148), ('sure', 0.148), ('ability', 0.139), ('online', 0.107), ('interesting', 0.099), ('machine', 0.049), ('many', 0.041)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 436 hunch net-2011-06-22-Ultra LDA
Introduction: Shravan and Alex’s LDA code is released. On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting.
2 0.15588062 441 hunch net-2011-08-15-Vowpal Wabbit 6.0
Introduction: I just released Vowpal Wabbit 6.0. Since the last version: VW is now 2-3 orders of magnitude faster at linear learning, primarily thanks to Alekh. Given the baseline, this is loads of fun, allowing us to easily deal with terafeature datasets, and dwarfing the scale of any other open source project. The core improvement here comes from effective parallelization over kilonode clusters (either Hadoop or not). This code is highly scalable, so it even helps with clusters of size 2 (and doesn’t hurt for clusters of size 1). The core allreduce technique appears widely and easily reusable—we’ve already used it to parallelize Conjugate Gradient, LBFGS, and two variants of online learning (an illustrative allreduce sketch appears after this list). We’ll be documenting how to do this more thoroughly, but for now “README_cluster” and associated scripts should provide a good starting point. The new LBFGS code from Miro seems to commonly dominate the existing conjugate gradient code in time/quality tradeoffs. The new matrix factoriz
3 0.1330414 281 hunch net-2007-12-21-Vowpal Wabbit Code Release
Introduction: We are releasing the Vowpal Wabbit (Fast Online Learning) code as open source under a BSD (revised) license. This is a project at Yahoo! Research to build a useful large scale learning algorithm which Lihong Li, Alex Strehl, and I have been working on. To appreciate the meaning of “large”, it’s useful to define “small” and “medium”. A “small” supervised learning problem is one where a human could use a labeled dataset and come up with a reasonable predictor. A “medium” supervised learning problem is one whose dataset fits into the RAM of a modern desktop computer. A “large” supervised learning problem is one which does not fit into the RAM of a normal machine. VW tackles large scale learning problems by this definition of large. I’m not aware of any other open source Machine Learning tools which can handle this scale (although they may exist). A few close ones are: IBM’s Parallel Machine Learning Toolbox isn’t quite open source. The approach used by this toolbox is essenti
4 0.12645389 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy
Introduction: I’ve released version 5.0 of the Vowpal Wabbit online learning software. The major number has changed since the last release because I regard all earlier versions as obsolete—there are several new algorithms & features including substantial changes and upgrades to the default learning algorithm. The biggest changes are new algorithms: Nikos and I improved the default algorithm. The basic update rule still uses gradient descent, but the size of the update is carefully controlled so that it’s impossible to overrun the label. In addition, the normalization has changed. Computationally, these changes are virtually free and yield better results, sometimes much better. Less careful updates can be reenabled with --loss_function classic, although results are still not identical to previous versions due to normalization changes. Nikos also implemented the per-feature learning rates as per these two papers (an illustrative per-feature learning rate sketch appears after this list). Often, this works better than the default algorithm. It isn’t the defa
5 0.092813797 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project
Introduction: Today brings a new release of the Vowpal Wabbit fast online learning software. This time, unlike the previous release, the project itself is going open source, developing via github. For example, the latest and greatest can be downloaded via: git clone git://github.com/JohnLangford/vowpal_wabbit.git If you aren’t familiar with git, it’s a distributed version control system which supports quick and easy branching, as well as reconciliation. This version of the code is confirmed to compile without complaint on at least some flavors of OSX as well as Linux boxes. As much of the point of this project is pushing the limits of fast and effective machine learning, let me mention a few datapoints from my experience. The program can effectively scale up to batch-style training on sparse terafeature (i.e. 10^12 sparse feature) size datasets. The limiting factor is typically I/O. I started using the real datasets from the large-scale learning workshop as a conve
6 0.085716426 473 hunch net-2012-09-29-Vowpal Wabbit, version 7.0
7 0.074368604 492 hunch net-2013-12-01-NIPS tutorials and Vowpal Wabbit 7.4
8 0.071693949 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
9 0.068991594 428 hunch net-2011-03-27-Vowpal Wabbit, v5.1
10 0.062566653 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning
11 0.060875971 267 hunch net-2007-10-17-Online as the new adjective
12 0.059773844 445 hunch net-2011-09-28-Somebody’s Eating Your Lunch
13 0.05684704 323 hunch net-2008-11-04-Rise of the Machines
14 0.056629159 339 hunch net-2009-01-27-Key Scientific Challenges
15 0.054546967 393 hunch net-2010-04-14-MLcomp: a website for objectively comparing ML algorithms
16 0.054420192 346 hunch net-2009-03-18-Parallel ML primitives
17 0.05388261 252 hunch net-2007-07-01-Watchword: Online Learning
18 0.05054082 200 hunch net-2006-08-03-AOL’s data drop
19 0.049302816 451 hunch net-2011-12-13-Vowpal Wabbit version 6.1 & the NIPS tutorial
20 0.048798192 117 hunch net-2005-10-03-Not ICML
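The Vowpal Wabbit 6.0 entry above credits its cluster scaling to an allreduce primitive. The sketch below (referenced from that entry) is a toy, single-process simulation of the allreduce semantics only: every node contributes a local vector and every node receives the elementwise average, which is enough to parallelize a gradient-based learner. The function names are made up for illustration; this is not VW’s Hadoop/spanning-tree implementation.

import numpy as np

def allreduce_average(local_vectors):
    """Return what each node would hold after an averaging allreduce."""
    avg = np.mean(local_vectors, axis=0)
    return [avg.copy() for _ in local_vectors]

def distributed_step(weights, shards, grad_fn, lr=0.1):
    """Each node computes a gradient on its data shard, the gradients are
    averaged with allreduce, and every node applies the same update to an
    identical copy of the model."""
    local_grads = [grad_fn(weights, shard) for shard in shards]
    synced = allreduce_average(local_grads)
    return weights - lr * synced[0]   # all nodes take the same step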
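The version 5.0 entry above mentions per-feature learning rates. The sketch below (referenced from that entry) shows the idea in the AdaGrad style of the cited papers: features that have accumulated larger squared gradients take smaller steps. It is an illustration of the idea only, not VW’s exact normalized update; all names are assumptions.

def adaptive_sgd_step(w, grad_sq_sum, x, err, lr=0.5, eps=1e-8):
    """One sparse linear-model update.  w and grad_sq_sum are dicts keyed
    by feature, x maps feature -> value, err is the prediction error."""
    for f, v in x.items():
        g = err * v                                   # per-feature gradient
        grad_sq_sum[f] = grad_sq_sum.get(f, 0.0) + g * g
        # step size shrinks for features with large accumulated gradients
        w[f] = w.get(f, 0.0) - lr * g / (grad_sq_sum[f] ** 0.5 + eps)
    return w, grad_sq_sum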
topicId topicWeight
[(0, 0.074), (1, 0.022), (2, -0.057), (3, -0.005), (4, 0.049), (5, 0.029), (6, -0.093), (7, -0.062), (8, -0.115), (9, 0.121), (10, -0.045), (11, 0.01), (12, 0.058), (13, -0.032), (14, 0.02), (15, -0.046), (16, 0.002), (17, -0.008), (18, 0.066), (19, -0.058), (20, 0.045), (21, 0.015), (22, -0.038), (23, -0.029), (24, 0.011), (25, -0.019), (26, -0.041), (27, 0.016), (28, -0.041), (29, 0.043), (30, 0.009), (31, 0.02), (32, 0.011), (33, -0.014), (34, 0.012), (35, 0.062), (36, -0.039), (37, -0.038), (38, 0.003), (39, 0.01), (40, 0.013), (41, 0.06), (42, -0.041), (43, -0.014), (44, 0.08), (45, 0.051), (46, -0.058), (47, -0.016), (48, -0.049), (49, -0.071)]
simIndex simValue blogId blogTitle
same-blog 1 0.97573525 436 hunch net-2011-06-22-Ultra LDA
Introduction: Shravan and Alex’s LDA code is released. On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting.
2 0.78981382 441 hunch net-2011-08-15-Vowpal Wabbit 6.0
Introduction: I just released Vowpal Wabbit 6.0. Since the last version: VW is now 2-3 orders of magnitude faster at linear learning, primarily thanks to Alekh. Given the baseline, this is loads of fun, allowing us to easily deal with terafeature datasets, and dwarfing the scale of any other open source project. The core improvement here comes from effective parallelization over kilonode clusters (either Hadoop or not). This code is highly scalable, so it even helps with clusters of size 2 (and doesn’t hurt for clusters of size 1). The core allreduce technique appears widely and easily reusable—we’ve already used it to parallelize Conjugate Gradient, LBFGS, and two variants of online learning. We’ll be documenting how to do this more thoroughly, but for now “README_cluster” and associated scripts should provide a good starting point. The new LBFGS code from Miro seems to commonly dominate the existing conjugate gradient code in time/quality tradeoffs. The new matrix factoriz
3 0.71880597 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project
Introduction: Today brings a new release of the Vowpal Wabbit fast online learning software. This time, unlike the previous release, the project itself is going open source, developing via github. For example, the latest and greatest can be downloaded via: git clone git://github.com/JohnLangford/vowpal_wabbit.git If you aren’t familiar with git, it’s a distributed version control system which supports quick and easy branching, as well as reconciliation. This version of the code is confirmed to compile without complaint on at least some flavors of OSX as well as Linux boxes. As much of the point of this project is pushing the limits of fast and effective machine learning, let me mention a few datapoints from my experience. The program can effectively scale up to batch-style training on sparse terafeature (i.e. 10^12 sparse feature) size datasets. The limiting factor is typically I/O. I started using the real datasets from the large-scale learning workshop as a conve
4 0.68610805 419 hunch net-2010-12-04-Vowpal Wabbit, version 5.0, and the second heresy
Introduction: I’ve released version 5.0 of the Vowpal Wabbit online learning software. The major number has changed since the last release because I regard all earlier versions as obsolete—there are several new algorithms & features including substantial changes and upgrades to the default learning algorithm. The biggest changes are new algorithms: Nikos and I improved the default algorithm. The basic update rule still uses gradient descent, but the size of the update is carefully controlled so that it’s impossible to overrun the label. In addition, the normalization has changed. Computationally, these changes are virtually free and yield better results, sometimes much better. Less careful updates can be reenabled with --loss_function classic, although results are still not identical to previous versions due to normalization changes. Nikos also implemented the per-feature learning rates as per these two papers. Often, this works better than the default algorithm. It isn’t the defa
5 0.60353982 281 hunch net-2007-12-21-Vowpal Wabbit Code Release
Introduction: We are releasing the Vowpal Wabbit (Fast Online Learning) code as open source under a BSD (revised) license. This is a project at Yahoo! Research to build a useful large scale learning algorithm which Lihong Li, Alex Strehl, and I have been working on. To appreciate the meaning of “large”, it’s useful to define “small” and “medium”. A “small” supervised learning problem is one where a human could use a labeled dataset and come up with a reasonable predictor. A “medium” supervised learning problem is one whose dataset fits into the RAM of a modern desktop computer. A “large” supervised learning problem is one which does not fit into the RAM of a normal machine. VW tackles large scale learning problems by this definition of large. I’m not aware of any other open source Machine Learning tools which can handle this scale (although they may exist). A few close ones are: IBM’s Parallel Machine Learning Toolbox isn’t quite open source. The approach used by this toolbox is essenti
6 0.60209787 451 hunch net-2011-12-13-Vowpal Wabbit version 6.1 & the NIPS tutorial
7 0.57963628 473 hunch net-2012-09-29-Vowpal Wabbit, version 7.0
8 0.53962004 492 hunch net-2013-12-01-NIPS tutorials and Vowpal Wabbit 7.4
9 0.51892865 381 hunch net-2009-12-07-Vowpal Wabbit version 4.0, and a NIPS heresy
10 0.50966334 428 hunch net-2011-03-27-Vowpal Wabbit, v5.1
11 0.50300723 267 hunch net-2007-10-17-Online as the new adjective
12 0.47520888 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning
13 0.4674086 346 hunch net-2009-03-18-Parallel ML primitives
14 0.45580757 490 hunch net-2013-11-09-Graduates and Postdocs
15 0.42622975 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge
16 0.39645854 471 hunch net-2012-08-24-Patterns for research in machine learning
17 0.39607331 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class
18 0.39406979 378 hunch net-2009-11-15-The Other Online Learning
19 0.38622719 252 hunch net-2007-07-01-Watchword: Online Learning
20 0.38536716 298 hunch net-2008-04-26-Eliminating the Birthday Paradox for Universal Features
topicId topicWeight
[(27, 0.143), (77, 0.657)]
simIndex simValue blogId blogTitle
1 0.93170595 405 hunch net-2010-08-21-Rob Schapire at NYC ML Meetup
Introduction: I’ve been wanting to attend the NYC ML Meetup for some time and hope to make it next week on the 25th. Rob Schapire is talking about “Playing Repeated Games”, which in my experience is far more relevant to machine learning than the title might indicate.
same-blog 2 0.89427984 436 hunch net-2011-06-22-Ultra LDA
Introduction: Shravan and Alex’s LDA code is released. On a single machine, I’m not sure how it currently compares to the online LDA in VW, but the ability to effectively scale across very many machines is surely interesting.
3 0.74018383 206 hunch net-2006-09-09-How to solve an NP hard problem in quadratic time
Introduction: This title is a lie, but it is a special lie which has a bit of truth. If n players each play each other, you have a tournament. How do you order the players from weakest to strongest? The standard first attempt is “find the ordering which agrees with the tournament on as many player pairs as possible”. This is called the “minimum feedback arcset” problem in the CS theory literature and it is a well known NP-hard problem. A basic guarantee holds for the solution to this problem: if there is some “true” intrinsic ordering, and the outcome of the tournament disagrees k times (due to noise for instance), then the output ordering will disagree with the original ordering on at most 2k edges (and no solution can be better). One standard approach to tractably solving an NP-hard problem is to find another algorithm with an approximation guarantee (a win-count ordering sketch appears after this list). For example, Don Coppersmith, Lisa Fleischer, and Atri Rudra proved that ordering players according to the number of wins is
4 0.61270577 165 hunch net-2006-03-23-The Approximation Argument
Introduction: An argument is sometimes made that the Bayesian way is the “right” way to do machine learning. This is a serious argument which deserves a serious reply. The approximation argument is a serious reply for which I have not yet seen a reply 2. The idea for the Bayesian approach is quite simple, elegant, and general. Essentially, you first specify a prior P(D) over possible processes D producing the data, observe the data, then condition on the data according to Bayes’ law to construct a posterior: P(D|x) = P(x|D)P(D)/P(x). After this, hard decisions are made (such as “turn left” or “turn right”) by choosing the one which minimizes the expected (with respect to the posterior) loss (a worked sketch of this recipe appears after this list). This basic idea is reused thousands of times with various choices of P(D) and loss functions, which is unsurprising given the many nice properties: There is an extremely strong associated guarantee: If the actual distribution generating the data is drawn from P(D) there is no better method.
5 0.57103211 269 hunch net-2007-10-24-Contextual Bandits
Introduction: One of the fundamental underpinnings of the internet is advertising-based content. This has become much more effective due to targeted advertising where ads are specifically matched to interests. Everyone is familiar with this, because everyone uses search engines and all search engines try to make money this way. The problem of matching ads to interests is a natural machine learning problem in some ways since there is much information in who clicks on what. A fundamental problem with this information is that it is not supervised—in particular a click-or-not on one ad doesn’t generally tell you if a different ad would have been clicked on. This implies we have a fundamental exploration problem. A standard mathematical setting for this situation is “k-Armed Bandits”, often with various relevant embellishments (a bandit-loop sketch appears after this list). The k-Armed Bandit setting works on a round-by-round basis. On each round: A policy chooses arm a from 1 of k arms (i.e. 1 of k ads). The world reveals t
6 0.50119311 388 hunch net-2010-01-24-Specializations of the Master Problem
7 0.48408139 375 hunch net-2009-10-26-NIPS workshops
8 0.48259664 317 hunch net-2008-09-12-How do we get weak action dependence for learning with partial observations?
9 0.4000082 392 hunch net-2010-03-26-A Variance only Deviation Bound
10 0.3566587 60 hunch net-2005-04-23-Advantages and Disadvantages of Bayesian Learning
11 0.34937197 100 hunch net-2005-08-04-Why Reinforcement Learning is Important
12 0.34158266 118 hunch net-2005-10-07-On-line learning of regular decision rules
13 0.29814944 259 hunch net-2007-08-19-Choice of Metrics
14 0.2862547 293 hunch net-2008-03-23-Interactive Machine Learning
15 0.2740064 101 hunch net-2005-08-08-Apprenticeship Reinforcement Learning for Control
16 0.27079573 235 hunch net-2007-03-03-All Models of Learning have Flaws
17 0.26636881 220 hunch net-2006-11-27-Continuizing Solutions
18 0.26552296 311 hunch net-2008-07-26-Compositional Machine Learning Algorithm Design
19 0.2638483 185 hunch net-2006-06-16-Regularization = Robustness
20 0.26109222 183 hunch net-2006-06-14-Explorations of Exploration
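The tournament-ordering entry above (entry 3) ends by mentioning ordering players by number of wins as a tractable approximation for the minimum feedback arcset problem. Below is a minimal sketch of that ordering, with an assumed boolean win matrix as input.

def order_by_wins(beats):
    """beats[i][j] is truthy iff player i beat player j in the tournament.
    Returns player indices from strongest to weakest by win count."""
    n = len(beats)
    wins = [sum(1 for j in range(n) if beats[i][j]) for i in range(n)]
    return sorted(range(n), key=lambda i: -wins[i])   # ties broken arbitrarily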
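The Bayesian entry above (entry 4) describes the standard recipe: form the posterior P(D|x) = P(x|D)P(D)/P(x), then pick the decision minimizing posterior expected loss. Below is a minimal sketch over a discrete hypothesis set; the dictionary-based inputs are hypothetical choices for illustration.

def posterior(prior, likelihood, x):
    """prior: dict D -> P(D); likelihood: dict D -> function returning P(x|D)."""
    unnorm = {D: likelihood[D](x) * p for D, p in prior.items()}
    Z = sum(unnorm.values())                  # Z is P(x)
    return {D: v / Z for D, v in unnorm.items()}

def bayes_decision(post, actions, loss):
    """Choose the action minimizing expected loss(action, D) under the posterior."""
    return min(actions, key=lambda a: sum(p * loss(a, D) for D, p in post.items()))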
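The contextual bandits entry above (entry 5) describes the k-armed bandit protocol: each round a policy picks one of k arms and only that arm’s feedback is observed. Below is a minimal sketch of the round-by-round loop, run with a simple epsilon-greedy policy (one of many possible policies; the entry does not prescribe one, and reward_fn is a hypothetical stand-in for the world).

import random

def run_bandit(k, reward_fn, rounds=1000, eps=0.1):
    """Round-by-round k-armed bandit loop with an epsilon-greedy policy."""
    counts = [0] * k
    means = [0.0] * k
    for t in range(rounds):
        if t < k or random.random() < eps:
            a = random.randrange(k)                     # explore
        else:
            a = max(range(k), key=lambda i: means[i])   # exploit
        r = reward_fn(a)          # only the chosen arm's reward is revealed
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]          # running mean estimate
    return means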