hunch_net hunch_net-2012 hunch_net-2012-471 knowledge-graph by maker-knowledge-mining

471 hunch net-2012-08-24-Patterns for research in machine learning


meta info for this blog

Source: html

Introduction: There are a handful of basic code patterns that I wish I was more aware of when I started research in machine learning. Each on its own may seem pointless, but collectively they go a long way towards making the typical research workflow more efficient. Here they are: Separate code from data. Separate input data, working data and output data. Save everything to disk frequently. Separate options from parameters. Do not use global variables. Record the options used to generate each run of the algorithm. Make it easy to sweep options. Make it easy to execute only portions of the code. Use checkpointing. Write demos and tests. Click here for discussion and examples for each item. Also see Charles Sutton’s and HackerNews’ thoughts on the same topic. My guess is that these patterns will not only be useful for machine learning, but also any other computational work that involves either a) processing large amounts of data, or b) algorithms that take a significant amount of time to execute.
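As a hedged illustration of several of the listed patterns (separating code from data, separating input, working and output data, separating options from parameters, and recording the options behind each run), here is a minimal Python sketch. The directory names, the Options dataclass, and run_experiment are hypothetical placeholders, not code from the original post.

```python
import json
import time
from dataclasses import dataclass, asdict
from pathlib import Path

# Separate input, working, and output data (directory names are illustrative).
INPUT_DIR = Path("data/input")      # raw data, never modified by experiments
WORKING_DIR = Path("data/working")  # intermediate results, safe to delete
OUTPUT_DIR = Path("data/output")    # final results, one subdirectory per run

@dataclass
class Options:
    """Options: knobs chosen by the experimenter, kept separate from
    parameters, which are the values learned by the algorithm."""
    learning_rate: float = 0.1
    num_iterations: int = 100
    seed: int = 0

def run_experiment(opts: Options) -> dict:
    """Hypothetical experiment; returns learned parameters/metrics."""
    # ... training code would go here ...
    return {"final_loss": 0.0}

def run_and_record(opts: Options) -> Path:
    """Record the options used to generate each run, and save everything to disk."""
    run_dir = OUTPUT_DIR / time.strftime("run_%Y%m%d_%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / "options.json").write_text(json.dumps(asdict(opts), indent=2))
    results = run_experiment(opts)
    (run_dir / "results.json").write_text(json.dumps(results, indent=2))
    return run_dir

if __name__ == "__main__":
    run_and_record(Options(learning_rate=0.01))
```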


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 There are a handful of basic code patterns that I wish I was more aware of when I started research in machine learning. [sent-1, score-0.928]

2 Each on its own may seem pointless, but collectively they go a long way towards making the typical research workflow more efficient. [sent-2, score-0.452]

3 Separate input data, working data and output data. [sent-4, score-0.394]

4 Record the options used to generate each run of the algorithm. [sent-8, score-0.511]

5 Make it easy to execute only portions of the code. [sent-10, score-0.419]

6 Click here for discussion and examples for each item. [sent-13, score-0.072]

7 Also see Charles Sutton’s and HackerNews’ thoughts on the same topic. [sent-14, score-0.119]

8 My guess is that these patterns will not only be useful for machine learning, but also any other computational work that involves either a) processing large amounts of data, or b) algorithms that take a significant amount of time to execute. [sent-15, score-1.17]
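The remaining patterns in the list above (sweeping options, executing only portions of the code, and checkpointing) can likewise be sketched in a few lines of Python; the stage names and helper functions below are hypothetical, not taken from the post.

```python
import argparse
import pickle
from itertools import product
from pathlib import Path

CHECKPOINT_DIR = Path("data/working/checkpoints")  # illustrative location

def checkpointed(name, compute):
    """Use checkpointing: reuse a saved result if it exists, otherwise compute and save it."""
    path = CHECKPOINT_DIR / f"{name}.pkl"
    if path.exists():
        return pickle.loads(path.read_bytes())
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(pickle.dumps(result))
    return result

def preprocess():
    return {"features": []}  # placeholder for real feature extraction

def train(data, learning_rate, num_iterations):
    return {"lr": learning_rate, "iters": num_iterations}  # placeholder for real training

def evaluate(model):
    print("evaluating", model)  # placeholder for real evaluation

if __name__ == "__main__":
    # Make it easy to execute only portions of the code: select stages from the command line.
    parser = argparse.ArgumentParser()
    parser.add_argument("--stages", nargs="+",
                        default=["preprocess", "train", "evaluate"])
    args = parser.parse_args()

    data = None
    if {"preprocess", "train"} & set(args.stages):
        data = checkpointed("preprocessed", preprocess)

    # Make it easy to sweep options: loop over a grid of option settings.
    if "train" in args.stages:
        for lr, iters in product([0.1, 0.01], [100, 1000]):
            model = checkpointed(f"model_lr{lr}_it{iters}",
                                 lambda: train(data, lr, iters))
            if "evaluate" in args.stages:
                evaluate(model)
```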


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('separate', 0.383), ('patterns', 0.36), ('options', 0.278), ('sutton', 0.18), ('code', 0.175), ('portions', 0.167), ('charles', 0.157), ('disk', 0.157), ('click', 0.157), ('generate', 0.157), ('demos', 0.15), ('trust', 0.144), ('execute', 0.144), ('processing', 0.139), ('guess', 0.139), ('save', 0.139), ('data', 0.138), ('wish', 0.128), ('amounts', 0.124), ('thoughts', 0.119), ('involves', 0.119), ('global', 0.114), ('record', 0.114), ('appreciate', 0.108), ('easy', 0.108), ('share', 0.107), ('write', 0.103), ('output', 0.098), ('aware', 0.098), ('everything', 0.096), ('input', 0.094), ('students', 0.093), ('typical', 0.088), ('started', 0.088), ('towards', 0.087), ('amount', 0.08), ('make', 0.08), ('research', 0.079), ('list', 0.078), ('use', 0.077), ('run', 0.076), ('either', 0.074), ('discussion', 0.072), ('ll', 0.071), ('computational', 0.069), ('go', 0.069), ('also', 0.066), ('making', 0.065), ('working', 0.064), ('long', 0.064)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999994 471 hunch net-2012-08-24-Patterns for research in machine learning


2 0.12261014 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

Introduction: Machine learning is often computationally bounded, which implies that the ability to write fast code becomes important if you ever want to implement a machine learning algorithm. Basic tactical optimizations are covered well elsewhere, but I haven’t seen a reasonable guide to higher level optimizations, which are the most important in my experience. Here are some of the higher level optimizations I’ve often found useful. Algorithmic Improvement First. This is Hard, but it is the most important consideration, and typically yields the most benefits. Good optimizations here are publishable. In the context of machine learning, you should be familiar with the arguments for online vs. batch learning. Choice of Language. There are many arguments about the choice of language. Sometimes you don’t have a choice when interfacing with other people. Personally, I favor C/C++ when I want to write fast code. This (admittedly) makes me a slower programmer than when using higher level languages.

3 0.10534634 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project

Introduction: Today brings a new release of the Vowpal Wabbit fast online learning software. This time, unlike the previous release, the project itself is going open source, developing via github. For example, the latest and greatest can be downloaded via: git clone git://github.com/JohnLangford/vowpal_wabbit.git If you aren’t familiar with git, it’s a distributed version control system which supports quick and easy branching, as well as reconciliation. This version of the code is confirmed to compile without complaint on at least some flavors of OSX as well as Linux boxes. As much of the point of this project is pushing the limits of fast and effective machine learning, let me mention a few datapoints from my experience. The program can effectively scale up to batch-style training on sparse terafeature (i.e. 10^12 sparse feature) size datasets. The limiting factor is typically i/o. I started using the real datasets from the large-scale learning workshop as a conve

4 0.091096401 4 hunch net-2005-01-26-Summer Schools

Introduction: There are several summer schools related to machine learning. We are running a two week machine learning summer school in Chicago, USA May 16-27. IPAM is running a more focused three week summer school on Intelligent Extraction of Information from Graphs and High Dimensional Data in Los Angeles, USA July 11-29. A broad one-week school on analysis of patterns will be held in Erice, Italy, Oct. 28-Nov 6.

5 0.088845603 266 hunch net-2007-10-15-NIPS workshops extended to 3 days

Introduction: (Unofficially, at least.) The Deep Learning Workshop is being held the afternoon before the rest of the workshops in Vancouver, BC. Separate registration is needed, and open. What’s happening fundamentally here is that there are too many interesting workshops to fit into 2 days. Perhaps we can get it officially expanded to 3 days next year.

6 0.081344366 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

7 0.079014637 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

8 0.076444209 281 hunch net-2007-12-21-Vowpal Wabbit Code Release

9 0.070460476 177 hunch net-2006-05-05-An ICML reject

10 0.067175314 346 hunch net-2009-03-18-Parallel ML primitives

11 0.066072136 143 hunch net-2005-12-27-Automated Labeling

12 0.065850526 44 hunch net-2005-03-21-Research Styles in Machine Learning

13 0.065390401 260 hunch net-2007-08-25-The Privacy Problem

14 0.064020209 49 hunch net-2005-03-30-What can Type Theory teach us about Machine Learning?

15 0.062977806 441 hunch net-2011-08-15-Vowpal Wabbit 6.0

16 0.062741064 208 hunch net-2006-09-18-What is missing for online collaborative research?

17 0.061624505 36 hunch net-2005-03-05-Funding Research

18 0.061565835 110 hunch net-2005-09-10-“Failure” is an option

19 0.061565585 22 hunch net-2005-02-18-What it means to do research.

20 0.060540061 286 hunch net-2008-01-25-Turing’s Club for Machine Learning
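The "online vs. batch learning" point in entry 2 above can be made concrete. The following numpy sketch (mine, not code from that post) contrasts one full-batch gradient step for least-squares regression, which touches all n examples at once, with an online stochastic-gradient pass that updates after every example and needs only O(d) memory.

```python
import numpy as np

def batch_gradient_step(X, y, w, lr):
    """One full-batch gradient step for least squares: uses all n examples at once."""
    grad = X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def online_pass(X, y, w, lr):
    """One online (SGD) pass: update after every example, constant memory in n."""
    for x_i, y_i in zip(X, y):
        w = w - lr * (x_i @ w - y_i) * x_i
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    true_w = rng.normal(size=5)
    y = X @ true_w + 0.01 * rng.normal(size=1000)
    w = np.zeros(5)
    print("batch :", batch_gradient_step(X, y, w, 0.5))
    print("online:", online_pass(X, y, w, 0.01))
```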


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.153), (1, 0.004), (2, -0.076), (3, 0.053), (4, -0.013), (5, -0.003), (6, -0.032), (7, -0.003), (8, -0.015), (9, 0.034), (10, -0.065), (11, -0.052), (12, 0.033), (13, -0.027), (14, -0.003), (15, -0.014), (16, 0.016), (17, 0.027), (18, -0.031), (19, -0.015), (20, 0.087), (21, -0.021), (22, 0.016), (23, -0.028), (24, -0.028), (25, -0.05), (26, -0.027), (27, -0.005), (28, -0.004), (29, 0.004), (30, -0.034), (31, 0.021), (32, 0.016), (33, 0.072), (34, 0.005), (35, -0.002), (36, 0.022), (37, 0.007), (38, 0.027), (39, -0.076), (40, -0.007), (41, 0.015), (42, -0.019), (43, 0.036), (44, 0.024), (45, 0.054), (46, 0.022), (47, 0.027), (48, -0.037), (49, -0.095)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.95182759 471 hunch net-2012-08-24-Patterns for research in machine learning


2 0.73967075 262 hunch net-2007-09-16-Optimizing Machine Learning Programs

Introduction: Machine learning is often computationally bounded, which implies that the ability to write fast code becomes important if you ever want to implement a machine learning algorithm. Basic tactical optimizations are covered well elsewhere, but I haven’t seen a reasonable guide to higher level optimizations, which are the most important in my experience. Here are some of the higher level optimizations I’ve often found useful. Algorithmic Improvement First. This is Hard, but it is the most important consideration, and typically yields the most benefits. Good optimizations here are publishable. In the context of machine learning, you should be familiar with the arguments for online vs. batch learning. Choice of Language. There are many arguments about the choice of language. Sometimes you don’t have a choice when interfacing with other people. Personally, I favor C/C++ when I want to write fast code. This (admittedly) makes me a slower programmer than when using higher level languages.

3 0.67168576 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

Introduction: Suppose you have a dataset with 2 terafeatures (we only count nonzero entries in a datamatrix), and want to learn a good linear predictor in a reasonable amount of time. How do you do it? As a learning theorist, the first thing you do is pray that this is too much data for the number of parameters—but that’s not the case, there are around 16 billion examples, 16 million parameters, and people really care about a high quality predictor, so subsampling is not a good strategy. Alekh visited us last summer, and we had a breakthrough (see here for details), coming up with the first learning algorithm I’ve seen that is provably faster than any future single machine learning algorithm. The proof of this is simple: We can output an optimal-up-to-precision linear predictor faster than the data can be streamed through the network interface of any single machine involved in the computation. It is necessary but not sufficient to have an effective communication infrastructure. It is ne

4 0.62812144 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project

Introduction: Today brings a new release of the Vowpal Wabbit fast online learning software. This time, unlike the previous release, the project itself is going open source, developing via github. For example, the latest and greatest can be downloaded via: git clone git://github.com/JohnLangford/vowpal_wabbit.git If you aren’t familiar with git, it’s a distributed version control system which supports quick and easy branching, as well as reconciliation. This version of the code is confirmed to compile without complaint on at least some flavors of OSX as well as Linux boxes. As much of the point of this project is pushing the limits of fast and effective machine learning, let me mention a few datapoints from my experience. The program can effectively scale up to batch-style training on sparse terafeature (i.e. 10^12 sparse feature) size datasets. The limiting factor is typically i/o. I started using the real datasets from the large-scale learning workshop as a conve

5 0.59233797 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms

Introduction: Muthu invited me to the workshop on algorithms in the field, with the goal of providing a sense of where near-term research should go. When the time came though, I bargained for a post instead, which provides a chance for many other people to comment. There are several things I didn’t fully understand when I went to Yahoo! about 5 years ago. I’d like to repeat them as people in academia may not yet understand them intuitively. Almost all the big impact algorithms operate in pseudo-linear or better time. Think about caching, hashing, sorting, filtering, etc… and you have a sense of what some of the most heavily used algorithms are. This matters quite a bit to Machine Learning research, because people often work with superlinear time algorithms and languages. Two very common examples of this are graphical models, where inference is often a superlinear operation—think about the n^2 dependence on the number of states in a Hidden Markov Model and Kernelized Support Vector Machines

6 0.58824748 441 hunch net-2011-08-15-Vowpal Wabbit 6.0

7 0.576352 153 hunch net-2006-02-02-Introspectionism as a Disease

8 0.57543224 337 hunch net-2009-01-21-Nearly all natural problems require nonlinearity

9 0.56256223 442 hunch net-2011-08-20-The Large Scale Learning Survey Tutorial

10 0.5614661 13 hunch net-2005-02-04-JMLG

11 0.55666292 346 hunch net-2009-03-18-Parallel ML primitives

12 0.55566144 449 hunch net-2011-11-26-Giving Thanks

13 0.548603 281 hunch net-2007-12-21-Vowpal Wabbit Code Release

14 0.54442286 148 hunch net-2006-01-13-Benchmarks for RL

15 0.5428074 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

16 0.53875583 146 hunch net-2006-01-06-MLTV

17 0.53780454 300 hunch net-2008-04-30-Concerns about the Large Scale Learning Challenge

18 0.53531194 475 hunch net-2012-10-26-ML Symposium and Strata-Hadoop World

19 0.53409714 61 hunch net-2005-04-25-Embeddings: what are they good for?

20 0.53289741 202 hunch net-2006-08-10-Precision is not accuracy
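Entry 3 above builds on the allreduce primitive. As a toy illustration only (not VW's or Hadoop's actual implementation), the sketch below sums per-node vectors up a binary tree and broadcasts the total back, so that every node ends up holding the global sum, e.g. of local gradients.

```python
import numpy as np

def tree_allreduce(local_values):
    """Toy allreduce over a list of per-node vectors (real systems communicate
    over a network; here all "nodes" live in one process for illustration)."""
    n = len(local_values)
    values = [v.copy() for v in local_values]

    # Reduce up a binary tree: pair up nodes and sum into the lower-indexed one.
    step = 1
    while step < n:
        for i in range(0, n - step, 2 * step):
            values[i] += values[i + step]
        step *= 2

    # Broadcast the total back down: every node receives the global sum.
    total = values[0]
    return [total.copy() for _ in range(n)]

if __name__ == "__main__":
    # Each "node" holds a local gradient; after allreduce every node has the sum.
    local_gradients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
    for g in tree_allreduce(local_gradients):
        print(g)  # [9. 12.] on every node
```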
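The n^2 dependence mentioned in entry 5 above shows up directly in HMM inference; the textbook forward-algorithm sketch below (not code from that post) performs an n x n matrix-vector product per time step, giving O(T n^2) time for a length-T observation sequence with n hidden states.

```python
import numpy as np

def hmm_forward(init, trans, emit, observations):
    """Forward algorithm: P(observations) for an HMM with n hidden states.
    init:  (n,)   initial state distribution
    trans: (n, n) transition probabilities, trans[i, j] = P(state j | state i)
    emit:  (n, k) emission probabilities,   emit[i, o]  = P(obs o   | state i)
    Runtime is O(T * n^2): the matrix-vector product below is the n^2 term.
    """
    alpha = init * emit[:, observations[0]]
    for obs in observations[1:]:
        # n^2 work per time step: sum over previous states for each current state.
        alpha = (alpha @ trans) * emit[:, obs]
    return alpha.sum()

if __name__ == "__main__":
    init = np.array([0.6, 0.4])
    trans = np.array([[0.7, 0.3], [0.4, 0.6]])
    emit = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(hmm_forward(init, trans, emit, [0, 1, 0]))
```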


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(14, 0.402), (27, 0.167), (53, 0.042), (55, 0.08), (94, 0.146), (95, 0.044)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.88768971 471 hunch net-2012-08-24-Patterns for research in machine learning


2 0.87308204 94 hunch net-2005-07-13-Text Entailment at AAAI

Introduction: Rajat Raina presented a paper on the technique they used for the PASCAL Recognizing Textual Entailment challenge. “Text entailment” is the problem of deciding if one sentence implies another. For example the previous sentence entails: Text entailment is a decision problem. One sentence can imply another. The challenge was of the form: given an original sentence and another sentence predict whether there was an entailment. All current techniques for predicting correctness of an entailment are at the “flail” stage—accuracies of around 58% where humans could achieve near 100% accuracy, so there is much room to improve. Apparently, there may be another PASCAL challenge on this problem in the near future.

3 0.74364978 380 hunch net-2009-11-29-AI Safety

Introduction: Dan Reeves introduced me to Michael Vassar who ran the Singularity Summit and educated me a bit on the subject of AI safety, which the Singularity Institute has small grants for. I still believe that interstellar space travel is necessary for long term civilization survival, and AI is necessary for interstellar space travel. On these grounds alone, we could judge that developing AI is much more safe than not. Nevertheless, there is a basic reasonable fear, as expressed by some commenters, that AI could go bad. A basic scenario starts with someone inventing an AI and telling it to make as much money as possible. The AI promptly starts trading in various markets to make money. To improve, it crafts a virus that takes over most of the world’s computers using it as a surveillance network so that it can always make the right decision. The AI also branches out into any form of distance work, taking over the entire outsourcing process for all jobs that are entirely di

4 0.71416909 430 hunch net-2011-04-11-The Heritage Health Prize

Introduction: The Heritage Health Prize is potentially the largest prediction prize yet at $3M, which is sure to get many people interested. Several elements of the competition may be worth discussing. The most straightforward way for HPN to deploy this predictor is in determining who to cover with insurance. This might easily cover the costs of running the contest itself, but the value to the health system as a whole is minimal, as people not covered still exist. While HPN itself is a provider network, they have active relationships with a number of insurance companies, and the right to resell any entrant. It’s worth keeping in mind that the research and development may nevertheless end up being useful in the longer term, especially as entrants also keep the right to their code. The judging metric is something I haven’t seen previously. If a patient has probability 0.5 of being in the hospital 0 days and probability 0.5 of being in the hospital ~53.6 days, the optimal prediction in e

5 0.71384764 407 hunch net-2010-08-23-Boosted Decision Trees for Deep Learning

Introduction: About 4 years ago, I speculated that decision trees qualify as a deep learning algorithm because they can make decisions which are substantially nonlinear in the input representation. Ping Li has proved this correct, empirically at UAI by showing that boosted decision trees can beat deep belief networks on versions of Mnist which are artificially hardened so as to make them solvable only by deep learning algorithms. This is an important point, because the ability to solve these sorts of problems is probably the best objective definition of a deep learning algorithm we have. I’m not that surprised. In my experience, if you can accept the computational drawbacks of a boosted decision tree, they can achieve pretty good performance. Geoff Hinton once told me that the great thing about deep belief networks is that they work. I understand that Ping had very substantial difficulty in getting this published, so I hope some reviewers step up to the standard of valuing wha

6 0.50950783 221 hunch net-2006-12-04-Structural Problems in NIPS Decision Making

7 0.50149882 286 hunch net-2008-01-25-Turing’s Club for Machine Learning

8 0.50033146 276 hunch net-2007-12-10-Learning Track of International Planning Competition

9 0.49845222 229 hunch net-2007-01-26-Parallel Machine Learning Problems

10 0.49649751 136 hunch net-2005-12-07-Is the Google way the way for machine learning?

11 0.4920243 120 hunch net-2005-10-10-Predictive Search is Coming

12 0.48537907 450 hunch net-2011-12-02-Hadoop AllReduce and Terascale Learning

13 0.48426634 95 hunch net-2005-07-14-What Learning Theory might do

14 0.48415914 132 hunch net-2005-11-26-The Design of an Optimal Research Environment

15 0.48365778 96 hunch net-2005-07-21-Six Months

16 0.48310927 371 hunch net-2009-09-21-Netflix finishes (and starts)

17 0.48116419 281 hunch net-2007-12-21-Vowpal Wabbit Code Release

18 0.48099259 237 hunch net-2007-04-02-Contextual Scaling

19 0.48082677 423 hunch net-2011-02-02-User preferences for search engines

20 0.48030812 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
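For entry 4 above, the truncated arithmetic can be reconstructed under an assumption: to my understanding the Heritage Health Prize judged predictions by root mean squared error in log(1 + days), and under that loss the single best prediction is exp(E[log(1 + days)]) - 1 rather than the expected stay. A quick check (mine, not from the post):

```python
import math

# Patient is in hospital 0 days or ~53.6 days, each with probability 0.5.
outcomes = [0.0, 53.6]
probs = [0.5, 0.5]

# Assumed metric: squared error in log(1 + days). The minimizer of the expected
# loss is exp(E[log(1 + days)]) - 1, not the expected number of days.
expected_log = sum(p * math.log(1.0 + d) for p, d in zip(probs, outcomes))
optimal_prediction = math.exp(expected_log) - 1.0

print(optimal_prediction)                           # about 6.4 days
print(sum(p * d for p, d in zip(probs, outcomes)))  # expected stay: 26.8 days
```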