fast_ml fast_ml-2014 fast_ml-2014-50 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: A while ago we showed how to get predictions from a Pylearn2 model. It is a little tricky, partly because of splitting the data into batches. If you’re able to fit your data in memory, you can strip out the batch-handling code and it becomes easier to see what’s going on. We exercise the concept to distinguish cats from dogs again, with superior results. Step by step: You have a pickled model from Pylearn2. Let’s load it: from pylearn2.utils import serial; model_path = 'model.pkl'; model = serial.load( model_path ). Next, some Theano weirdness. Theano is a compiler for symbolic expressions, and it is these expressions we deal with when predicting. We need to define expressions for X and Y: X = model.get_input_space().make_theano_batch() and Y = model.fprop( X ). Mind you, these are not variables, but rather descriptions of how to get variables. Y is easy to understand: just feed the data to the model and forward-propagate. X is more of an idiom; the incantations above make sure…
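Putting the steps above together, here is a minimal end-to-end sketch, not the post's exact script. It assumes the whole test set fits in memory as a float32 numpy array with one example per row, stored in a hypothetical file x_test.npy, and that the model is a standard Pylearn2 MLP saved as model.pkl; convolutional models expect their input space's batch format rather than a flat matrix.

    import numpy as np
    import theano.tensor as T
    from theano import function
    from pylearn2.utils import serial

    # load the pickled Pylearn2 model
    model = serial.load( 'model.pkl' )

    # symbolic expressions: X describes a batch of inputs, Y the forward pass
    X = model.get_input_space().make_theano_batch()
    Y = model.fprop( X )

    # for classification, take the index of the most probable class
    Y = T.argmax( Y, axis = 1 )

    # compile the expression graph into a callable object
    f = function( [X], Y )

    # x_test.npy is a hypothetical file holding the test examples, one per row
    x_test = np.load( 'x_test.npy' ).astype( np.float32 )
    y = f( x_test )

    np.savetxt( 'predictions.csv', y, fmt = '%d' )

If you want probabilities rather than class labels, skip the argmax line and write y out with a float format.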
sentIndex sentText sentNum sentScore
1 A while ago we showed how to get predictions from a Pylearn2 model. [sent-1, score-0.238]
2 It is a little tricky, partly because of splitting data into batches. [sent-2, score-0.188]
3 If you’re able to fit your data in memory, you can strip the batch handling code and it becomes easier to see what’s going on. [sent-3, score-0.366]
4 We exercise the concept to distinguish cats from dogs again, with superior results. [sent-4, score-0.557]
5 Step by step You have a pickled model from Pylearn2. [sent-5, score-0.422]
6 Theano is a compiler for symbolic expressions, and it is these expressions we deal with when predicting. [sent-10, score-0.772]
7 We need to define expressions for X and Y: X = model.get_input_space().make_theano_batch() and Y = model.fprop( X ). [sent-11, score-0.575]
8 Y is easy to understand: just feed the data to the model and forward-propagate. [sent-15, score-0.099]
9 In case of classification, you need to throw in argmax: Y = T.argmax( Y, axis = 1 ). [sent-17, score-0.214]
10 The next step is to define a link between X and Y. [sent-18, score-0.684]
11 This link is a function that takes X and returns Y. [sent-19, score-0.476]
12 theano.function: the interface for compiling graphs into callable objects. [sent-21, score-0.437]
13 We compile it: f = theano.function( [X], Y ). The final step is what you’d expect: you provide some data to the function and it returns predictions: y = f( x_test ). Both x_test and y are numpy arrays. [sent-23, score-0.521]
14 (Image credit: @CuteEmergency.) A practical example: In the previous article we learned how to get 88% accuracy with a pre-trained network. [sent-27, score-0.079]
15 Kyle has been publishing his code for the contest all along. [sent-29, score-0.107]
16 The general idea is to use a pre-trained network not to classify, but to extract features from images. [sent-31, score-0.074]
17 Then you train a custom classifier, here a biggish perceptron with two hidden layers, rectified linear units and dropout. [sent-34, score-0.085]
18 The resulting network is able to distinguish between classes much better than the original model because it’s trained specifically for the task at hand. [sent-35, score-0.48]
19 Now it’s available, but still - here’s our simpler version, without batches. [sent-37, score-0.173]
20 UPDATE: Ian Goodfellow merged the slightly more polished script into Pylearn2, as scripts/mlp/predict_csv. [sent-38, score-0.214]
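To make the practical example described above concrete, here is a rough sketch of the feature-extraction-plus-custom-classifier idea only, not the post's actual decaf + Pylearn2 setup. We assume the features produced by a pre-trained network have already been saved as numpy arrays (the file names are made up), and we substitute scikit-learn's MLPClassifier for the post's Pylearn2 perceptron: two rectified-linear hidden layers, but without dropout.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # features extracted from images by a pre-trained network (hypothetical files)
    x_train = np.load( 'train_features.npy' )
    y_train = np.load( 'train_labels.npy' )
    x_test = np.load( 'test_features.npy' )

    # stand-in for the "biggish perceptron with two hidden layers and
    # rectified linear units"; scikit-learn's MLP offers no dropout
    clf = MLPClassifier( hidden_layer_sizes = ( 512, 512 ), activation = 'relu' )
    clf.fit( x_train, y_train )

    # probability of the positive class, e.g. "dog"
    p = clf.predict_proba( x_test )[:, 1]

The point of the substitution is only to show the shape of the pipeline: features in, custom classifier trained on them, predictions out.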
wordName wordTfidf (topN-words)
[('expressions', 0.386), ('step', 0.216), ('distinguish', 0.214), ('kyle', 0.214), ('define', 0.189), ('returns', 0.189), ('link', 0.171), ('theano', 0.157), ('function', 0.116), ('next', 0.108), ('tricky', 0.107), ('polished', 0.107), ('throw', 0.107), ('publishing', 0.107), ('ordinary', 0.107), ('merged', 0.107), ('pickled', 0.107), ('goodfellow', 0.107), ('interface', 0.107), ('argmax', 0.107), ('callable', 0.107), ('kastner', 0.107), ('objects', 0.107), ('road', 0.107), ('transforms', 0.107), ('model', 0.099), ('concept', 0.094), ('handling', 0.094), ('partly', 0.094), ('splitting', 0.094), ('float', 0.094), ('classify', 0.094), ('ian', 0.094), ('initially', 0.094), ('batch', 0.094), ('batches', 0.094), ('decaf', 0.094), ('able', 0.093), ('exercise', 0.085), ('easier', 0.085), ('ago', 0.085), ('rectified', 0.085), ('dogs', 0.085), ('shown', 0.079), ('understand', 0.079), ('practical', 0.079), ('simpler', 0.079), ('cats', 0.079), ('predictions', 0.074), ('network', 0.074)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 50 fast ml-2014-01-20-How to get predictions from Pylearn2
2 0.13355611 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet
Introduction: Object recognition in images is where deep learning, and specifically convolutional neural networks, are often applied and benchmarked these days. To get a piece of the action, we’ll be using Alex Krizhevsky’s cuda-convnet , a shining diamond of machine learning software, in a Kaggle competition. Continuing to run things on a GPU, we turn to applying convolutional neural networks for object recognition. This kind of network was developed by Yann LeCun and it’s powerful, but a bit complicated: Image credit: EBLearn tutorial A typical convolutional network has two parts. The first is responsible for feature extraction and consists of one or more pairs of convolution and subsampling/max-pooling layers, as you can see above. The second part is just a classic fully-connected multilayer perceptron taking extracted features as input. For a detailed explanation of all this see unit 9 in Hugo LaRochelle’s neural networks course . Daniel Nouri has an interesting story about
3 0.11671974 52 fast ml-2014-02-02-Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition
Introduction: Out of 215 contestants, we placed 8th in the Cats and Dogs competition at Kaggle. The top ten finish gave us the master badge. The competition was about discerning the animals in images and here’s how we did it. We extracted the features using pre-trained deep convolutional networks, specifically decaf and OverFeat . Then we trained some classifiers on these features. The whole thing was inspired by Kyle Kastner’s decaf + pylearn2 combo and we expanded this idea. The classifiers were linear models from scikit-learn and a neural network from Pylearn2 . At the end we created a voting ensemble of the individual models. OverFeat features We touched on OverFeat in Classifying images with a pre-trained deep network . A better way to use it in this competition’s context is to extract the features from the layer before the classifier, as Pierre Sermanet suggested in the comments. Concretely, in the larger OverFeat model ( -l ) layer 24 is the softmax, at least in the
4 0.09764991 49 fast ml-2014-01-10-Classifying images with a pre-trained deep network
Introduction: Recently at least two research teams made their pre-trained deep convolutional networks available, so you can classify your images right away. We’ll see how to go about it, with data from the Cats & Dogs competition at Kaggle as an example. We’ll be using OverFeat , a classifier and feature extractor from the New York guys lead by Yann LeCun and Rob Fergus. The principal author, Pierre Sermanet, is currently first on the Dogs vs. Cats leaderboard . The other available implementation we know of comes from Berkeley. It’s called Caffe and is a successor to decaf . Yangqing Jia , the main author of these, is also near the top of the leaderboard. Both networks were trained on ImageNet , which is an image database organized according to the WordNet hierarchy . It was the ImageNet Large Scale Visual Recognition Challenge 2012 in which Alex Krizhevsky crushed the competition with his network. His error was 16%, the second best - 26%. Data The Kaggle competition featur
5 0.090008825 27 fast ml-2013-05-01-Deep learning made easy
Introduction: As usual, there’s an interesting competition at Kaggle: The Black Box. It’s connected to ICML 2013 Workshop on Challenges in Representation Learning, held by the deep learning guys from Montreal. There are a couple benchmarks for this competition and the best one is unusually hard to beat 1 - only less than a fourth of those taking part managed to do so. We’re among them. Here’s how. The key ingredient in our success is a recently developed secret Stanford technology for deep unsupervised learning: sparse filtering by Jiquan Ngiam et al. Actually, it’s not secret. It’s available at Github , and has one or two very appealling properties. Let us explain. The main idea of deep unsupervised learning, as we understand it, is feature extraction. One of the most common applications is in multimedia. The reason for that is that multimedia tasks, for example object recognition, are easy for humans, but difficult for computers 2 . Geoff Hinton from Toronto talks about two ends
6 0.087247275 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow
7 0.081245072 43 fast ml-2013-11-02-Maxing out the digits
8 0.075809844 32 fast ml-2013-07-05-Processing large files, line by line
9 0.070688225 31 fast ml-2013-06-19-Go non-linear with Vowpal Wabbit
10 0.069335684 25 fast ml-2013-04-10-Gender discrimination
11 0.066682793 29 fast ml-2013-05-25-More on sparse filtering and the Black Box competition
12 0.065467201 40 fast ml-2013-10-06-Pylearn2 in practice
13 0.063952386 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python
14 0.063815914 47 fast ml-2013-12-15-A-B testing with bayesian bandits in Google Analytics
15 0.060931906 18 fast ml-2013-01-17-A very fast denoising autoencoder
16 0.058921184 53 fast ml-2014-02-20-Are stocks predictable?
17 0.056386679 55 fast ml-2014-03-20-Good representations, distance, metric learning and supervised dimensionality reduction
18 0.05425816 57 fast ml-2014-04-01-Exclusive Geoff Hinton interview
19 0.054227456 19 fast ml-2013-02-07-The secret of the big guys
20 0.053472091 20 fast ml-2013-02-18-Predicting advertised salaries
topicId topicWeight
[(0, 0.231), (1, 0.012), (2, 0.104), (3, 0.069), (4, -0.108), (5, -0.239), (6, -0.149), (7, 0.029), (8, -0.041), (9, 0.036), (10, -0.039), (11, 0.102), (12, -0.112), (13, -0.279), (14, -0.195), (15, -0.012), (16, -0.086), (17, -0.166), (18, 0.001), (19, 0.177), (20, -0.055), (21, 0.192), (22, 0.143), (23, 0.173), (24, 0.098), (25, -0.024), (26, -0.022), (27, -0.064), (28, 0.11), (29, 0.229), (30, 0.089), (31, -0.102), (32, -0.153), (33, 0.008), (34, -0.334), (35, -0.178), (36, 0.071), (37, 0.135), (38, 0.136), (39, -0.002), (40, -0.123), (41, 0.008), (42, -0.055), (43, 0.209), (44, 0.225), (45, 0.235), (46, 0.165), (47, 0.113), (48, -0.076), (49, -0.073)]
simIndex simValue blogId blogTitle
same-blog 1 0.98369664 50 fast ml-2014-01-20-How to get predictions from Pylearn2
2 0.21059939 52 fast ml-2014-02-02-Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition
3 0.19303118 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow
Introduction: This time we enter the Stack Overflow challenge , which is about predicting a status of a given question on SO. There are five possible statuses, so it’s a multi-class classification problem. We would prefer a tool able to perform multiclass classification by itself. It can be done by hand by constructing five datasets, each with binary labels (one class against all others), and then combining predictions, but it might be a bit tricky to get right - we tried. Fortunately, nice people at Yahoo, excuse us, Microsoft, recently relased a new version of Vowpal Wabbit , and this new version supports multiclass classification. In case you’re wondering, Vowpal Wabbit is a fast linear learner. We like the “fast” part and “linear” is OK for dealing with lots of words, as in this contest. In any case, with more than three million data points it wouldn’t be that easy to train a kernel SVM, a neural net or what have you. VW, being a well-polished tool, has a few very convenient features.
4 0.18794492 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet
5 0.17242321 49 fast ml-2014-01-10-Classifying images with a pre-trained deep network
6 0.16687571 27 fast ml-2013-05-01-Deep learning made easy
7 0.14590161 32 fast ml-2013-07-05-Processing large files, line by line
8 0.13501777 25 fast ml-2013-04-10-Gender discrimination
9 0.12706885 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python
10 0.12589972 29 fast ml-2013-05-25-More on sparse filtering and the Black Box competition
11 0.12312931 31 fast ml-2013-06-19-Go non-linear with Vowpal Wabbit
12 0.12306476 19 fast ml-2013-02-07-The secret of the big guys
13 0.11582886 40 fast ml-2013-10-06-Pylearn2 in practice
14 0.11529656 43 fast ml-2013-11-02-Maxing out the digits
15 0.11389823 30 fast ml-2013-06-01-Amazon aspires to automate access control
16 0.10948326 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint
17 0.10811391 46 fast ml-2013-12-07-13 NIPS papers that caught our eye
18 0.10695048 18 fast ml-2013-01-17-A very fast denoising autoencoder
19 0.10418788 36 fast ml-2013-08-23-A bag of words and a nice little network
20 0.099897668 38 fast ml-2013-09-09-Predicting solar energy from weather forecasts plus a NetCDF4 tutorial
topicId topicWeight
[(26, 0.045), (31, 0.056), (35, 0.08), (48, 0.566), (69, 0.101), (71, 0.024), (99, 0.039)]
simIndex simValue blogId blogTitle
same-blog 1 0.91866976 50 fast ml-2014-01-20-How to get predictions from Pylearn2
2 0.26336354 49 fast ml-2014-01-10-Classifying images with a pre-trained deep network
3 0.26121402 52 fast ml-2014-02-02-Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition
4 0.25537318 40 fast ml-2013-10-06-Pylearn2 in practice
Introduction: What do you get when you mix one part brilliant and one part daft? You get Pylearn2, a cutting edge neural networks library from Montreal that’s rather hard to use. Here we’ll show how to get through the daft part with your mental health relatively intact. Pylearn2 comes from the Lisa Lab in Montreal , led by Yoshua Bengio. Those are pretty smart guys and they concern themselves with deep learning. Recently they published a paper entitled Pylearn2: a machine learning research library [arxiv] . Here’s a quote: Pylearn2 is a machine learning research library - its users are researchers . This means (…) it is acceptable to assume that the user has some technical sophistication and knowledge of machine learning. The word research is possibly the most common word in the paper. There’s a reason for that: the library is certainly not production-ready. OK, it’s not that bad. There are only two difficult things: getting your data in getting predictions out What’
5 0.25028831 9 fast ml-2012-10-25-So you want to work for Facebook
Introduction: Good news, everyone! There’s a new contest on Kaggle - Facebook is looking for talent . They won’t pay, but just might interview. This post is in a way a bonus for active readers because most visitors of fastml.com originally come from Kaggle forums. For this competition the forums are disabled to encourage own work . To honor this, we won’t publish any code. But own work doesn’t mean original work , and we wouldn’t want to reinvent the wheel, would we? The contest differs substantially from a Kaggle stereotype, if there is such a thing, in three major ways: there’s no money prizes, as mentioned above it’s not a real world problem, but rather an assignment to screen job candidates (this has important consequences, described below) it’s not a typical machine learning project, but rather a broader AI exercise You are given a graph of the internet, actually a snapshot of the graph for each of 15 time steps. You are also given a bunch of paths in this graph, which a
6 0.23517039 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet
7 0.23058571 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python
8 0.22696555 55 fast ml-2014-03-20-Good representations, distance, metric learning and supervised dimensionality reduction
9 0.22038354 43 fast ml-2013-11-02-Maxing out the digits
10 0.21800725 27 fast ml-2013-05-01-Deep learning made easy
11 0.21780178 60 fast ml-2014-04-30-Converting categorical data into numbers with Pandas and Scikit-learn
12 0.21386285 20 fast ml-2013-02-18-Predicting advertised salaries
13 0.21004476 48 fast ml-2013-12-28-Regularizing neural networks with dropout and with DropConnect
14 0.20554729 36 fast ml-2013-08-23-A bag of words and a nice little network
15 0.20548612 18 fast ml-2013-01-17-A very fast denoising autoencoder
16 0.204705 31 fast ml-2013-06-19-Go non-linear with Vowpal Wabbit
17 0.20320536 47 fast ml-2013-12-15-A-B testing with bayesian bandits in Google Analytics
18 0.20226315 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow
19 0.20072186 10 fast ml-2012-11-17-The Facebook challenge HOWTO
20 0.19862667 13 fast ml-2012-12-27-Spearmint with a random forest