40 fast ml-2013-10-06-Pylearn2 in practice

meta infos for this blog

Source: html

Introduction: What do you get when you mix one part brilliant and one part daft? You get Pylearn2, a cutting-edge neural networks library from Montreal that’s rather hard to use. Here we’ll show how to get through the daft part with your mental health relatively intact. Pylearn2 comes from the Lisa Lab in Montreal, led by Yoshua Bengio. Those are pretty smart guys and they concern themselves with deep learning. Recently they published a paper entitled Pylearn2: a machine learning research library [arxiv]. Here’s a quote: “Pylearn2 is a machine learning research library - its users are researchers. This means (…) it is acceptable to assume that the user has some technical sophistication and knowledge of machine learning.” The word research is possibly the most common word in the paper. There’s a reason for that: the library is certainly not production-ready. OK, it’s not that bad. There are only two difficult things: getting your data in, and getting predictions out. What’

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 What do you get when you mix one part brilliant and one part daft? [sent-1, score-0.346]

2 You get Pylearn2, a cutting edge neural networks library from Montreal that’s rather hard to use. [sent-2, score-0.325]

3 Here we’ll show how to get through the daft part with your mental health relatively intact. [sent-3, score-0.301]

4 Recently they published a paper entitled Pylearn2: a machine learning research library [arxiv]. [sent-6, score-0.433]

5 Here’s a quote: Pylearn2 is a machine learning research library - its users are researchers. [sent-7, score-0.365]

6 The word research is possibly the most common word in the paper. [sent-9, score-0.27]

7 There are only two difficult things: getting your data in, and getting predictions out. What’s attractive about Pylearn2 then? [sent-12, score-0.57]

8 We found the softmax regression tutorial helpful for getting started. [sent-17, score-0.615]

9 Getting your data in To get your data in, you need to write a Python wrapper class for your dataset. [sent-20, score-0.318]

10 Good news: we provide a wrapper for the adult dataset. [sent-21, score-0.335]

11 This wrapper is pretty much ready to be used with other binary classification sets stored as CSV. [sent-22, score-0.42]

12 The wrapper is mainly responsible for loading data. [sent-24, score-0.308]

13 Things like data location and names of training, validation and test sets we prefer to put in the YAML config file. [sent-25, score-0.282]

14 We think it makes more sense to enter a test set path on command line. [sent-27, score-0.269]

15 The details are pretty well described in the softmax regression tutorial. [sent-36, score-0.475]

16 AdultDataset part refers to a Python file and a Python class in that file. [sent-40, score-0.338]

17 , with_labels: 0 } The test set we use has labels; if you’d like to predict on a test set without labels, add with_labels: 0 in the dataset parameters. [sent-51, score-0.272]

18 yaml During training the library will output a bunch of diagnostics for each set, each epoch: epochs seen: 5 time trained: 35. [sent-54, score-0.495]

19 However, actually getting those predictions was not paramount in the developers’ minds - there’s no single predict script, only a couple of hacks. [sent-75, score-0.355]

20 txt First goes a path to a trained model, then a path to a test file, then where you want predictions. [sent-81, score-0.453]
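
The data-loading step that sentences 9 to 12 describe can be illustrated with a minimal sketch. It is not the post's actual wrapper and it does not use Pylearn2 itself: it only shows, in plain Python, how a binary-classification CSV might be read into a feature matrix and one-hot labels before being handed to a dataset class. The function name load_binary_csv, the label_column parameter, and the assumption that the label is a 0/1 value in the first column are all hypothetical; with_labels mirrors the option named in sentence 17.

```python
import csv
import io


def load_binary_csv(handle, with_labels=True, label_column=0):
    """Read numeric CSV rows into a feature list and one-hot labels.

    Hypothetical helper: the column layout (0/1 label in `label_column`)
    is an assumption, not something the original post specifies here.
    """
    features, labels = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        if with_labels:
            label = int(values.pop(label_column))
            # One-hot encode a 0/1 label: 0 -> [1.0, 0.0], 1 -> [0.0, 1.0]
            one_hot = [0.0, 0.0]
            one_hot[label] = 1.0
            labels.append(one_hot)
        features.append(values)
    return features, (labels if with_labels else None)


# Usage with an in-memory file standing in for a real CSV path.
sample = io.StringIO("1,0.5,2.0\n0,1.5,3.0\n")
X, y = load_binary_csv(sample)
print(X)
print(y)
```

Passing with_labels=False returns the features and None for the labels, which corresponds to predicting on a test set that has no label column.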

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('pythonpath', 0.338), ('library', 0.257), ('yaml', 0.238), ('wrapper', 0.216), ('getting', 0.215), ('path', 0.184), ('softmax', 0.179), ('daft', 0.162), ('predictions', 0.14), ('part', 0.139), ('obj', 0.135), ('regression', 0.129), ('adult', 0.119), ('dir', 0.108), ('montreal', 0.108), ('research', 0.108), ('class', 0.102), ('dataset', 0.102), ('encoding', 0.099), ('file', 0.097), ('labels', 0.092), ('tutorial', 0.092), ('loading', 0.092), ('test', 0.085), ('python', 0.081), ('word', 0.081), ('pretty', 0.075), ('classes', 0.072), ('needs', 0.068), ('assume', 0.068), ('native', 0.068), ('specifies', 0.068), ('brilliant', 0.068), ('config', 0.068), ('cutting', 0.068), ('configuration', 0.068), ('xrange', 0.068), ('suit', 0.068), ('choosing', 0.068), ('export', 0.068), ('lisa', 0.068), ('dtype', 0.068), ('location', 0.068), ('entitled', 0.068), ('epoch', 0.068), ('merged', 0.068), ('obviously', 0.068), ('stored', 0.068), ('label', 0.065), ('sets', 0.061)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000002 40 fast ml-2013-10-06-Pylearn2 in practice

2 0.18208717 43 fast ml-2013-11-02-Maxing out the digits

Introduction: Recently we’ve been investigating the basics of Pylearn2. Now it’s time for a more advanced example: a multilayer perceptron with dropout and maxout activation for the MNIST digits. Maxout explained If you’ve been following developments in deep learning, you know that Hinton’s most recent recommendation for supervised learning, after a few years of bashing backpropagation in favour of unsupervised pretraining, is to use classic multilayer perceptrons with dropout and rectified linear units. For us, this breath of simplicity is a welcome change. Rectified linear is f(x) = max(0, x). This makes backpropagation trivial: for x > 0, the derivative is one, else zero. Note that ReLU consists of two linear functions. But why stop at two? Let’s take max. out of three, or four, or five linear functions… And so maxout is a generalization of ReLU. It can approximate any convex function. Now backpropagation is easy and dropout prevents overfitting, so we can train a deep

3 0.15235405 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

Introduction: This time we enter the Stack Overflow challenge, which is about predicting a status of a given question on SO. There are five possible statuses, so it’s a multi-class classification problem. We would prefer a tool able to perform multiclass classification by itself. It can be done by hand by constructing five datasets, each with binary labels (one class against all others), and then combining predictions, but it might be a bit tricky to get right - we tried. Fortunately, nice people at Yahoo, excuse us, Microsoft, recently released a new version of Vowpal Wabbit, and this new version supports multiclass classification. In case you’re wondering, Vowpal Wabbit is a fast linear learner. We like the “fast” part and “linear” is OK for dealing with lots of words, as in this contest. In any case, with more than three million data points it wouldn’t be that easy to train a kernel SVM, a neural net or what have you. VW, being a well-polished tool, has a few very convenient features.

4 0.14077786 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python

Introduction: We have already written a few articles about Pylearn2. Today we’ll look at PyBrain. It is another Python neural networks library, and this is where similarities end. They’re like day and night: Pylearn2 - Byzantinely complicated, PyBrain - simple. We attempted to train a regression model and succeeded at first take (more on this below). Try this with Pylearn2. While there are a few machine learning libraries out there, PyBrain aims to be a very easy-to-use modular library that can be used by entry-level students but still offers the flexibility and algorithms for state-of-the-art research. The library features classic perceptron as well as recurrent neural networks and other things, some of which, for example Evolino, would be hard to find elsewhere. On the downside, PyBrain feels unfinished, abandoned. It is no longer actively developed and the documentation is skimpy. There are no modern gimmicks like dropout and rectified linear units - just good ol’ sigmoid and ta

5 0.12270396 27 fast ml-2013-05-01-Deep learning made easy

Introduction: As usual, there’s an interesting competition at Kaggle: The Black Box. It’s connected to ICML 2013 Workshop on Challenges in Representation Learning, held by the deep learning guys from Montreal. There are a couple of benchmarks for this competition and the best one is unusually hard to beat - fewer than a fourth of those taking part managed to do so. We’re among them. Here’s how. The key ingredient in our success is a recently developed secret Stanford technology for deep unsupervised learning: sparse filtering by Jiquan Ngiam et al. Actually, it’s not secret. It’s available at Github, and has one or two very appealing properties. Let us explain. The main idea of deep unsupervised learning, as we understand it, is feature extraction. One of the most common applications is in multimedia. The reason for that is that multimedia tasks, for example object recognition, are easy for humans, but difficult for computers. Geoff Hinton from Toronto talks about two ends

6 0.12068342 26 fast ml-2013-04-17-Regression as classification

7 0.11722906 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint

8 0.10195354 34 fast ml-2013-07-14-Running things on a GPU

9 0.094692826 33 fast ml-2013-07-09-Introducing phraug

10 0.093630932 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

11 0.091340959 20 fast ml-2013-02-18-Predicting advertised salaries

12 0.08619158 10 fast ml-2012-11-17-The Facebook challenge HOWTO

13 0.085884161 25 fast ml-2013-04-10-Gender discrimination

14 0.085027091 62 fast ml-2014-05-26-Yann LeCun's answers from the Reddit AMA

15 0.082436576 35 fast ml-2013-08-12-Accelerometer Biometric Competition

16 0.079984538 32 fast ml-2013-07-05-Processing large files, line by line

17 0.079884879 22 fast ml-2013-03-07-Choosing a machine learning algorithm

18 0.077678643 46 fast ml-2013-12-07-13 NIPS papers that caught our eye

19 0.076775931 30 fast ml-2013-06-01-Amazon aspires to automate access control

20 0.071886733 38 fast ml-2013-09-09-Predicting solar energy from weather forecasts plus a NetCDF4 tutorial

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.337), (1, -0.01), (2, 0.057), (3, 0.023), (4, -0.168), (5, 0.046), (6, 0.011), (7, 0.155), (8, -0.286), (9, 0.209), (10, -0.144), (11, 0.164), (12, -0.065), (13, 0.18), (14, -0.005), (15, -0.074), (16, 0.081), (17, -0.001), (18, 0.085), (19, -0.052), (20, -0.217), (21, 0.049), (22, 0.033), (23, -0.007), (24, -0.236), (25, 0.031), (26, -0.052), (27, 0.186), (28, 0.016), (29, 0.077), (30, -0.044), (31, 0.146), (32, 0.014), (33, -0.019), (34, 0.065), (35, 0.027), (36, -0.073), (37, 0.165), (38, -0.033), (39, -0.091), (40, -0.106), (41, 0.019), (42, 0.16), (43, 0.182), (44, -0.216), (45, -0.046), (46, -0.021), (47, 0.287), (48, 0.202), (49, -0.156)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.97731835 40 fast ml-2013-10-06-Pylearn2 in practice

2 0.35345569 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint

Introduction: The promise What’s attractive in machine learning? That a machine is learning, instead of a human. But an operator still has a lot of work to do. First, he has to learn how to teach a machine, in general. Then, when it comes to a concrete task, there are two main areas where a human needs to do the work (and remember, laziness is a virtue, at least for a programmer, so we’d like to minimize the amount of work done by a human): data preparation and model tuning. This story is about model tuning. Typically, to achieve satisfactory results, first we need to convert raw data into a format accepted by the model we would like to use, and then tune a few hyperparameters of the model. For example, some hyperparams to tune for a random forest may be a number of trees to grow and a number of candidate features at each split ( mtry in R randomForest). For a neural network, there are quite a lot of hyperparams: number of layers, number of neurons in each layer (specifically, in each hid

3 0.32331467 43 fast ml-2013-11-02-Maxing out the digits

4 0.31403446 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

5 0.28731191 27 fast ml-2013-05-01-Deep learning made easy

6 0.27532452 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python

7 0.23819891 26 fast ml-2013-04-17-Regression as classification

8 0.20960391 35 fast ml-2013-08-12-Accelerometer Biometric Competition

9 0.20261332 34 fast ml-2013-07-14-Running things on a GPU

10 0.20159177 62 fast ml-2014-05-26-Yann LeCun's answers from the Reddit AMA

11 0.18675017 15 fast ml-2013-01-07-Machine learning courses online

12 0.18622965 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

13 0.17887501 22 fast ml-2013-03-07-Choosing a machine learning algorithm

14 0.17874941 25 fast ml-2013-04-10-Gender discrimination

15 0.17022431 24 fast ml-2013-03-25-Dimensionality reduction for sparse binary data - an overview

16 0.16362688 52 fast ml-2014-02-02-Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition

17 0.15733522 20 fast ml-2013-02-18-Predicting advertised salaries

18 0.15725389 32 fast ml-2013-07-05-Processing large files, line by line

19 0.15381491 10 fast ml-2012-11-17-The Facebook challenge HOWTO

20 0.15229517 33 fast ml-2013-07-09-Introducing phraug

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(6, 0.034), (26, 0.066), (31, 0.071), (35, 0.056), (48, 0.024), (51, 0.018), (55, 0.046), (69, 0.146), (71, 0.049), (78, 0.023), (97, 0.364), (99, 0.031)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.86101764 40 fast ml-2013-10-06-Pylearn2 in practice

2 0.43126953 9 fast ml-2012-10-25-So you want to work for Facebook

Introduction: Good news, everyone! There’s a new contest on Kaggle - Facebook is looking for talent. They won’t pay, but just might interview. This post is in a way a bonus for active readers because most visitors of fastml.com originally come from Kaggle forums. For this competition the forums are disabled to encourage own work. To honor this, we won’t publish any code. But own work doesn’t mean original work, and we wouldn’t want to reinvent the wheel, would we? The contest differs substantially from a Kaggle stereotype, if there is such a thing, in three major ways: there are no money prizes, as mentioned above; it’s not a real-world problem, but rather an assignment to screen job candidates (this has important consequences, described below); it’s not a typical machine learning project, but rather a broader AI exercise. You are given a graph of the internet, actually a snapshot of the graph for each of 15 time steps. You are also given a bunch of paths in this graph, which a

3 0.42384526 43 fast ml-2013-11-02-Maxing out the digits

4 0.42084062 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

Introduction: Object recognition in images is where deep learning, and specifically convolutional neural networks, are often applied and benchmarked these days. To get a piece of the action, we’ll be using Alex Krizhevsky’s cuda-convnet, a shining diamond of machine learning software, in a Kaggle competition. Continuing to run things on a GPU, we turn to applying convolutional neural networks for object recognition. This kind of network was developed by Yann LeCun and it’s powerful, but a bit complicated. [Figure: convolutional network diagram. Image credit: EBLearn tutorial] A typical convolutional network has two parts. The first is responsible for feature extraction and consists of one or more pairs of convolution and subsampling/max-pooling layers, as you can see above. The second part is just a classic fully-connected multilayer perceptron taking extracted features as input. For a detailed explanation of all this see unit 9 in Hugo Larochelle’s neural networks course. Daniel Nouri has an interesting story about

5 0.41285884 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

6 0.409556 19 fast ml-2013-02-07-The secret of the big guys

7 0.40313765 48 fast ml-2013-12-28-Regularizing neural networks with dropout and with DropConnect

8 0.3997077 27 fast ml-2013-05-01-Deep learning made easy

9 0.39785942 18 fast ml-2013-01-17-A very fast denoising autoencoder

10 0.3967872 13 fast ml-2012-12-27-Spearmint with a random forest

11 0.39061755 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint

12 0.39011669 23 fast ml-2013-03-18-Large scale L1 feature selection with Vowpal Wabbit

13 0.38863635 34 fast ml-2013-07-14-Running things on a GPU

14 0.38502851 20 fast ml-2013-02-18-Predicting advertised salaries

15 0.38489872 17 fast ml-2013-01-14-Feature selection in practice

16 0.38043505 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python

17 0.37682116 52 fast ml-2014-02-02-Yesterday a kaggler, today a Kaggle master: a wrap-up of the cats and dogs competition

18 0.37414891 55 fast ml-2014-03-20-Good representations, distance, metric learning and supervised dimensionality reduction

19 0.37120819 26 fast ml-2013-04-17-Regression as classification

20 0.3687841 49 fast ml-2014-01-10-Classifying images with a pre-trained deep network