
54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python


meta info for this blog

Source: html

Introduction: We have already written a few articles about Pylearn2. Today we’ll look at PyBrain. It is another Python neural networks library, and this is where the similarities end. They’re like day and night: Pylearn2 is Byzantinely complicated, PyBrain is simple. We attempted to train a regression model and succeeded at the first take (more on this below). Try that with Pylearn2. While there are a few machine learning libraries out there, PyBrain aims to be a very easy-to-use modular library that can be used by entry-level students, yet still offers the flexibility and algorithms for state-of-the-art research. The library features the classic perceptron as well as recurrent neural networks and other things, some of which, for example Evolino, would be hard to find elsewhere. On the downside, PyBrain feels unfinished, even abandoned: it is no longer actively developed and the documentation is skimpy. There are no modern gimmicks like dropout or rectified linear units - just good ol’ sigmoid and tanh.


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We have already written a few articles about Pylearn2 . [sent-1, score-0.111]

2 While there are a few machine learning libraries out there, PyBrain aims to be a very easy-to-use modular library that can be used by entry-level students but still offers the flexibility and algorithms for state-of-the-art research. [sent-7, score-0.185]

3 The library features classic perceptron as well as recurrent neural networks and other things, some of which, for example Evolino , would be hard to find elsewhere. [sent-8, score-0.288]

4 There is also an independent project named cybrain , written in C but callable from Python. [sent-15, score-0.169]

5 Juergen Schmidhuber: One of the reasons PyBrain is interesting is that the library comes from Juergen Schmidhuber’s students. [sent-17, score-0.243]

6 The man is one of the leading authorities, if not the leading authority, on recurrent neural networks. [sent-20, score-0.337]

7 He has been in the field for as long as Bengio and LeCun: Schmidhuber was born in 1963, Bengio in 1964, LeCun in 1960 (Hinton is older, if you’re wondering: born 1947). [sent-21, score-0.28]

8 After watching the TED talk about explaining the universe, we’re not sure if Schmidhuber is a genius or a madman. [sent-22, score-0.233]

9 He’s certainly a showman - go on, take a look, he tells a joke in the beginning. [sent-23, score-0.14]

10 You can get the vibe from visiting his page, on the net since 1405 . [sent-25, score-0.178]

11 It has more on the universe , beauty and other stuff. [sent-26, score-0.257]

12 To get familiar with the library we will use kin8nm , a small regression benchmark dataset we tackled a few times before using various methods. [sent-32, score-0.269]

13 Here’s how to prepare a dataset: ds = SupervisedDataSet( input_size, target_size ) ds.… [sent-43, score-0.21]

14 …reshape( -1, 1 ). And to train a network: hidden_size = 100 (arbitrarily chosen), net = buildNetwork( input_size, hidden_size, target_size, bias = True ), trainer = BackpropTrainer( net, ds ), trainer.… [sent-46, score-0.655]

15 …0.15, maxEpochs = 1000, continueEpochs = 10 ). The trainer has a convenient method, trainUntilConvergence. [sent-48, score-0.21] (the fragments above are assembled into a runnable sketch below this list)

16 It automatically sets aside some examples for validation and is supposed to train until the validation error stops decreasing. [sent-49, score-0.625]

17 The error fluctuates, so there’s the continueEpochs param, which tells the trainer how many epochs to wait for a new best score before stopping. [sent-50, score-0.559]

18 It seems that the network doesn’t overfit, at least up to a point: training and validation errors went down hand in hand, though training went down more steadily than validation. [sent-53, score-0.417]

19 [Plots: the first 100 epochs; epochs 100-1000] Predictions: Producing predictions, especially for regression tasks, also seems to be missing from the tutorial. [sent-54, score-0.224] (see the plotting sketch below this list)

20 Since trainUntilConvergence splits data for validation automatically, we join train and validation sets first. [sent-66, score-0.393]
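
Assembled from the code fragments in sentences 13-15 above, here is a minimal end-to-end sketch. The setField calls, the validationProportion parameter name and the random stand-in data are our assumptions; only parts of these calls survive in the extracted sentences:

    import numpy as np

    from pybrain.datasets import SupervisedDataSet
    from pybrain.tools.shortcuts import buildNetwork
    from pybrain.supervised.trainers import BackpropTrainer

    # stand-in data with kin8nm-like shape (8 inputs, 1 target);
    # in the post this would be the real kin8nm training set
    x_train = np.random.rand( 1000, 8 )
    y_train = np.random.rand( 1000 )

    input_size = x_train.shape[1]
    target_size = 1

    # prepare a dataset
    ds = SupervisedDataSet( input_size, target_size )
    ds.setField( 'input', x_train )
    ds.setField( 'target', y_train.reshape( -1, 1 ) )

    # build and train a network
    hidden_size = 100   # arbitrarily chosen
    net = buildNetwork( input_size, hidden_size, target_size, bias = True )
    trainer = BackpropTrainer( net, ds )

    # trainUntilConvergence returns per-epoch training and validation errors
    train_errors, val_errors = trainer.trainUntilConvergence(
        validationProportion = 0.15, maxEpochs = 1000, continueEpochs = 10 )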
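
The error curves from sentences 18-19 come straight from trainUntilConvergence’s return value. A small plotting sketch, continuing from the code above and assuming matplotlib is available:

    import matplotlib.pyplot as plt

    # train_errors and val_errors are the per-epoch lists returned by
    # trainUntilConvergence in the sketch above
    plt.plot( train_errors, label = 'training error' )
    plt.plot( val_errors, label = 'validation error' )
    plt.xlabel( 'epoch' )
    plt.ylabel( 'error' )
    plt.legend()
    plt.show()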
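
And a sketch of the prediction step from sentence 20, again continuing from the code above. activateOnDataset is PyBrain’s batch forward pass; the stand-in test arrays and the RMSE computation are our additions:

    # stand-in test data with the same shape as the training set
    x_test = np.random.rand( 200, 8 )
    y_test = np.random.rand( 200 )

    ds_test = SupervisedDataSet( input_size, target_size )
    ds_test.setField( 'input', x_test )
    ds_test.setField( 'target', y_test.reshape( -1, 1 ) )

    # batch forward pass: one prediction per row
    p = net.activateOnDataset( ds_test )

    rmse = np.sqrt( np.mean( ( p.flatten() - y_test ) ** 2 ) )
    print( 'test RMSE: %.4f' % rmse )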


similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('pybrain', 0.42), ('schmidhuber', 0.35), ('ds', 0.21), ('juergen', 0.21), ('trainer', 0.21), ('library', 0.185), ('validation', 0.141), ('born', 0.14), ('continueepochs', 0.14), ('tells', 0.14), ('trainuntilconvergence', 0.14), ('universe', 0.14), ('epochs', 0.14), ('net', 0.119), ('leading', 0.117), ('beauty', 0.117), ('written', 0.111), ('recurrent', 0.103), ('talk', 0.093), ('automatically', 0.093), ('bengio', 0.085), ('regression', 0.084), ('predictions', 0.08), ('went', 0.079), ('rmse', 0.074), ('supposed', 0.07), ('error', 0.069), ('true', 0.066), ('since', 0.059), ('reasons', 0.058), ('authority', 0.058), ('tom', 0.058), ('arbitrarily', 0.058), ('join', 0.058), ('beyond', 0.058), ('older', 0.058), ('reshape', 0.058), ('aside', 0.058), ('bias', 0.058), ('facial', 0.058), ('neither', 0.058), ('callable', 0.058), ('algorithmic', 0.058), ('face', 0.058), ('modern', 0.058), ('overfit', 0.058), ('principle', 0.058), ('pure', 0.058), ('hand', 0.056), ('sets', 0.053)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 1.0000001 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python


2 0.14077786 40 fast ml-2013-10-06-Pylearn2 in practice

Introduction: What do you get when you mix one part brilliant and one part daft? You get Pylearn2, a cutting edge neural networks library from Montreal that’s rather hard to use. Here we’ll show how to get through the daft part with your mental health relatively intact. Pylearn2 comes from the Lisa Lab in Montreal, led by Yoshua Bengio. Those are pretty smart guys and they concern themselves with deep learning. Recently they published a paper entitled Pylearn2: a machine learning research library [arxiv]. Here’s a quote: Pylearn2 is a machine learning research library - its users are researchers. This means (…) it is acceptable to assume that the user has some technical sophistication and knowledge of machine learning. The word research is possibly the most common word in the paper. There’s a reason for that: the library is certainly not production-ready. OK, it’s not that bad. There are only two difficult things: getting your data in, and getting predictions out. What’

3 0.12880133 43 fast ml-2013-11-02-Maxing out the digits

Introduction: Recently we’ve been investigating the basics of Pylearn2. Now it’s time for a more advanced example: a multilayer perceptron with dropout and maxout activation for the MNIST digits. Maxout explained: If you’ve been following developments in deep learning, you know that Hinton’s most recent recommendation for supervised learning, after a few years of bashing backpropagation in favour of unsupervised pretraining, is to use classic multilayer perceptrons with dropout and rectified linear units. For us, this breath of simplicity is a welcome change. Rectified linear is f(x) = max( 0, x ). This makes backpropagation trivial: for x > 0, the derivative is one, else zero. Note that ReLU consists of two linear functions. But why stop at two? Let’s take the max out of three, or four, or five linear functions… And so maxout is a generalization of ReLU. It can approximate any convex function. Now backpropagation is easy and dropout prevents overfitting, so we can train a deep

4 0.091186486 14 fast ml-2013-01-04-Madelon: Spearmint's revenge

Introduction: Little Spearmint couldn’t sleep that night. I was so close… - he was thinking. It seemed that he had found a better-than-default value for one of the random forest hyperparams, but it turned out to be false. He made a decision as he fell asleep: Next time, I will show them! The way to do this is to use a dataset that is known to produce lower error with high mtry values, namely the previously mentioned Madelon from the NIPS 2003 Feature Selection Challenge. Among 500 attributes, only 20 are informative; the rest are noise. That’s the reason why high mtry is good here: you have to consider a lot of features to find a meaningful one. The dataset consists of train, validation and test parts, with labels being available for train and validation. We will further split the training set into our train and validation sets, and use the original validation set as a test set to evaluate the final results of parameter tuning. As an error measure we use Area Under Curve, or AUC, which was

5 0.086288206 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint

Introduction: The promise: What’s attractive in machine learning? That a machine is learning, instead of a human. But an operator still has a lot of work to do. First, he has to learn how to teach a machine, in general. Then, when it comes to a concrete task, there are two main areas where a human needs to do the work (and remember, laziness is a virtue, at least for a programmer, so we’d like to minimize the amount of work done by a human): data preparation and model tuning. This story is about model tuning. Typically, to achieve satisfactory results, first we need to convert raw data into a format accepted by the model we would like to use, and then tune a few hyperparameters of the model. For example, some hyperparams to tune for a random forest may be the number of trees to grow and the number of candidate features at each split ( mtry in R randomForest). For a neural network, there are quite a lot of hyperparams: number of layers, number of neurons in each layer (specifically, in each hid

6 0.071983621 27 fast ml-2013-05-01-Deep learning made easy

7 0.07122045 48 fast ml-2013-12-28-Regularizing neural networks with dropout and with DropConnect

8 0.071058683 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

9 0.070592798 62 fast ml-2014-05-26-Yann LeCun's answers from the Reddit AMA

10 0.070577957 46 fast ml-2013-12-07-13 NIPS papers that caught our eye

11 0.069604523 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

12 0.067749999 26 fast ml-2013-04-17-Regression as classification

13 0.067543872 13 fast ml-2012-12-27-Spearmint with a random forest

14 0.063952386 50 fast ml-2014-01-20-How to get predictions from Pylearn2

15 0.059740979 20 fast ml-2013-02-18-Predicting advertised salaries

16 0.056202922 57 fast ml-2014-04-01-Exclusive Geoff Hinton interview

17 0.056099348 18 fast ml-2013-01-17-A very fast denoising autoencoder

18 0.055709563 15 fast ml-2013-01-07-Machine learning courses online

19 0.054845016 23 fast ml-2013-03-18-Large scale L1 feature selection with Vowpal Wabbit

20 0.053829309 19 fast ml-2013-02-07-The secret of the big guys


similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.251), (1, 0.093), (2, -0.003), (3, 0.019), (4, -0.076), (5, -0.042), (6, 0.034), (7, 0.127), (8, -0.187), (9, 0.141), (10, -0.115), (11, 0.239), (12, -0.161), (13, 0.062), (14, -0.124), (15, 0.125), (16, 0.19), (17, -0.076), (18, -0.017), (19, 0.077), (20, -0.145), (21, -0.067), (22, 0.046), (23, -0.154), (24, -0.208), (25, -0.1), (26, -0.283), (27, 0.022), (28, -0.217), (29, -0.191), (30, 0.044), (31, -0.06), (32, 0.103), (33, -0.032), (34, -0.272), (35, 0.188), (36, 0.051), (37, -0.004), (38, -0.145), (39, 0.129), (40, 0.245), (41, -0.042), (42, -0.117), (43, -0.157), (44, 0.312), (45, -0.049), (46, -0.105), (47, -0.057), (48, -0.117), (49, -0.013)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.96509367 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python


2 0.21426825 40 fast ml-2013-10-06-Pylearn2 in practice


3 0.19489272 43 fast ml-2013-11-02-Maxing out the digits


4 0.18743233 14 fast ml-2013-01-04-Madelon: Spearmint's revenge


5 0.14345974 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint


6 0.14218882 26 fast ml-2013-04-17-Regression as classification

7 0.13948061 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

8 0.13617404 48 fast ml-2013-12-28-Regularizing neural networks with dropout and with DropConnect

9 0.13548701 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

10 0.13076223 13 fast ml-2012-12-27-Spearmint with a random forest

11 0.12508994 34 fast ml-2013-07-14-Running things on a GPU

12 0.12469616 27 fast ml-2013-05-01-Deep learning made easy

13 0.12247452 24 fast ml-2013-03-25-Dimensionality reduction for sparse binary data - an overview

14 0.12193009 25 fast ml-2013-04-10-Gender discrimination

15 0.11937965 46 fast ml-2013-12-07-13 NIPS papers that caught our eye

16 0.11833676 62 fast ml-2014-05-26-Yann LeCun's answers from the Reddit AMA

17 0.11769053 49 fast ml-2014-01-10-Classifying images with a pre-trained deep network

18 0.11697053 23 fast ml-2013-03-18-Large scale L1 feature selection with Vowpal Wabbit

19 0.11629788 18 fast ml-2013-01-17-A very fast denoising autoencoder

20 0.11535916 20 fast ml-2013-02-18-Predicting advertised salaries


similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(26, 0.036), (31, 0.582), (35, 0.018), (48, 0.023), (55, 0.018), (58, 0.014), (69, 0.126), (71, 0.031), (73, 0.023), (99, 0.037)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.97202128 29 fast ml-2013-05-25-More on sparse filtering and the Black Box competition

Introduction: The Black Box challenge has just ended. We were thoroughly thrilled to learn that the winner, doubleshot, used sparse filtering, apparently following our cue. His score in terms of accuracy is 0.702, ours 0.645, and the best benchmark 0.525. We ranked 15th out of 217, a few places ahead of the Toronto team consisting of Charlie Tang and Nitish Srivastava. To their credit, Charlie has won the two remaining Challenges in Representation Learning. Not-so-deep learning: The difference to our previous, beating-the-benchmark attempt is twofold: one layer instead of two for supervised learning, and VW instead of a random forest. Somewhat surprisingly, one layer works better than two. Even more surprisingly, with enough units you can get 0.634 using a linear model (Vowpal Wabbit, of course, One-Against-All). In our understanding, that’s the point of overcomplete representations*, which Stanford people seem to care much about. Recall The secret of the big guys and the pape

same-blog 2 0.92081499 54 fast ml-2014-03-06-PyBrain - a simple neural networks library in Python


3 0.42764792 48 fast ml-2013-12-28-Regularizing neural networks with dropout and with DropConnect

Introduction: We continue with the CIFAR-10-based competition at Kaggle to get to know DropConnect. It’s supposed to be an improvement over dropout. And dropout is certainly one of the bigger steps forward in neural network development. Is DropConnect really better than dropout? TL;DR: DropConnect seems to offer results similar to dropout. State-of-the-art scores reported in the paper come from model ensembling. Dropout: Dropout, by Hinton et al., is perhaps the biggest invention in the field of neural networks in recent years. It addresses the main problem in machine learning, that is, overfitting. It does so by “dropping out” some unit activations in a given layer, that is, setting them to zero. Thus it prevents co-adaptation of units and can also be seen as a method of ensembling many networks sharing the same weights. For each training example a different set of units to drop is randomly chosen. The idea has a biological inspiration. When a child is conceived, it receives half its genes f

4 0.40789902 26 fast ml-2013-04-17-Regression as classification

Introduction: An interesting development occurred in Job salary prediction at Kaggle: the guy who ranked 3rd used logistic regression, in spite of the task being regression, not classification. We attempt to replicate the experiment. The idea is to discretize salaries into a number of bins, just like with a histogram. Guocong Song, the man, used 30 bins. We like a convenient uniform bin width of 0.1, as the minimum log salary in the training set is 8.5 and the maximum is 12.2. Since there are few examples in the high end, we stop at 12.0, so that gives us 36 bins. Here’s the code:

    import numpy as np

    min_salary = 8.5
    max_salary = 12.0
    interval = 0.1

    a_range = np.arange( min_salary, max_salary + interval, interval )

    class_mapping = {}
    for i, n in enumerate( a_range ):
        n = round( n, 1 )
        class_mapping[n] = i + 1

This way we get a mapping from log salaries to classes. Class labels start with 1, because Vowpal Wabbit expects that, and we intend to use VW. The code can be

5 0.40557641 19 fast ml-2013-02-07-The secret of the big guys

Introduction: Are you interested in linear models, or K-means clustering? Probably not much. These are very basic techniques with fancier alternatives. But here’s the bomb: when you combine those two methods for supervised learning, you can get better results than from a random forest. And maybe even faster. We have already written about Vowpal Wabbit, a fast linear learner from Yahoo/Microsoft. Google’s response (or at least, a Google guy’s response) seems to be Sofia-ML. The software consists of two parts: a linear learner and K-means clustering. We found Sofia a while ago and wondered about K-means: who needs K-means? Here’s a clue: This package can be used for learning cluster centers (…) and for mapping a given data set onto a new feature space based on the learned cluster centers. Our eyes only opened when we read a certain paper, namely An Analysis of Single-Layer Networks in Unsupervised Feature Learning ( PDF ). The paper, by Coates, Lee and Ng, is about object recogni

6 0.40396798 40 fast ml-2013-10-06-Pylearn2 in practice

7 0.39069808 45 fast ml-2013-11-27-Object recognition in images with cuda-convnet

8 0.38739169 34 fast ml-2013-07-14-Running things on a GPU

9 0.38500008 31 fast ml-2013-06-19-Go non-linear with Vowpal Wabbit

10 0.38402808 46 fast ml-2013-12-07-13 NIPS papers that caught our eye

11 0.37659436 61 fast ml-2014-05-08-Impute missing values with Amelia

12 0.37506711 18 fast ml-2013-01-17-A very fast denoising autoencoder

13 0.36986199 7 fast ml-2012-10-05-Predicting closed questions on Stack Overflow

14 0.36982563 23 fast ml-2013-03-18-Large scale L1 feature selection with Vowpal Wabbit

15 0.3547745 24 fast ml-2013-03-25-Dimensionality reduction for sparse binary data - an overview

16 0.35267937 58 fast ml-2014-04-12-Deep learning these days

17 0.34431899 27 fast ml-2013-05-01-Deep learning made easy

18 0.34101689 36 fast ml-2013-08-23-A bag of words and a nice little network

19 0.34065178 12 fast ml-2012-12-21-Tuning hyperparams automatically with Spearmint

20 0.32527119 50 fast ml-2014-01-20-How to get predictions from Pylearn2