hunch_net hunch_net-2006 hunch_net-2006-224 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Here are some papers that I found surprisingly interesting. Yoshua Bengio, Pascal Lamblin, Dan Popovici, Hugo Larochelle, Greedy Layer-wise Training of Deep Networks. Empirically investigates some of the design choices behind deep belief networks. Long Zhu, Yuanhao Chen, Alan Yuille, Unsupervised Learning of a Probabilistic Grammar for Object Detection and Parsing. An unsupervised method for detecting objects using simple feature filters that works remarkably well on the (supervised) Caltech-101 dataset. Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira, Analysis of Representations for Domain Adaptation. This is the first analysis I’ve seen of learning from samples drawn from a distribution different from the evaluation distribution that depends on reasonable, measurable quantities. All of these papers turn out to have a common theme: the power of unlabeled data to do generically useful things.
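To make the first paper's topic concrete, here is a minimal sketch of greedy layer-wise pretraining with stacked autoencoders. It illustrates the general recipe rather than the paper's exact procedure (Bengio et al. also study restricted Boltzmann machine layers and other variants); the choice of PyTorch, the layer sizes, optimizer settings, and random data below are placeholder assumptions.

```python
# Minimal sketch of greedy layer-wise pretraining with stacked autoencoders.
# Illustration only: sizes, epochs, and the random data are placeholders.
import torch
import torch.nn as nn

def pretrain_layer(data, in_dim, hidden_dim, epochs=10, lr=1e-3):
    """Train one autoencoder layer on unlabeled data; return its encoder."""
    encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
    decoder = nn.Linear(hidden_dim, in_dim)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        recon = decoder(encoder(data))   # reconstruct the input through the bottleneck
        loss = loss_fn(recon, data)
        loss.backward()
        opt.step()
    return encoder

# Unlabeled data (placeholder); real experiments would use e.g. MNIST images.
x = torch.randn(1000, 784)

# Greedily pretrain each layer on the representation produced by the layers below it.
layer_dims = [784, 256, 64]
encoders, h = [], x
for in_dim, out_dim in zip(layer_dims[:-1], layer_dims[1:]):
    enc = pretrain_layer(h, in_dim, out_dim)
    encoders.append(enc)
    h = enc(h).detach()

# Stack the pretrained encoders and add a supervised output layer for fine-tuning.
deep_net = nn.Sequential(*encoders, nn.Linear(layer_dims[-1], 10))
```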
sentIndex sentText sentNum sentScore
1 Here are some papers that I found surprisingly interesting. [sent-1, score-0.215]
2 Empirically investigates some of the design choices behind deep belief networks. [sent-3, score-0.662]
3 An unsupervised method for detecting objects using simple feature filters that works remarkably well on the (supervised) caltech-101 dataset . [sent-5, score-0.897]
4 This is the first analysis I’ve seen of learning with respect to samples drawn differently from the evaluation distribution which depends on reasonable measurable quantities. [sent-7, score-0.966]
5 All of these papers turn out to have a common theme—the power of unlabeled data to do generically useful things. [sent-8, score-0.551]
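The Ben-David, Blitzer, Crammer, and Pereira analysis referenced above depends on how different the training and evaluation distributions are, and that difference can be estimated from unlabeled samples alone. A commonly used empirical surrogate trains a classifier to distinguish source from target examples; the sketch below shows that idea with synthetic data and a linear model as placeholder assumptions, not anything from the paper itself.

```python
# Rough sketch of one "measurable quantity" for domain adaptation: how well a
# classifier can tell the training (source) sample from the evaluation (target)
# sample, using unlabeled data only. The data and model are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, size=(500, 20))   # unlabeled sample from the training distribution
target = rng.normal(loc=0.5, size=(500, 20))   # unlabeled sample from the evaluation distribution

X = np.vstack([source, target])
d = np.concatenate([np.zeros(len(source)), np.ones(len(target))])  # domain labels

X_tr, X_te, d_tr, d_te = train_test_split(X, d, test_size=0.5, random_state=0)
err = 1.0 - LogisticRegression(max_iter=1000).fit(X_tr, d_tr).score(X_te, d_te)

# If the domains are easy to tell apart (low error), adaptation is harder;
# the commonly used "proxy A-distance" is 2 * (1 - 2 * err).
print("domain classification error:", err)
print("proxy A-distance:", 2 * (1 - 2 * err))
```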
wordName wordTfidf (topN-words)
[('unsupervised', 0.242), ('differently', 0.175), ('blitzer', 0.175), ('investigates', 0.175), ('adaptation', 0.175), ('chen', 0.175), ('crammer', 0.175), ('detecting', 0.175), ('koby', 0.175), ('theme', 0.175), ('pereira', 0.162), ('measurable', 0.162), ('detection', 0.162), ('deep', 0.16), ('bengio', 0.153), ('yoshua', 0.153), ('alan', 0.153), ('generically', 0.153), ('fernando', 0.146), ('pascal', 0.14), ('dan', 0.14), ('analysis', 0.136), ('remarkably', 0.135), ('filters', 0.135), ('behind', 0.135), ('greedy', 0.131), ('depends', 0.131), ('shai', 0.131), ('surprisingly', 0.124), ('objects', 0.121), ('domain', 0.121), ('object', 0.111), ('representations', 0.111), ('evaluation', 0.107), ('unlabeled', 0.107), ('turn', 0.102), ('choices', 0.1), ('power', 0.098), ('empirically', 0.096), ('probabilistic', 0.092), ('belief', 0.092), ('drawn', 0.092), ('john', 0.092), ('papers', 0.091), ('dataset', 0.089), ('networks', 0.088), ('supervised', 0.088), ('samples', 0.087), ('training', 0.079), ('seen', 0.076)]
simIndex simValue blogId blogTitle
same-blog 1 1.0 224 hunch net-2006-12-12-Interesting Papers at NIPS 2006
2 0.158685 201 hunch net-2006-08-07-The Call of the Deep
Introduction: Many learning algorithms used in practice are fairly simple. Viewed representationally, many prediction algorithms either compute a linear separator of basic features (perceptron, winnow, weighted majority, SVM) or perhaps a linear separator of slightly more complex features (2-layer neural networks or kernelized SVMs). Should we go beyond this, and start using “deep” representations? What is deep learning? Intuitively, deep learning is about learning to predict in ways which can involve complex dependencies between the input (observed) features. Specifying this more rigorously turns out to be rather difficult. Consider the following cases: SVM with Gaussian Kernel. This is not considered deep learning, because an SVM with a Gaussian kernel can’t succinctly represent certain decision surfaces. One of Yann LeCun’s examples is recognizing objects based on pixel values. An SVM will need a new support vector for each significantly different background. Since the number
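As a toy illustration of the representational gap this post points at, the sketch below contrasts a linear separator of the raw features with a small two-layer network on XOR, the classic target no linear separator can represent. The scikit-learn models and settings are placeholder assumptions, not anything from the post.

```python
# XOR: a linear separator of the raw features fails, while a small two-layer
# network does not. Placeholder models and settings, for illustration only.
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # XOR labels

linear = Perceptron(max_iter=1000).fit(X, y)
deeper = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                       max_iter=5000, random_state=0).fit(X, y)

print("linear separator accuracy:", linear.score(X, y))   # at most 0.75 on XOR
print("two-layer network accuracy:", deeper.score(X, y))  # usually 1.0
```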
3 0.14028822 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011
Introduction: Maybe it’s too early to call, but with four separate Neural Network sessions at this year’s ICML, it looks like Neural Networks are making a comeback. Here are my highlights of these sessions. In general, my feeling is that these papers both demystify deep learning and show its broader applicability. The first observation I made is that the once disreputable “Neural” nomenclature is being used again in lieu of “deep learning”. Maybe it’s because Adam Coates et al. showed that single layer networks can work surprisingly well. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, Adam Coates, Honglak Lee, Andrew Y. Ng (AISTATS 2011). The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization, Adam Coates, Andrew Y. Ng (ICML 2011). Another surprising result out of Andrew Ng’s group comes from Andrew Saxe et al. who show that certain convolutional pooling architectures can obtain close to state-of-the-art pe
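The single-layer result mentioned above follows a simple recipe: learn a feature dictionary from unlabeled data, encode inputs against it, and train a linear classifier on top. Below is a rough sketch of that pattern with k-means as the dictionary learner; the random data, sizes, and soft encoding are placeholder assumptions, not the papers' actual pipeline (which uses whitened image patches and convolutional pooling).

```python
# Rough sketch of single-layer unsupervised feature learning: k-means centroids
# as a dictionary, a soft encoding, then a linear classifier. Placeholders only.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_unlabeled = rng.normal(size=(2000, 64))     # stand-in for unlabeled image patches
X_train = rng.normal(size=(500, 64))
y_train = rng.integers(0, 2, size=500)        # stand-in labels

# Unsupervised step: k-means centroids act as a learned feature dictionary.
km = MiniBatchKMeans(n_clusters=100, random_state=0).fit(X_unlabeled)

def encode(X):
    # Soft assignment: activation is how much closer a point is to a centroid
    # than to the average centroid, with negative values clipped to zero.
    d = km.transform(X)                       # distances to all centroids
    return np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)

# Supervised step: a linear classifier on the learned (unsupervised) features.
clf = LogisticRegression(max_iter=1000).fit(encode(X_train), y_train)
```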
4 0.11035947 407 hunch net-2010-08-23-Boosted Decision Trees for Deep Learning
Introduction: About 4 years ago, I speculated that decision trees qualify as a deep learning algorithm because they can make decisions which are substantially nonlinear in the input representation. Ping Li has proved this correct, empirically at UAI, by showing that boosted decision trees can beat deep belief networks on versions of MNIST which are artificially hardened so as to make them solvable only by deep learning algorithms. This is an important point, because the ability to solve these sorts of problems is probably the best objective definition of a deep learning algorithm we have. I’m not that surprised. In my experience, if you can accept the computational drawbacks of a boosted decision tree, they can achieve pretty good performance. Geoff Hinton once told me that the great thing about deep belief networks is that they work. I understand that Ping had very substantial difficulty in getting this published, so I hope some reviewers step up to the standard of valuing wha
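For readers who want to try the kind of baseline described here, the sketch below trains gradient-boosted decision trees with scikit-learn on a synthetic dataset. Both the library and the data are placeholder assumptions; Ping Li's experiments pitted boosted trees against deep belief networks on artificially hardened versions of MNIST.

```python
# Hedged sketch of a gradient-boosted decision tree baseline on a synthetic
# classification task; all settings here are placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=50, n_informative=20,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gbt = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                 learning_rate=0.1, random_state=0)
gbt.fit(X_tr, y_tr)
print("boosted decision tree test accuracy:", gbt.score(X_te, y_te))
```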
5 0.1064233 143 hunch net-2005-12-27-Automated Labeling
Introduction: One of the common trends in machine learning has been an emphasis on the use of unlabeled data. The argument goes something like “there aren’t many labeled web pages out there, but there are a huge number of web pages, so we must find a way to take advantage of them.” There are several standard approaches for doing this: Unsupervised Learning . You use only unlabeled data. In a typical application, you cluster the data and hope that the clusters somehow correspond to what you care about. Semisupervised Learning. You use both unlabeled and labeled data to build a predictor. The unlabeled data influences the learned predictor in some way. Active Learning . You have unlabeled data and access to a labeling oracle. You interactively choose which examples to label so as to optimize prediction accuracy. It seems there is a fourth approach worth serious investigation—automated labeling. The approach goes as follows: Identify some subset of observed values to predict
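The "automated labeling" approach sketched above amounts to picking some observed values inside otherwise-unlabeled data and treating their prediction as an ordinary supervised problem, so no human labeling is required. A minimal sketch of that mechanic follows; the synthetic data, the choice of column, and the ridge model are placeholder assumptions.

```python
# Minimal sketch of automated labeling: predict one observed value from the
# rest, so the "labels" come from the data itself. On this random placeholder
# data the held-out score will be near zero; correlated real data is the
# interesting case.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_unlabeled = rng.normal(size=(1000, 20))              # unlabeled observations

target_col = 0                                         # an observed value to predict
y_auto = X_unlabeled[:, target_col]                    # labels come from the data itself
X_auto = np.delete(X_unlabeled, target_col, axis=1)    # predict it from the remaining features

X_tr, X_te, y_tr, y_te = train_test_split(X_auto, y_auto, random_state=0)
model = Ridge().fit(X_tr, y_tr)
print("held-out R^2 on automated labels:", model.score(X_te, y_te))
```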
6 0.096173212 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
7 0.092699975 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem
8 0.08761254 161 hunch net-2006-03-05-“Structural” Learning
9 0.084330983 444 hunch net-2011-09-07-KDD and MUCMD 2011
10 0.080325514 456 hunch net-2012-02-24-ICML+50%
11 0.078055263 45 hunch net-2005-03-22-Active learning
12 0.074160092 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
13 0.071845949 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three
14 0.069688663 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
15 0.068595856 299 hunch net-2008-04-27-Watchword: Supervised Learning
16 0.068094581 332 hunch net-2008-12-23-Use of Learning Theory
17 0.066928372 431 hunch net-2011-04-18-A paper not at Snowbird
18 0.064099386 13 hunch net-2005-02-04-JMLG
19 0.063692354 165 hunch net-2006-03-23-The Approximation Argument
20 0.061541468 351 hunch net-2009-05-02-Wielding a New Abstraction
topicId topicWeight
[(0, 0.126), (1, 0.043), (2, 0.01), (3, -0.023), (4, 0.13), (5, 0.002), (6, -0.029), (7, 0.006), (8, 0.058), (9, -0.084), (10, -0.012), (11, -0.01), (12, -0.131), (13, -0.079), (14, -0.053), (15, 0.109), (16, -0.081), (17, 0.099), (18, -0.074), (19, 0.026), (20, 0.058), (21, 0.015), (22, 0.033), (23, -0.01), (24, -0.006), (25, -0.036), (26, -0.015), (27, 0.037), (28, 0.009), (29, -0.027), (30, 0.058), (31, 0.059), (32, 0.057), (33, 0.022), (34, 0.005), (35, -0.043), (36, 0.05), (37, 0.012), (38, -0.104), (39, -0.035), (40, 0.036), (41, 0.063), (42, -0.005), (43, 0.023), (44, -0.025), (45, 0.045), (46, 0.008), (47, -0.096), (48, 0.021), (49, -0.037)]
simIndex simValue blogId blogTitle
same-blog 1 0.97680587 224 hunch net-2006-12-12-Interesting Papers at NIPS 2006
2 0.70917523 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011
3 0.70171678 201 hunch net-2006-08-07-The Call of the Deep
4 0.63098246 477 hunch net-2013-01-01-Deep Learning 2012
Introduction: 2012 was a tumultuous year for me, but it was undeniably a great year for deep learning efforts. Signs of this include: Winning a Kaggle competition. Wide adoption of deep learning for speech recognition. Significant industry support. Gains in image recognition. This is a rare event in research: a significant capability breakout. Congratulations are definitely in order for those who managed to achieve it. At this point, deep learning algorithms seem like a choice undeniably worth investigating for real applications with significant data.
5 0.62379622 407 hunch net-2010-08-23-Boosted Decision Trees for Deep Learning
6 0.56117779 16 hunch net-2005-02-09-Intuitions from applied learning
7 0.55377167 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem
8 0.54184264 431 hunch net-2011-04-18-A paper not at Snowbird
9 0.54149932 143 hunch net-2005-12-27-Automated Labeling
10 0.51570487 161 hunch net-2006-03-05-“Structural” Learning
11 0.50991052 45 hunch net-2005-03-22-Active learning
12 0.48849407 456 hunch net-2012-02-24-ICML+50%
13 0.456081 361 hunch net-2009-06-24-Interesting papers at UAICMOLT 2009
14 0.45274413 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates
15 0.449781 444 hunch net-2011-09-07-KDD and MUCMD 2011
16 0.44123295 139 hunch net-2005-12-11-More NIPS Papers
17 0.4404749 432 hunch net-2011-04-20-The End of the Beginning of Active Learning
18 0.42792928 466 hunch net-2012-06-05-ICML acceptance statistics
19 0.4235962 152 hunch net-2006-01-30-Should the Input Representation be a Vector?
20 0.41963586 277 hunch net-2007-12-12-Workshop Summary—Principles of Learning Problem Design
topicId topicWeight
[(27, 0.175), (49, 0.594), (53, 0.065), (55, 0.033), (95, 0.024)]
simIndex simValue blogId blogTitle
1 0.99230707 338 hunch net-2009-01-23-An Active Learning Survey
Introduction: Burr Settles wrote a fairly comprehensive survey of active learning. He intends to maintain and update the survey, so send him any suggestions you have.
same-blog 2 0.95007014 224 hunch net-2006-12-12-Interesting Papers at NIPS 2006
3 0.91067743 122 hunch net-2005-10-13-Site tweak
Introduction: Several people have had difficulty with comments which seem to have an allowed language significantly poorer than posts. The set of allowed html tags has been increased and the markdown filter has been put in place to try to make commenting easier. I’ll put some examples into the comments of this post.
4 0.66119635 365 hunch net-2009-07-31-Vowpal Wabbit Open Source Project
Introduction: Today brings a new release of the Vowpal Wabbit fast online learning software. This time, unlike the previous release, the project itself is going open source, developing via github. For example, the latest and greatest can be downloaded via: git clone git://github.com/JohnLangford/vowpal_wabbit.git If you aren’t familiar with git, it’s a distributed version control system which supports quick and easy branching, as well as reconciliation. This version of the code is confirmed to compile without complaint on at least some flavors of OSX as well as Linux boxes. As much of the point of this project is pushing the limits of fast and effective machine learning, let me mention a few datapoints from my experience. The program can effectively scale up to batch-style training on sparse terafeature (i.e. 10^12 sparse feature) size datasets. The limiting factor is typically i/o. I started using the real datasets from the large-scale learning workshop as a conve
5 0.64124745 37 hunch net-2005-03-08-Fast Physics for Learning
Introduction: While everyone is silently working on ICML submissions, I found this discussion about a fast physics simulator chip interesting from a learning viewpoint. In many cases, learning attempts to predict the outcome of physical processes. Access to a fast simulator for these processes might be quite helpful in predicting the outcome. Bayesian learning in particular may directly benefit while many other algorithms (like support vector machines) might have their speed greatly increased. The biggest drawback is that writing software for these odd architectures is always difficult and time consuming, but a several-orders-of-magnitude speedup might make that worthwhile.
6 0.61048108 23 hunch net-2005-02-19-Loss Functions for Discriminative Training of Energy-Based Models
7 0.58539182 348 hunch net-2009-04-02-Asymmophobia
8 0.42608926 359 hunch net-2009-06-03-Functionally defined Nonlinear Dynamic Models
9 0.37653542 438 hunch net-2011-07-11-Interesting Neural Network Papers at ICML 2011
10 0.34128365 426 hunch net-2011-03-19-The Ideal Large Scale Learning Class
11 0.33946875 280 hunch net-2007-12-20-Cool and Interesting things at NIPS, take three
12 0.33376661 493 hunch net-2014-02-16-Metacademy: a package manager for knowledge
13 0.32846683 201 hunch net-2006-08-07-The Call of the Deep
14 0.32739869 227 hunch net-2007-01-10-A Deep Belief Net Learning Problem
15 0.31474113 435 hunch net-2011-05-16-Research Directions for Machine Learning and Algorithms
16 0.31413978 194 hunch net-2006-07-11-New Models
17 0.31232601 329 hunch net-2008-11-28-A Bumper Crop of Machine Learning Graduates
18 0.30815247 144 hunch net-2005-12-28-Yet more nips thoughts
19 0.3075639 5 hunch net-2005-01-26-Watchword: Probability
20 0.30237007 132 hunch net-2005-11-26-The Design of an Optimal Research Environment