nips2011-244: reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Adam Coates, Andrew Y. Ng
Abstract: Recent deep learning and unsupervised feature learning systems that learn from unlabeled data have achieved high performance in benchmarks by using extremely large architectures with many features (hidden units) at each layer. Unfortunately, for such large architectures the number of parameters can grow quadratically in the width of the network, thus necessitating hand-coded “local receptive fields” that limit the number of connections from lower level features to higher ones (e.g., based on spatial locality). In this paper we propose a fast method to choose these connections that may be incorporated into a wide variety of unsupervised training methods. Specifically, we choose local receptive fields that group together those low-level features that are most similar to each other according to a pairwise similarity metric. This approach allows us to harness the advantages of local receptive fields (such as improved scalability and reduced data requirements) when we do not know how to specify such receptive fields by hand or when our unsupervised training algorithm has no obvious generalization to a topographic setting. We produce results showing how this method allows us to use even simple unsupervised training algorithms to train successful multi-layered networks that achieve state-of-the-art results on the CIFAR and STL datasets: 82.0% and 60.1% accuracy, respectively.
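The selection procedure described in the abstract (grouping the low-level features that are most similar to each other under a pairwise similarity metric) can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's exact algorithm: the function name select_receptive_fields, the use of squared correlation of feature responses as the similarity metric, and the strategy of growing each group around a randomly chosen seed feature are assumptions made for this example.

import numpy as np

def select_receptive_fields(features, num_fields, field_size, rng=None):
    # features: (n_examples, n_features) matrix of low-level feature
    # responses computed on unlabeled data.
    # Returns num_fields index groups, each naming the field_size
    # low-level features wired into one higher-level unit.
    rng = np.random.default_rng() if rng is None else rng

    # Pairwise similarity metric (an assumption for this sketch): squared
    # correlation between the responses of each pair of features.
    similarity = np.corrcoef(features, rowvar=False) ** 2

    receptive_fields = []
    for _ in range(num_fields):
        # Grow a group around a randomly chosen seed feature by taking
        # the features whose responses are most similar to the seed's.
        seed = rng.integers(similarity.shape[0])
        group = np.argsort(similarity[seed])[::-1][:field_size]
        receptive_fields.append(np.sort(group))
    return receptive_fields

Restricting each higher-level unit to such a group keeps its number of incoming connections fixed at the field size rather than growing with the width of the layer below, which is the scalability benefit the abstract points to.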
[1] R. Adams, H. Wallach, and Z. Ghahramani. Learning the structure of deep sparse graphical models. In International Conference on AI and Statistics, 2010.
[2] A. Bell and T. J. Sejnowski. The ‘independent components’ of natural scenes are edge filters. Vision Research, 37, 1997.
[3] Y. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In Computer Vision and Pattern Recognition, 2010.
[4] D. Ciresan, U. Meier, J. Masci, L. M. Gambardella, and J. Schmidhuber. High-performance neural networks for visual object classification. Pre-print, 2011. http://arxiv.org/abs/1102.0183.
[5] A. Coates, H. Lee, and A. Y. Ng. An analysis of single-layer networks in unsupervised feature learning. In International Conference on AI and Statistics, 2011.
[6] A. Coates and A. Y. Ng. The importance of encoding versus training with sparse coding and vector quantization. In International Conference on Machine Learning, 2011.
[7] P. Garrigues and B. Olshausen. Group sparse coding with a Laplacian scale mixture prior. In Advances in Neural Information Processing Systems, 2010.
[8] K. Gregor and Y. LeCun. Emergence of complex-like cells in a temporal product network with local receptive fields. Pre-print, 2010.
[9] F. Huang and Y. LeCun. Large-scale learning with SVM and convolutional nets for generic object categorization. In Computer Vision and Pattern Recognition, 2006.
[10] A. Hyvarinen, P. Hoyer, and M. Inki. Topographic independent component analysis. Neural Computation, 13(7):1527–1558, 2001.
[11] A. Hyvarinen and E. Oja. Independent component analysis: algorithms and applications. Neural Networks, 13(4-5):411–430, 2000.
[12] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In International Conference on Computer Vision, 2009.
[13] A. Krizhevsky. Convolutional Deep Belief Networks on CIFAR-10. Unpublished manuscript, 2010.
[14] Y. LeCun, F. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. In Computer Vision and Pattern Recognition, 2004.
[15] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In International Conference on Machine Learning, 2009.
[16] V. Nair and G. E. Hinton. Rectified Linear Units Improve Restricted Boltzmann Machines. In International Conference on Machine Learning, 2010.
[17] N. Pinto, D. Doukhan, J. J. DiCarlo, and D. D. Cox. A high-throughput screening approach to discovering good forms of biologically inspired visual representation. PLoS Computational Biology, 2009.
[18] A. Saxe, P. Koh, Z. Chen, M. Bhand, B. Suresh, and A. Y. Ng. On random weights and unsupervised feature learning. In International Conference on Machine Learning, 2011.
[19] D. Scherer, A. Müller, and S. Behnke. Evaluation of pooling operations in convolutional architectures for object recognition. In International Conference on Artificial Neural Networks, 2010.
[20] E. Simoncelli and O. Schwartz. Modeling surround suppression in V1 neurons with a statistically derived normalization model. In Advances in Neural Information Processing Systems, 1998.
[21] K. Zhang and L. Chan. ICA with sparse connections. In Intelligent Data Engineering and Automated Learning, 2006.