
180 nips-2007-Sparse Feature Learning for Deep Belief Networks


Source: pdf

Author: Marc'Aurelio Ranzato, Y-Lan Boureau, Yann LeCun

Abstract: Unsupervised learning algorithms aim to discover the structure hidden in the data and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation while constraining the representation to have certain desirable properties (e.g., low dimensionality, sparsity). Others are based on approximating the density of the data by stochastically reconstructing the input from the representation. We describe a novel and efficient algorithm for learning sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion for comparing and selecting different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and training them sequentially, high-order dependencies among the observed input variables can be captured.
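To make the abstract's recipe concrete, below is a minimal Python/NumPy sketch of one encoder-decoder layer trained to minimize reconstruction error plus a sparsity penalty on the code, followed by greedy layer-wise stacking. Everything here is an illustrative assumption: the class name SparseAutoencoderLayer, the helper train_stack, the logistic nonlinearity, the L1 penalty, and all hyperparameters are chosen for clarity and do not reproduce the paper's actual energy-based machine with its sparsifying logistic.

import numpy as np

class SparseAutoencoderLayer:
    """One encoder-decoder layer: reconstruct the input while keeping
    the code sparse. Illustrative sketch, not the paper's exact model."""

    def __init__(self, n_in, n_code, sparsity=0.1, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(0.0, 0.01, (n_code, n_in))
        self.W_dec = rng.normal(0.0, 0.01, (n_in, n_code))
        self.sparsity = sparsity  # weight of the L1 penalty on the code
        self.lr = lr

    def encode(self, x):
        # Logistic nonlinearity keeps code units in (0, 1).
        return 1.0 / (1.0 + np.exp(-self.W_enc @ x))

    def train_step(self, x):
        """One gradient step on 0.5*||x_hat - x||^2 + sparsity*||z||_1."""
        z = self.encode(x)
        x_hat = self.W_dec @ z
        err = x_hat - x                        # reconstruction residual
        grad_dec = np.outer(err, z)            # d loss / d W_dec
        dz = self.W_dec.T @ err + self.sparsity * np.sign(z)
        dz *= z * (1.0 - z)                    # back through the logistic
        grad_enc = np.outer(dz, x)             # d loss / d W_enc
        self.W_dec -= self.lr * grad_dec
        self.W_enc -= self.lr * grad_enc
        return 0.5 * np.sum(err ** 2) + self.sparsity * np.sum(np.abs(z))


def train_stack(data, layer_sizes, epochs=10):
    """Greedy layer-wise training, as the abstract describes: each layer
    learns on the codes produced by the frozen layers below it."""
    layers, inputs = [], list(data)
    for n_in, n_code in zip(layer_sizes[:-1], layer_sizes[1:]):
        layer = SparseAutoencoderLayer(n_in, n_code)
        for _ in range(epochs):
            for x in inputs:
                layer.train_step(x)
        inputs = [layer.encode(x) for x in inputs]  # feed codes upward
        layers.append(layer)
    return layers

For example, train_stack(patches, [256, 128, 64]) would train a two-layer stack on 16x16 image patches, with each layer seeing only the sparse codes of the layer beneath it; the sequential training is what lets upper layers capture higher-order dependencies among the inputs.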


reference text

[1] G.E. Hinton and R.R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5786):504–507, 2006.

[2] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In NIPS, 2006.

[3] M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun. Efficient learning of sparse representations with an energy-based model. In NIPS. MIT Press, 2006.

[4] Y. Bengio and Y. LeCun. Scaling learning algorithms towards AI. In L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, editors, Large-Scale Kernel Machines. MIT Press, 2007.

[5] M. Ranzato, Y. Boureau, S. Chopra, and Y. LeCun. A unified energy-based framework for unsupervised learning. In Proc. Conference on AI and Statistics (AI-Stats), 2007.

[6] E. Doi, D. C. Balcan, and M. S. Lewicki. A theoretical analysis of robust coding over noisy overcomplete channels. In NIPS. MIT Press, 2006.

[7] Y. W. Teh, M. Welling, S. Osindero, and G. E. Hinton. Energy-based models for sparse overcomplete representations. Journal of Machine Learning Research, 4:1235–1260, 2003.

[8] B.A. Olshausen and D.J. Field. Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Research, 37:3311–3325, 1997.

[9] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788–791, 1999.

[10] G.E. Hinton. Training products of experts by minimizing contrastive divergence. Neural Computation, 14:1771–1800, 2002.

[11] P. Lennie. The cost of cortical computation. Current Biology, 13:493–497, 2003.

[12] J.F. Murray and K. Kreutz-Delgado. Learning sparse overcomplete codes for images. The Journal of VLSI Signal Processing, 45:97–110, 2008.

[13] G.E. Hinton and R.S. Zemel. Autoencoders, minimum description length, and Helmholtz free energy. In NIPS, 1994.

[14] G.E. Hinton, P. Dayan, and M. Revow. Modeling the manifolds of images of handwritten digits. IEEE Transactions on Neural Networks, 8:65–74, 1997.

[15] M.S. Lewicki and T.J. Sejnowski. Learning overcomplete representations. Neural Computation, 12:337–365, 2000.

[16] Y. LeCun, S. Chopra, R. Hadsell, M. Ranzato, and F.J. Huang. A tutorial on energy-based learning. In G. Bakir et al., editors, Predicting Structured Data. MIT Press, 2006.

[17] P. Sallee and B.A. Olshausen. Learning sparse multiscale image representations. In NIPS. MIT Press, 2002.

[18] The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.

[19] G.E. Hinton, S. Osindero, and Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18:1527–1554, 2006.

[20] The Berkeley Segmentation Dataset and Benchmark. http://www.cs.berkeley.edu/projects/vision/grouping/segbench/.