
261 nips-2011-Sparse Filtering


Source: pdf

Author: Jiquan Ngiam, Zhenghao Chen, Sonia A. Bhaskar, Pang W. Koh, Andrew Y. Ng

Abstract: Unsupervised feature learning has been shown to be effective at learning representations that perform well on image, video and audio classification. However, many existing feature learning algorithms are hard to use and require extensive hyperparameter tuning. In this work, we present sparse filtering, a simple new algorithm which is efficient and only has one hyperparameter, the number of features to learn. In contrast to most other feature learning methods, sparse filtering does not explicitly attempt to construct a model of the data distribution. Instead, it optimizes a simple cost function – the sparsity of L2-normalized features – which can easily be implemented in a few lines of MATLAB code. Sparse filtering scales gracefully to handle high-dimensional inputs, and can also be used to learn meaningful features in additional layers with greedy layer-wise stacking. We evaluate sparse filtering on natural images, object classification (STL-10), and phone classification (TIMIT), and show that our method works well on a range of different modalities.
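
The cost function mentioned in the abstract can indeed be written in a few lines. The sketch below is an illustrative MATLAB reconstruction, not the authors' released code: the function name, variable names, and the 1e-8 soft-absolute-value constant are our own assumptions. Features are computed linearly, passed through a soft absolute value, L2-normalized first per feature (across examples) and then per example, and the sum of the resulting activations is the sparsity cost.

% Minimal sketch of the sparse filtering cost (illustrative reconstruction;
% names and the 1e-8 constant are assumptions, not the authors' code).
function obj = sparse_filtering_cost(W, X)
  % W: numFeatures x inputDim weight matrix
  % X: inputDim x numExamples data matrix
  F  = W * X;                                      % linear feature activations
  Fs = sqrt(F.^2 + 1e-8);                          % soft absolute value
  Fr = bsxfun(@rdivide, Fs, sqrt(sum(Fs.^2, 2)));  % L2-normalize each feature across examples
  Fh = bsxfun(@rdivide, Fr, sqrt(sum(Fr.^2, 1)));  % L2-normalize each example's feature vector
  obj = sum(Fh(:));                                % L1 penalty on the normalized features
end

In practice the weights W would be optimized with an off-the-shelf unconstrained solver such as minFunc [20], supplying the gradient obtained by backpropagating through the two normalization steps.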


reference text

[1] G. E. Dahl, M. Ranzato, A. Mohamed, and G. E. Hinton. Phone recognition with the mean-covariance restricted Boltzmann machine. In NIPS. 2010.

[2] H. Lee, Y. Largman, P. Pham, and A. Y. Ng. Unsupervised feature learning for audio classification using convolutional deep belief networks. In NIPS. 2009.

[3] J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In CVPR, 2009.

[4] M.A. Ranzato, F. J. Huang, Y.-L. Boureau, and Y. LeCun. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In CVPR, 2007.

[5] Q. V. Le, W. Y. Zou, S. Y. Yeung, and A. Y. Ng. Learning hierarchical spatio-temporal features for action recognition with independent subspace analysis. In CVPR, 2011.

[6] H. Lee, C. Ekanadham, and A.Y. Ng. Sparse deep belief net model for visual area v2. In NIPS, 2008.

[7] G. E. Hinton, S. Osindero, and Y.W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527–1554, 2006.

[8] P. Vincent, H. Larochelle, Y. Bengio, and P. A. Manzagol. Extracting and composing robust features with denoising autoencoders. In ICML, 2008.

[9] J. H. van Hateren and A. van der Schaaf. Independent component filters of natural images compared with simple cells in primary visual cortex. Proceedings: Biological Sciences, 265(1394):359–366, 1998.

[10] A. J. Bell and T. J. Sejnowski. The "independent components" of natural scenes are edge filters. Vision Research, 37(23):3327–3338, December 1997.

[11] B. Olshausen and D. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 1997.

[12] A. Hyvärinen, J. Hurri, and P. O. Hoyer. Natural Image Statistics: A Probabilistic Approach to Early Computational Vision (Computational Imaging and Vision). Springer, 2nd printing edition, 2009.

[13] D. J. Field. What is the goal of sensory coding? Neural Computation, 6(4):559–601, July 1994.

[14] B. Willmore and D. J. Tolhurst. Characterizing the sparseness of neural codes. Network, 12(3):255–270, January 2001.

[15] O. Schwartz and E. P. Simoncelli. Natural signal statistics and sensory gain control. Nature Neuroscience, 4:819–825, 2001.

[16] M. A. Ranzato, C. Poultney, S. Chopra, and Y. LeCun. Efficient learning of sparse representations with an energy-based model. In NIPS, 2006.

[17] A. Coates, H. Lee, and A. Y. Ng. An analysis of single-layer networks in unsupervised feature learning. In AISTATS, 2011.

[18] A. Treves and E. Rolls. What determines the capacity of autoassociative memories in the brain? Network: Computation in Neural Systems, 2:371–397(27), 1991.

[19] Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy Layer-Wise training of deep networks. In NIPS, 2006.

[20] M. Schmidt. minFunc. http://www.cs.ubc.ca/~schmidtm/Software/minFunc.html, 2005.

[21] M. Ranzato and G. E. Hinton. Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines. In CVPR, 2010.

[22] U. Köster and A. Hyvärinen. A two-layer model of natural stimuli estimated with score matching. Neural Computation, 22(9):2308–2333, 2010.

[23] A. Saxe, M. Bhand, Z. Chen, P.W. Koh, B. Suresh, and A.Y. Ng. On random weights and unsupervised feature learning. In ICML, 2011.

[24] S. Petrov, A. Pauls, and D. Klein. Learning structured models for phone recognition. In Proc. of EMNLP-CoNLL, 2007.

[25] F. Sha and L. K. Saul. Large margin Gaussian mixture modeling for phonetic classification and recognition. In ICASSP, 2006.

[26] D. Yu, L. Deng, and A. Acero. Hidden conditional random field with distribution constraints for phone classification. In Interspeech, 2009.

[27] H. A. Chang and J. R. Glass. Hierarchical large-margin Gaussian mixture models for phonetic classification. In IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pages 272–277, 2007.

[28] W. M. Fisher, G. R. Doddington, and K. M. Goudie-Marshall. The DARPA speech recognition research database: specifications and status. 1986.

[29] P. Clarkson and P. J. Moreno. On the use of support vector machines for phonetic classification. In ICASSP, volume 2, pages 585–588, 1999.

[30] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In ICML, 2009.

[31] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

[32] M. Wainwright, O. Schwartz, and E. Simoncelli. Natural image statistics and divisive normalization: Modeling nonlinearity and adaptation in cortical neurons, 2001.

[33] K. Jarrett, K. Kavukcuoglu, M. Ranzato, and Y. LeCun. What is the best multi-stage architecture for object recognition? In ICCV, 2009.

[34] N. Pinto, D. D. Cox, and J. J. DiCarlo. Why is real-world visual object recognition hard? PLoS Computational Biology, 4(1):e27, January 2008.

[35] P. O. Hoyer. Non-negative matrix factorization with sparseness constraints. JMLR, 5:1457–1469, 2004.