nips nips2002 nips2002-2 nips2002-2-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: David B. Grimes, Rajesh P. Rao
Abstract: Recent algorithms for sparse coding and independent component analysis (ICA) have demonstrated how localized features can be learned from natural images. However, these approaches do not take image transformations into account. As a result, they produce image codes that are redundant because the same feature is learned at multiple locations. We describe an algorithm for sparse coding based on a bilinear generative model of images. By explicitly modeling the interaction between image features and their transformations, the bilinear approach helps reduce redundancy in the image code and provides a basis for transformationinvariant vision. We present results demonstrating bilinear sparse coding of natural images. We also explore an extension of the model that can capture spatial relationships between the independent features of an object, thereby providing a new framework for parts-based object recognition.
[1] F. Attneave. Some informational aspects of visual perception. Psychological Review, 61(3):183–193, 1954.
[2] H. B. Barlow. Possible principles underlying the transformation of sensory messages. In W. A. Rosenblith, editor, Sensory Communication, pages 217–234. Cambridge, MA: MIT Press, 1961.
[3] A. J. Bell and T. J. Sejnowski. The ‘independent components’ of natural scenes are edge filters. Vision Research, 37(23):3327–3338, 1997.
[4] G. E. Hinton and Z. Ghahramani. Generative models for discovering sparse distributed representations. Philosophical Transactions Royal Society B, 352(1177– 1190), 1997.
[5] M. S. Lewicki and T. J. Sejnowski. Learning overcomplete representations. Neural Computation, 12(2):337–365, 2000.
[6] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607–609, 1996.
[7] B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:33113325, 1997.
[8] R. P. N. Rao and D. H. Ballard. Development of localized oriented receptive fields by learning a translation-invariant code for natural images. Network: Computation in Neural Systems, 9(2):219–234, 1998.
[9] R. P. N. Rao and D. H. Ballard. Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive field effects. Nature Neuroscience, 2(1):79–87, 1999.
[10] R. P. N. Rao and D. L. Ruderman. Learning Lie groups for invariant visual perception. In Advances in Neural Information Processing Systems 11, pages 810–816. Cambridge, MA: MIT Press, 1999.
[11] O. Schwartz and E. P. Simoncelli. Natural signal statistics and sensory gain control. Nature Neuroscience, 4(8):819–825, August 2001.
[12] J. B. Tenenbaum and W. T. Freeman. Separating style and content with bilinear models. Neural Computation, 12(6):1247–1283, 2000.