
118 nips-2008-Learning Transformational Invariants from Natural Movies

Source: pdf

Author: Charles Cadieu, Bruno A. Olshausen

Abstract: We describe a hierarchical, probabilistic model that learns to extract complex motion from movies of the natural environment. The model consists of two hidden layers: the first layer produces a sparse representation of the image that is expressed in terms of local amplitude and phase variables. The second layer learns the higher-order structure among the time-varying phase variables. After training on natural movies, the top layer units discover the structure of phase-shifts within the first layer. We show that the top layer units encode transformational invariants: they are selective for the speed and direction of a moving pattern, but are invariant to its spatial structure (orientation/spatial-frequency). The diversity of units in both the intermediate and top layers of the model provides a set of testable predictions for representations that might be found in V1 and MT. In addition, the model demonstrates how feedback from higher levels can influence representations at lower levels as a by-product of inference in a graphical model.

reference text

[1] W. Einhäuser, C. Kayser, P. König, and K.P. Körding. Learning the invariance properties of complex cells from their responses to natural stimuli. European Journal of Neuroscience, 15(3):475–486, 2002.

[2] Y. Karklin and M.S. Lewicki. A hierarchical Bayesian model for learning nonlinear statistical regularities in nonstationary natural signals. Neural Computation, 17(2):397–423, 2005.

[3] A. Hyvärinen, J. Hurri, and J. Väyrynen. Bubbles: a unifying framework for low-level statistical properties of natural image sequences. Journal of the Optical Society of America A, 20(7):1237–1252, 2003.

[4] G. Wallis and E.T. Rolls. Invariant face and object recognition in the visual system. Progress in Neurobiology, 51(2):167–194, 1997.

[5] Y. LeCun, F.J. Huang, and L. Bottou. Learning methods for generic object recognition with invariance to pose and lighting. Computer Vision and Pattern Recognition, 2004.

[6] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 411–426, 2007.

[7] S.J. Nowlan and T.J. Sejnowski. A selection model for motion processing in area MT of primates. Journal of Neuroscience, 15(2):1195–1214, 1995.

[8] K. Zhang, M. I. Sereno, and M. E. Sereno. Emergence of position-independent detectors of sense of rotation and dilation with Hebbian learning: An analysis. Neural Computation, 5(4):597–612, 1993.

[9] E.T. Rolls and S.M. Stringer. Invariant global motion recognition in the dorsal visual system: A unifying theory. Neural Computation, 19(1):139–169, 2007.

[10] D.B. Grimes and R.P.N. Rao. Bilinear sparse coding for invariant vision. Neural Computation, 17(1):47–73, 2005.

[11] B.A. Olshausen. Probabilistic Models of Perception and Brain Function, chapter Sparse codes and spikes, pages 257–272. MIT Press, 2002.

[12] E.P. Simoncelli and D.J. Heeger. A model of neuronal responses in visual area MT. Vision Research, 38(5):743–761, 1998.

[13] A. Hyvärinen and P. Hoyer. Emergence of phase- and shift-invariant features by decomposition of natural images into independent feature subspaces. Neural Computation, 12(7):1705–1720, 2000.

[14] E.H. Adelson and J.R. Bergen. Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A, 2(2):284–299, 1985.

[15] B.A. Olshausen and D.J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311–3325, 1997.

[16] A.J. Bell and T.J. Sejnowski. The independent components of natural images are edge filters. Vision Research, 37:3327–3338, 1997.

[17] C. Zetzsche, G. Krieger, and B. Wegmann. The atoms of vision: Cartesian or polar? Journal of the Optical Society of America A, 16(7):1554–1565, 1999.

[18] P. Földiák. Learning invariance from transformation sequences. Neural Computation, 3(2):194–200, 1991.

[19] L. Wiskott and T.J. Sejnowski. Slow feature analysis: Unsupervised learning of invariances. Neural Computation, 14(4):715–770, 2002.

[20] D.J. Fleet and A.D. Jepson. Computation of component image velocity from local phase information. International Journal of Computer Vision, 5:77–104, 1990.

[21] J.A. Movshon, E.H. Adelson, M.S. Gizzi, and W.T. Newsome. The analysis of moving visual patterns. Pattern Recognition Mechanisms, 54:117–151, 1985.

[22] T.S. Lee and D. Mumford. Hierarchical Bayesian inference in the visual cortex. Journal of the Optical Society of America A, 20(7):1434–1448, 2003.