
NIPS 2006, Paper 8: A Nonparametric Approach to Bottom-Up Visual Saliency



Author: Wolf Kienzle, Felix A. Wichmann, Matthias O. Franz, Bernhard Schölkopf

Abstract: This paper addresses the bottom-up influence of local image information on human eye movements. Most existing computational models use a set of biologically plausible linear filters, e.g., Gabor or Difference-of-Gaussians filters, as a front-end, the outputs of which are nonlinearly combined into a real number that indicates visual saliency. Unfortunately, this requires many design parameters, such as the number, type, and size of the front-end filters, as well as the choice of nonlinearities, weighting, and normalization schemes, for which biological plausibility cannot always be justified. As a result, these parameters have to be chosen in a more or less ad hoc way. Here, we propose to learn a visual saliency model directly from human eye movement data. The model is rather simplistic and essentially parameter-free, and therefore contrasts recent developments in the field that usually aim at higher prediction rates at the cost of additional parameters and increasing model complexity. Experimental results show that, despite the lack of any biological prior knowledge, our model performs comparably to existing approaches, and in fact learns image features that resemble findings from several previous studies. In particular, its maximally excitatory stimuli have center-surround structure, similar to receptive fields in the early human visual system.
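The learned-from-data approach the abstract describes can be sketched in a few lines: collect image patches at fixated and non-fixated locations, train a kernel classifier on the raw gray values, and use its real-valued decision function as the saliency score. This is a minimal illustration only, assuming a Gaussian-kernel SVM on synthetic data; the patch size, sampling scheme, and `saliency` helper here are illustrative choices, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def extract_patches(image, points, size=13):
    """Extract square gray-value patches centered at (row, col) points."""
    r = size // 2
    patches = []
    for y, x in points:
        patches.append(image[y - r:y + r + 1, x - r:x + r + 1].ravel())
    return np.asarray(patches, dtype=float)

# Synthetic stand-ins for a natural image and recorded fixation data.
image = rng.random((64, 64))
fixated = rng.integers(10, 54, size=(40, 2))      # positive examples
non_fixated = rng.integers(10, 54, size=(40, 2))  # negative examples

X = np.vstack([extract_patches(image, fixated),
               extract_patches(image, non_fixated)])
y = np.concatenate([np.ones(40), -np.ones(40)])

# Soft-margin SVM with a Gaussian (RBF) kernel; the signed distance to
# the separating hyperplane serves directly as a saliency score.
model = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

def saliency(patch_vectors):
    """Real-valued saliency of one or more flattened patches."""
    return model.decision_function(patch_vectors)
```

Because the score is the classifier's decision function rather than a hand-designed filter response, the only genuine design choice left is the kernel, which is why the abstract can describe the model as essentially parameter-free.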


References

[1] R. J. Baddeley and B. W. Tatler. High frequency edges (but not contrast) predict where we fixate: A Bayesian system identification analysis. Vision Research, 46(18):2824–2833, 2006.

[2] C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121–167, 1998.

[3] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11):1254–1259, 1998.

[4] L. Itti and C. Koch. A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Research, 40(10-12):1489–1506, 2000.

[5] L. Itti. Quantifying the contribution of low-level saliency to human eye movements in dynamic scenes. Visual Cognition, 12(6):1093–1123, 2005.

[6] L. Itti. Quantitative modeling of perceptual salience at human eye position. Visual Cognition (in press), 2006.

[7] S. S. Keerthi and C. J. Lin. Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Computation, 15:1667–1689, 2003.

[8] W. Kienzle, F. A. Wichmann, B. Schölkopf, and M. O. Franz. Learning an interest operator from human eye movements. In Beyond Patches Workshop, International Conference on Computer Vision and Pattern Recognition, 2006.

[9] C. Koch and S. Ullman. Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4(4):219–227, 1985.

[10] G. Krieger, I. Rentschler, G. Hauske, K. Schill, and C. Zetzsche. Object and scene analysis by saccadic eye-movements: an investigation with higher-order statistics. Spatial Vision, 3(2,3):201–214, 2000.

[11] S. W. Kuffler. Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16(1):37–68, 1953.

[12] S. K. Mannan, K. H. Ruddock, and D. S. Wooding. The relationship between the locations of spatial features and those of fixations made during visual examination of briefly presented images. Spatial Vision, 10(3):165–188, 1996.

[13] D. J. Parkhurst, K. Law, and E. Niebur. Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42(1):107–123, 2002.

[14] D. J. Parkhurst and E. Niebur. Scene content selected by active vision. Spatial Vision, 16(2):125–154, 2003.

[15] R. J. Peters, A. Iyer, C. Koch, and L. Itti. Components of bottom-up gaze allocation in natural scenes (poster). In Vision Sciences Society (VSS) Annual Meeting, 2005.

[16] C. M. Privitera and L. W. Stark. Algorithms for defining visual regions-of-interest: Comparison with eye fixations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(9):970–982, 2000.

[17] R. Raj, W. S. Geisler, R. A. Frazor, and A. C. Bovik. Contrast statistics for foveated visual systems: Fixation selection by minimizing contrast entropy. Journal of the Optical Society of America A, 22(10):2039–2049, 2005.

[18] P. Reinagel and A. M. Zador. Natural scene statistics at the center of gaze. Network: Computation in Neural Systems, 10(4):341–350, 1999.

[19] L. W. Renninger, J. Coughlan, P. Verghese, and J. Malik. An information maximization model of eye movements. In Advances in Neural Information Processing Systems, volume 17, pages 1121–1128, 2005.

[20] I. Steinwart. On the influence of the kernel on the consistency of support vector machines. Journal of Machine Learning Research, 2:67–93, 2001.

[22] D. Walther. Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics. PhD thesis, California Institute of Technology, 2006.