
303 nips-2012-Searching for objects driven by context

Source: pdf

Author: Bogdan Alexe, Nicolas Heess, Yee W. Teh, Vittorio Ferrari

Abstract: The dominant visual search paradigm for object class detection is sliding windows. Although simple and effective, it is also wasteful, unnatural and rigidly hardwired. We propose strategies to search for objects which intelligently explore the space of windows by making sequential observations at locations decided based on previous observations. Our strategies adapt to the class being searched and to the content of a particular test image, exploiting context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set. In addition to being more elegant than sliding windows, our strategies evaluate two orders of magnitude fewer windows while achieving higher object detection performance, as we demonstrate experimentally on the PASCAL VOC 2010 dataset.
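The sequential idea in the abstract (observe a window, use what was seen to decide where to look next) can be illustrated with a toy sketch. This is not the authors' algorithm: the function name `context_driven_search`, the `observe` callback, the Gaussian voting scheme and the `sigma` parameter are all assumptions made for illustration. Here `observe` stands in for a learned model that maps a window's appearance to a guess of the object's relative location.

```python
import numpy as np

def context_driven_search(observe, grid_shape, start, max_steps, sigma=1.5):
    """Toy sequential search over a grid of candidate window positions.

    observe(pos) -> (dy, dx) is a hypothetical callback: a guess of the
    displacement from the observed window to the object. In a real system
    this would come from training data; here it is supplied by the caller.
    """
    h, w = grid_shape
    ys, xs = np.mgrid[0:h, 0:w]
    belief = np.zeros(grid_shape)              # accumulated votes per position
    visited = np.zeros(grid_shape, dtype=bool)
    pos = start
    path = []
    for _ in range(max_steps):
        visited[pos] = True
        path.append(pos)
        dy, dx = observe(pos)
        # Vote: add a Gaussian bump where this observation places the object.
        ty, tx = pos[0] + dy, pos[1] + dx
        belief += np.exp(-((ys - ty) ** 2 + (xs - tx) ** 2) / (2 * sigma ** 2))
        # Next observation: the unvisited position with the highest belief.
        masked = np.where(visited, -np.inf, belief)
        if not np.isfinite(masked).any():
            break                              # every position was observed
        flat = np.argmax(masked)
        pos = tuple(int(i) for i in np.unravel_index(flat, grid_shape))
    best_flat = np.argmax(belief)
    best = tuple(int(i) for i in np.unravel_index(best_flat, grid_shape))
    return best, path

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = (12, 17)                          # hidden object position (toy)

    def noisy_observe(pos):
        # Hypothetical observation model: true displacement plus noise.
        n = rng.normal(0.0, 1.0, size=2)
        return target[0] - pos[0] + n[0], target[1] - pos[1] + n[1]

    best, path = context_driven_search(noisy_observe, (20, 30), (0, 0), 15)
    print("estimate:", best, "windows evaluated:", len(path), "of", 20 * 30)
```

In this sketch the search observes at most `max_steps` positions rather than all 600 grid cells, which is the kind of saving over exhaustive sliding-window evaluation that the abstract describes, though the voting rule here is a deliberately simplified stand-in.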

reference text

[1] B. Alexe, T. Deselaers, and V. Ferrari. What is an object? In CVPR, 2010.

[2] J. Arpit, R. Saiprasad, and M. Anurag. Multi-stage contour based detection of deformable objects. In ECCV, 2008.

[3] H. Bay, A. Ess, T. Tuytelaars, and L. van Gool. SURF: Speeded up robust features. CVIU, 110(3):346–359, 2008.

[4] L. Bazzani, N. de Freitas, H. Larochelle, V. Murino, and J. Ting. Learning attentional policies for tracking and recognition in video with deep networks. In ICML, 2011.

[5] N. J. Butko and J. R. Movellan. Optimal scanning for faster object detection. In CVPR, 2009.

[6] M. Choi, J. Lim, A. Torralba, and A. Willsky. Exploiting hierarchical context on a large database of object categories. In CVPR, 2010.

[7] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.

[8] C. Desai, D. Ramanan, and C. Fowlkes. Discriminative models for multi-class object layout. In ICCV, 2009.

[9] W. Einhauser and P. Konig. Does luminance-contrast contribute to a saliency map for overt visual attention? European Journal of Neuroscience, 17(5):1089–1097, 2003.

[10] M. Everingham et al. The PASCAL Visual Object Classes Challenge 2010 Results, 2010.

[11] P. Felzenszwalb, R. Girshick, and D. McAllester. Cascade object detection with deformable part models. In CVPR, 2010.

[12] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. IEEE Trans. on PAMI, 32(9):1627–1645, 2010.

[13] D. Gao and N. Vasconcelos. Bottom-up saliency is a discriminant process. In ICCV, 2007.

[14] Y. Gong and S. Lazebnik. Iterative quantization: A procrustean approach to learning binary codes. In CVPR, 2011.

[15] H. Harzallah, F. Jurie, and C. Schmid. Combining efficient object localization and image classification. In ICCV, 2009.

[16] G. Heitz and D. Koller. Learning spatial context: Using stuff to find things. In ECCV, 2008.

[17] X. Hou and L. Zhang. Saliency detection: A spectral residual approach. In CVPR, 2007.

[18] L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. on PAMI, 20(11):1254–1259, 1998.

[19] G. Krieger, I. Rentschler, G. Hauske, K. Schill, and C. Zetzsche. Object and scene analysis by saccadic eye-movements: an investigation with higher-order statistics. Spatial Vision, 2(16):201–214, 2000.

[20] C. H. Lampert, M. B. Blaschko, and T. Hofmann. Beyond sliding windows: Object localization by efficient subwindow search. In CVPR, 2008.

[21] H. Larochelle and G. E. Hinton. Learning to combine foveal glimpses with a third-order Boltzmann machine. In NIPS, 2010.

[22] B. Leibe, A. Leonardis, and B. Schiele. Combined object categorization and segmentation with an implicit shape model. In Workshop on Statistical Learning in Computer Vision, ECCV, May 2004.

[23] S. Maji, A. Berg, and J. Malik. Classification using intersection kernel support vector machines is efficient. In CVPR, 2008.

[24] J. Najemnik and W. S. Geisler. Optimal eye movement strategies in visual search. Nature, 434:381–391, 2005.

[25] L. Paletta, G. Fritz, and C. Seifert. Q-learning of sequential attention for visual object recognition from informative local descriptors. In ICML, 2005.

[26] M. Pedersoli, A. Vedaldi, and J. Gonzales. A coarse-to-fine approach for fast deformable object detection. In CVPR, 2011.

[27] A. Prest, C. Leistner, J. Civera, C. Schmid, and V. Ferrari. Learning object class detectors from weakly annotated video. In CVPR, 2012.

[28] A. Rabinovich, A. Vedaldi, C. Galleguillos, E. Wiewiora, and S. Belongie. Objects in context. In ICCV, 2007.

[29] K. Van de Sande, J. Uijlings, T. Gevers, and A. Smeulders. Segmentation as selective search for object recognition. In ICCV, 2011.

[30] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009.

[31] P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In CVPR, 2001.