
377 iccv-2013-Segmentation Driven Object Detection with Fisher Vectors

Source: pdf

Author: Ramazan Gokberk Cinbis, Jakob Verbeek, Cordelia Schmid

Abstract: We present an object detection system based on the Fisher vector (FV) image representation computed over SIFT and color descriptors. For computational and storage efficiency, we use a recent segmentation-based method to generate class-independent object detection hypotheses, in combination with data compression techniques. Our main contribution is a method to produce tentative object segmentation masks to suppress background clutter in the features. Re-weighting the local image features based on these masks is shown to improve object detection significantly. We also exploit contextual features in the form of a full-image FV descriptor, and an inter-category rescoring mechanism. Our experiments on the PASCAL VOC 2007 and 2010 datasets show that our detector improves over the current state-of-the-art detection results.

reference text

[1] B. Alexe, T. Deselaers, and V. Ferrari. Measuring the objectness of image windows. PAMI, 34(11):2189–2202, 2012.

[2] F. Alted. Why modern CPUs are starving and what can be done about it. Computing in Science & Engineering, 12(2):68–71, 2010.

[3] J. Carreira, R. Caseiro, J. Batista, and C. Sminchisescu. Semantic segmentation with second-order pooling. In ECCV, 2012.

[4] K. Chatfield, V. Lempitsky, A. Vedaldi, and A. Zisserman. The devil is in the details: an evaluation of recent feature encoding methods. In BMVC, 2011.

[5] G. Chen, Y. Ding, J. Xiao, and T. X. Han. Detection evolution with multi-order contextual co-occurrence. In CVPR, 2013.

[6] Q. Chen, Z. Song, R. Feris, A. Datta, L. Cao, Z. Huang, and S. Yan. Efficient maximum appearance search for large-scale object detection. In CVPR, 2013.

[7] R. Cinbis, J. Verbeek, and C. Schmid. Image categorization using Fisher kernels of non-iid image models. In CVPR, 2012.

[8] S. Clinchant, G. Csurka, F. Perronnin, and J.-M. Renders. XRCE’s participation to ImagEval. In ImageEval workshop at CVIR, 2007.

[9] Q. Dai and D. Hoiem. Learning to localize detected objects. In CVPR, 2012.

[10] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.

[11] I. Endres and D. Hoiem. Category independent object proposals. In ECCV, 2010.

[12] M. Everingham, L. van Gool, C. Williams, J. Winn, and A. Zisserman. The Pascal Visual Object Classes (VOC) challenge. IJCV, 88(2):303–338, June 2010.

[13] R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. LIBLINEAR: a library for large linear classification. JMLR, 9:1871–1874, 2008.

[14] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. PAMI, 32(9), 2010.

[15] S. Fidler, R. Mottaghi, A. Yuille, and R. Urtasun. Bottom-up segmentation for top-down detection. In CVPR, 2013.

[16] R. Girshick, P. Felzenszwalb, and D. McAllester. Discriminatively trained deformable part models, release 5. http://people.cs.uchicago.edu/~rbg/latent-release5, 2012.

[17] C. Gu, P. Arbeláez, Y. Lin, K. Yu, and J. Malik. Multicomponent models for object detection. In ECCV, 2012.

[18] C. Gu, J. Lim, P. Arbeláez, and J. Malik. Recognition using regions. In CVPR, 2009.

[19] B. Hariharan, J. Malik, and D. Ramanan. Discriminative decorrelation for clustering and classification. In ECCV, 2012.

[20] H. Harzallah, F. Jurie, and C. Schmid. Combining efficient object localization and image classification. In ICCV, 2009.

[21] H. Jégou, M. Douze, and C. Schmid. Product quantization for nearest neighbor search. PAMI, 33(1):117–128, 2011.

[22] F. Khan, R. Anwer, J. van de Weijer, A. Bagdanov, M. Vanrell, and A. Lopez. Color attributes for object detection. In CVPR, 2012.

[23] F. Khan, J. van de Weijer, and M. Vanrell. Top-down color attention for object recognition. In ICCV, 2009.

[24] C. Lampert, M. Blaschko, and T. Hofmann. Efficient subwindow search: a branch and bound framework for object localization. PAMI, 31(12):2129–2142, 2009.

[25] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.

[26] D. Oneata, J. Verbeek, and C. Schmid. Action and event recognition with Fisher vectors on a compact feature set. In ICCV, 2013.

[27] O. Parkhi, A. Vedaldi, C. Jawahar, and A. Zisserman. The truth about cats and dogs. In ICCV, 2011.

[28] D. Ramanan. Using segmentation to verify object hypotheses. In CVPR, 2007.

[29] J. Sánchez and F. Perronnin. High-dimensional signature compression for large-scale image classification. In CVPR, 2011.

[30] J. Sánchez, F. Perronnin, and T. de Campos. Modeling the spatial layout of images beyond spatial pyramids. Pattern Recognition Letters, 33(16):2216–2223, 2012.

[31] J. Sánchez, F. Perronnin, T. Mensink, and J. Verbeek. Image classification with the Fisher vector: Theory and practice. IJCV, 105(3):222–245, 2013.

[32] X. Song, T. Wu, Y. Jia, and S.-C. Zhu. Discriminatively trained and-or tree models for object detection. In CVPR, 2013.

[33] Z. Song, Q. Chen, Z. Huang, Y. Hua, and S. Yan. Contextualizing object detection and classification. In CVPR, 2011.

[34] K. van de Sande, J. Uijlings, T. Gevers, and A. Smeulders. Segmentation as selective search for object recognition. In ICCV, 2011.

[35] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009.

[36] A. Vedaldi and A. Zisserman. Sparse kernel approximations for efficient classification and detection. In CVPR, 2012.

[37] P. Viola and M. Jones. Robust real-time object detection. IJCV, 57(2):137–154, 2004.

[38] L. Wang, J. Shi, G. Song, and I.-F. Shen. Object detection combining recognition and segmentation. In ACCV, 2007.