349 iccv-2013-Regionlets for Generic Object Detection

Source: pdf

Author: Xiaoyu Wang, Ming Yang, Shenghuo Zhu, Yuanqing Lin

Abstract: Generic object detection is confronted by dealing with different degrees of variations in distinct object classes with tractable computations, which demands for descriptive and flexible object representations that are also efficient to evaluate for many locations. In view of this, we propose to model an object class by a cascaded boosting classifier which integrates various types of features from competing local regions, named as regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e., size and aspect ratio). These regionlets are organized in small groups with stable relative positions to delineate fine-grained spatial layouts inside objects. Their features are aggregated to a one-dimensional feature within one group so as to tolerate deformations. Then we evaluate the object bounding box proposal in selective search from segmentation cues, limiting the evaluation locations to thousands. Our approach significantly outperforms the state-of-the-art on popular multi-class detection benchmark datasets with a single method, without any contexts. It achieves the detection mean average precision of 41.7% on the PASCAL VOC 2007 dataset and 39.7% on the VOC 2010 for 20 object categories. It achieves 14.7% mean average precision on the ImageNet dataset for 200 object categories, outperforming the latest deformable part-based model (DPM) by 4.7%.
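The regionlet definition in the abstract (regions defined proportionally to the detection window, grouped, and aggregated into one scalar per group) can be illustrated with a minimal sketch. This is not the authors' implementation. The helper names `to_absolute`, `mean_intensity`, and `group_feature` are hypothetical, the mean-intensity feature is only a stand-in for real descriptors, and max pooling is an assumed aggregation rule, since the abstract only says the features are "aggregated to a one-dimensional feature".

```python
from typing import List, Sequence, Tuple

# (left, top, right, bottom); regionlet boxes are fractions of the window.
Box = Tuple[float, float, float, float]
PixelBox = Tuple[int, int, int, int]


def to_absolute(regionlet: Box, window: Box) -> PixelBox:
    """Map a window-relative regionlet to pixel coordinates.

    Because the regionlet is stored as fractions, the same definition
    applies to a detection window of any size or aspect ratio.
    """
    wl, wt, wr, wb = window
    width, height = wr - wl, wb - wt
    l, t, r, b = regionlet
    return (
        int(round(wl + l * width)),
        int(round(wt + t * height)),
        int(round(wl + r * width)),
        int(round(wt + b * height)),
    )


def mean_intensity(image: Sequence[Sequence[float]], box: PixelBox) -> float:
    """Stand-in feature (hypothetical): mean pixel value inside the box."""
    l, t, r, b = box
    values = [image[y][x] for y in range(t, b) for x in range(l, r)]
    return sum(values) / len(values) if values else 0.0


def group_feature(
    image: Sequence[Sequence[float]], window: Box, group: List[Box]
) -> float:
    """Aggregate one group of regionlets into a single scalar.

    Max pooling is an assumption of this sketch, not a detail taken
    from the abstract.
    """
    return max(mean_intensity(image, to_absolute(rl, window)) for rl in group)
```

Calling `group_feature` with the same `group` on two windows of different size reuses one regionlet definition at both resolutions, which is the property the abstract highlights. Taking a single value over the group lets the response survive when the discriminative part shifts between neighbouring regionlets.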

reference text

[1] T. Ahonen, A. Hadid, and M. Pietikäinen. Face recognition with local binary patterns. In ECCV, 2004.

Footnote 1: We thank our intern Miao Sun from University of Missouri for evaluating the DPM performance on ImageNet.

[2] B. Alexe, T. Deselaers, and V. Ferrari. What is an object? In CVPR, 2010.

[3] B. Alexe, T. Deselaers, and V. Ferrari. Measuring the objectness of image windows. IEEE T-PAMI, 2012.

[4] R. Benenson, M. Mathias, R. Timofte, and L. Van Gool. Pedestrian detection at 100 frames per second. In CVPR, 2012.

[5] R. G. Cinbis and S. Sclaroff. Contextual object detection using set-based classification. In ECCV, 2012.

[6] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005.

[7] C. Desai, D. Ramanan, and C. Fowlkes. Discriminative models for multi-class object layout. In ICCV, 2009.

[8] Y. Ding and J. Xiao. Contextual boost for pedestrian detection. In CVPR, 2012.

[9] P. Dollár, R. Appel, and W. Kienzle. Crosstalk cascades for frame-rate pedestrian detection. In ECCV, 2012.

[10] M. Everingham, L. van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2010 Results. http://www.pascal-network.org/challenges/VOC/voc2010/workshop/index.html.

[11] P. Felzenszwalb, R. Girshick, and D. McAllester. Cascade object detection with deformable part models. In CVPR, 2010.

[12] P. Felzenszwalb, D. McAllester, and D. Ramanan. A discriminatively trained, multiscale, deformable part model. In CVPR, 2008.

[13] H. Harzallah, F. Jurie, and C. Schmid. Combining efficient object localization and image classification. In ICCV, 2009.

[14] C. Huang, H. Ai, B. Wu, and S. Lao. Boosting nested cascade detector for multi-view face detection. In ICPR, 2004.

[15] C. H. Lampert. An efficient divide-and-conquer cascade for nonlinear object detection. In CVPR, 2010.

[16] C. H. Lampert, M. B. Blaschko, and T. Hofmann. Beyond sliding windows: object localization by efficient subwindow search. In CVPR, 2008.

[17] S. Lazebnik, C. Schmid, and J. Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In CVPR, 2006.

[18] C. Li, D. Parikh, and T. Chen. Extracting adaptive contextual cues from unlabeled regions. In ICCV, 2011.

[19] M. Pedersoli, A. Vedaldi, and J. Gonzalez. A coarse-to-fine approach for fast deformable object detection. In CVPR, 2011.

[20] E. Rahtu, J. Kannala, and M. Blaschko. Learning a category independent object detection cascade. In ICCV, 2011.

[21] O. Russakovsky, J. Deng, J. Krause, A. Berg, and L. Fei-Fei. Large scale visual recognition challenge 2013 (ILSVRC2013). http://www.image-net.org/challenges/LSVRC/2013/, 2013.

[22] R. E. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, 1999.

[23] Z. Song, Q. Chen, Z. Huang, Y. Hua, and S. Yan. Contextualizing object detection and classification. In CVPR, 2011.

[24] O. Tuzel, F. Porikli, and P. Meer. Human detection via classification on Riemannian manifolds. In CVPR, 2007.

[25] K. E. A. Van de Sande, J. R. R. Uijlings, T. Gevers, and A. W. M. Smeulders. Segmentation as selective search for object recognition. In ICCV, 2011.

[26] P. Viola and M. Jones. Robust real-time object detection. IJCV, 2001.

[27] X. Wang, X. Bai, W. Liu, and L. J. Latecki. Feature context for image classification and object detection. In CVPR, 2011.

[28] X. Wang, T. X. Han, and S. Yan. An HOG-LBP human detector with partial occlusion handling. In ICCV, 2009.

[29] J. Zhang, K. Huang, Y. Yu, and T. Tan. Boosted local structured HOG-LBP for object localization. In CVPR, 2011.

[30] Z. Zhang, J. Warrell, and P. H. S. Torr. Proposal generation for object detection using cascaded ranking SVMs. In CVPR, 2011.

[31] L. Zhu, Y. Chen, A. Yuille, and W. Freeman. Latent hierarchical structural learning for object detection. In CVPR, 2010.