
Detection Evolution with Multi-order Contextual Co-occurrence (CVPR 2013, paper 122)


Source: pdf

Author: Guang Chen, Yuanyuan Ding, Jing Xiao, Tony X. Han

Abstract: Context plays an increasingly important role in improving object detection performance. In this paper we propose an effective representation, Multi-Order Contextual co-Occurrence (MOCO), to implicitly model high-level context using solely the detection responses from a baseline object detector. The so-called (1st-order) context feature is computed as a set of randomized binary comparisons on the response map of the baseline object detector. Statistics of the 1st-order binary context features are then calculated to construct a high-order co-occurrence descriptor. Combining the MOCO feature with the original image feature, we can evolve the baseline object detector into a stronger, context-aware detector. With the updated detector, we can continue the evolution until the contextual improvements saturate. Using the successful deformable-part-model detector [13] as the baseline detector, we test the proposed MOCO evolution framework on the PASCAL VOC 2007 dataset [8] and the Caltech pedestrian dataset [7]. The proposed MOCO detector outperforms all known state-of-the-art approaches, contextually boosting deformable part models (ver. 5) [13] by 3.3% in mean average precision on the PASCAL 2007 dataset. On the Caltech pedestrian dataset, our method further reduces the log-average miss rate from 48% to 46% and the miss rate at 1 FPPI from 25% to 23%, compared with the best prior art [6].
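The 1st-order context feature and the co-occurrence step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single-scale 2D response map stored as a list of lists, and the function names (`sample_pairs`, `first_order_context`, `cooccurrence`), the square sampling neighbourhood, the border clamping, and the pairwise-AND statistic are all hypothetical choices made only for illustration.

```python
import random


def sample_pairs(num_pairs, radius, seed=0):
    """Draw random pairs of (dy, dx) offsets inside a square neighbourhood.

    Assumption: offsets are uniform in [-radius, radius]; the abstract does
    not specify the sampling pattern.
    """
    rng = random.Random(seed)

    def offset():
        return (rng.randint(-radius, radius), rng.randint(-radius, radius))

    return [(offset(), offset()) for _ in range(num_pairs)]


def first_order_context(response, y, x, pairs):
    """Binary comparisons of detector responses around location (y, x)."""
    h, w = len(response), len(response[0])

    def at(dy, dx):
        # Clamp to the map border so every comparison is defined (assumption).
        yy = min(max(y + dy, 0), h - 1)
        xx = min(max(x + dx, 0), w - 1)
        return response[yy][xx]

    return [1 if at(*a) > at(*b) else 0 for a, b in pairs]


def cooccurrence(bits):
    """Hypothetical 2nd-order statistic: 1 where two binary features both fire."""
    n = len(bits)
    return [bits[i] & bits[j] for i in range(n) for j in range(i + 1, n)]
```

A descriptor for one candidate location would then be the concatenation of `first_order_context(...)` and `cooccurrence(...)` of those bits, which a second-stage classifier could consume alongside the image feature. How the paper actually aggregates the statistics is not stated in the abstract.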


reference text

[1] S. Avidan. SpatialBoost: adding spatial reasoning to adaboost. In ECCV, 2006. 2

[2] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool. Speeded-up robust features (surf). Comput. Vis. Image Underst., 2008. 7

[3] M. Calonder, V. Lepetit, C. Strecha, and P. Fua. Brief: Binary robust independent elementary features. In ECCV, 2010. 2, 3

[4] P. Carbonetto, N. de Freitas, and K. Barnard. A statistical model for general contextual object recognition. In ECCV, 2004. 2

[5] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005. 2

[6] Y. Ding and J. Xiao. Contextual boost for pedestrian detection. In CVPR, 2012. 1, 2, 4, 5, 6, 7, 8

[7] P. Dollár, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. PAMI, 2011. 1, 2, 6, 7

[8] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The pascal visual object classes (voc) challenge. IJCV, 2010. 1, 2, 5, 7

[9] P. F. Felzenszwalb, R. B. Girshick, and D. McAllester. Cascade object detection with deformable part models. In CVPR, 2010. 7

[10] P. F. Felzenszwalb, R. B. Girshick, D. A. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. PAMI, 2010. 1, 4, 5, 6, 7

[11] Y. Freund. An adaptive version of the boost by majority algorithm. Machine Learning, 2001. 5

[12] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. Annals of Statistics, 2000. 5, 6

[13] R. B. Girshick, P. F. Felzenszwalb, and D. McAllester. Discriminatively trained deformable part models, release 5. http://people.cs.uchicago.edu/~rbg/latent-release5/. 1, 2, 4, 5, 6, 7, 8

[14] G. Heitz and D. Koller. Learning spatial context: Using stuff to find things. In ECCV, 2008. 2

[15] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. IJCV, 2008. 2

[16] P. Viola, M. J. Jones, and D. Snow. Detecting pedestrians using patterns of motion and appearance. In ICCV, 2003. 1

[17] T. Kobayashi. Higher-order co-occurrence features based on discriminative co-clusters for image classification. In BMVC, 2012. 2, 4

[18] T. Kobayashi and N. Otsu. Bag of hierarchical co-occurrence features for image classification. In ICPR, 2010. 2

[19] C. Li, D. Parikh, and T. Chen. Extracting adaptive contextual cues from unlabeled regions. In ICCV, 2011. 1, 3, 8

[20] H. Ling and S. Soatto. Proximity distribution kernels for geometric context in category recognition. In ICCV, 2007. 2

[21] K. Mikolajczyk, C. Schmid, and A. Zisserman. Human detection based on a probabilistic assembly of robust part detectors. In ECCV, 2004. 1

[22] M. Özuysal, M. Calonder, V. Lepetit, and P. Fua. Fast keypoint recognition using random ferns. PAMI, 2010. 2

[23] D. Ramanan. Using segmentation to verify object hypotheses. In CVPR, 2007. 2

[24] E. Rublee, V. Rabaud, K. Konolige, and G. R. Bradski. Orb: An efficient alternative to sift or surf. In ICCV, 2011. 2

[25] H. Schneiderman and T. Kanade. A statistical method for 3d object detection applied to faces and cars. In CVPR, 2000. 1

[26] Z. Song, Q. Chen, Z. Huang, Y. Hua, and S. Yan. Contextualizing object detection and classification. In CVPR, 2011. 1, 8

[27] A. Torralba. Contextual priming for object detection. IJCV, 2003. 2

[28] A. Torralba, K. P. Murphy, and W. T. Freeman. Contextual models for object detection using boosted random fields. In NIPS, 2004. 2

[29] Z. Tu and X. Bai. Auto-context and its application to high-level vision tasks and 3d brain image segmentation. PAMI, 2010. 2

[30] M. Varma and B. R. Babu. More generality in efficient multiple kernel learning. In ICML, 2009. 5

[31] A. Vedaldi, V. Gulshan, M. Varma, and A. Zisserman. Multiple kernels for object detection. In ICCV, 2009. 1, 5, 8

[32] P. Viola and M. Jones. Robust real-time face detection. IJCV, 2004. 1

[33] X. Wang, X. Han, and S. Yan. An hog-lbp human detector with partial occlusion handling. In ICCV, 2009. 7

[34] Y. Yang and S. Newsam. Spatial pyramid co-occurrence for image classification. In ICCV, 2011. 2

[35] J. Zhang, K. Huang, Y. Yu, and T. Tan. Boosted local structured hog-lbp for object localization. In CVPR, 2010. 1, 8

[36] L. Zhu, Y. Chen, A. L. Yuille, and W. T. Freeman. Latent hierarchical structural learning for object detection. In CVPR, 2010. 1, 8