
66 iccv-2013-Building Part-Based Object Detectors via 3D Geometry

Source: pdf

Author: Abhinav Shrivastava, Abhinav Gupta

Abstract: This paper proposes a novel part-based representation for modeling object categories. Our representation combines the effectiveness of deformable part-based models with the richness of geometric representation by defining parts based on consistent underlying 3D geometry. Our key hypothesis is that while the appearance and the arrangement of parts might vary across the instances of object categories, the constituent parts will still have consistent underlying 3D geometry. We propose to learn this geometry-driven deformable part-based model (gDPM) from a set of labeled RGBD images. We also demonstrate how the geometric representation of gDPM can help us leverage depth data during training and constrain the latent model learning problem. But most importantly, a joint geometric and appearance based representation not only allows us to achieve state-of-the-art results on object detection but also allows us to tackle the grand challenge of understanding 3D objects from 2D images.

reference text

[1] H. Azizpour and I. Laptev. Object detection using strongly-supervised deformable part models. In ECCV, 2012. 1, 2, 5

[2] L. Bo, K. Lai, X. Ren, and D. Fox. Object recognition with hierarchical kernel descriptors. In CVPR, 2011. 2

[3] L. Bourdev and J. Malik. Poselets: Body part detectors trained using 3D human pose annotations. In ICCV, 2009. 2

[4] S. Branson, S. Belongie, and P. Perona. Strong supervision from weak annotation: Interactive training of deformable part models. In ICCV, 2011. 2

[5] R. Brooks. Symbolic reasoning among 3D models and 2D images. Artificial Intelligence, 1981. 2

[6] R. Brooks, R. Greiner, and T. Binford. The ACRONYM model-based vision system. IJCAI, 1978. 2

[7] X. Chen, A. Shrivastava, and A. Gupta. NEIL: Extracting visual knowledge from web data. In ICCV, 2013. 2

[8] H. Chiu, L. Kaelbling, and T. Lozano-Perez. Virtual training for multi-view object class recognition. In CVPR, 2007. 2

[9] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, 2005. 2

[10] S. Divvala, A. Efros, and M. Hebert. How important are ’deformable parts’ in the deformable parts model? In ECCV Parts and Attributes Workshop, 2012. 1, 2

[11] I. Endres, V. Srikumar, M.-W. Chang, and D. Hoiem. Learning shared body-plans. In CVPR, 2012. 2, 5

[12] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman. The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results, 2005. 1

[13] P. Felzenszwalb, R. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part based models. In TPAMI, 2010. 1, 2, 4, 5

[14] S. Fidler, S. Dickinson, and R. Urtasun. 3D object detection and viewpoint estimation with a deformable 3D cuboid model. In NIPS, 2012. 2

[15] D. Fouhey, A. Gupta, and M. Hebert. Data-driven 3D primitives for single image understanding. In ICCV, 2013. 2

[16] D. Glasner, M. Galun, S. Alpert, R. Basri, and G. Shakhnarovich. Viewpoint-aware object detection and pose estimation. In ICCV, 2011. 2

[17] C. Gu and X. Ren. Discriminative mixture-of-templates for view-point classification. In ECCV, 2010. 2

[18] V. Hedau, D. Hoiem, and D. Forsyth. Thinking inside the box: Using appearance models and context based on room geometry. In ECCV, 2010. 2

[19] D. Hoiem, A. A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 2007. 6

[20] K. Lai, L. Bo, X. Ren, and D. Fox. A scalable tree-based approach for joint object and pose recognition. In AAAI, 2011. 2

[21] K. Lai, L. Bo, X. Ren, and D. Fox. Sparse distance learning for object recognition combining RGB and depth information. In ICRA, 2011. 2

[22] J. Lim, R. Salakhutdinov, and A. Torralba. Transfer learning by borrowing examples for multiclass object detection. In NIPS, 2011. 2

[23] D. Lowe. Three-dimensional object recognition from single two-dimensional images. Artificial Intelligence, 1987. 2

[24] D. Marr and H. Nishihara. Representation and recognition of the spatial organization of three-dimensional shapes. Proc. Roy. Soc., 1978. 2

[25] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from RGBD images. In ECCV, 2012. 3, 6

[26] B. Pepik, M. Stark, P. Gehler, and B. Schiele. Teaching 3D geometry to deformable part models. In CVPR, 2012. 2

[27] S. Savarese and L. Fei-Fei. 3D generic object categorization, localization and pose estimation. In ICCV, 2007. 2

[28] S. Singh, A. Gupta, and A. Efros. Unsupervised discovery of mid-level discriminative patches. In ECCV, 2012. 2

[29] M. Stark, M. Goesele, and B. Schiele. Back to the future: Learning shape models from 3D CAD data. In BMVC, 2010. 2

[30] M. Sun, H. Su, S. Savarese, and L. Fei-Fei. A multi-view probabilistic model for 3D object classes. In CVPR, 2009. 2

[31] Y. Xiang and S. Savarese. Estimating the aspect layout of object categories. In CVPR, 2012. 2

[32] Y. Yang and D. Ramanan. Articulated pose estimation with flexible mixtures-of-parts. In CVPR, 2011. 2, 5

[33] X. Zhu, C. Vondrick, D. Ramanan, and C. Fowlkes. Do we need more training data or better models for object detection? In BMVC, 2012. 1