iccv iccv2013 iccv2013-121 iccv2013-121-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Reyes Rios-Cabrera, Tinne Tuytelaars
Abstract: In this paper we propose a new method for detecting multiple specific 3D objects in real time. We start from the template-based approach based on the LINE2D/LINEMOD representation introduced recently by Hinterstoisser et al., yet extend it in two ways. First, we propose to learn the templates in a discriminative fashion. We show that this can be done online during the collection of the example images, in just a few milliseconds, and has a big impact on the accuracy of the detector. Second, we propose a scheme based on cascades that speeds up detection. Since detection of an object is fast, new objects can be added with very low cost, making our approach scale well. In our experiments, we easily handle 10-30 3D objects at frame rates above 10 fps using a single CPU core. We outperform the state-of-the-art both in terms of speed as well as in terms of accuracy, as validated on 3 different datasets. This holds both when using monocular color images (with LINE2D) and when using RGBD images (with LINEMOD). Moreover, we propose a challenging new dataset made of 12 objects, for benchmarking future competing methods on monocular color images.
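The two ideas the abstract highlights — scoring a template of quantized gradient orientations against an image, and using a cheap first cascade stage to reject most locations early — can be sketched as follows. This is not the authors' code; it is a minimal illustration under assumed names and parameters (`score_template`, `cascade_detect`, `stage1_frac`, the thresholds), with the orientation map simplified to a dictionary.

```python
# Hedged sketch (not the paper's implementation) of LINE2D-style template
# scoring plus a two-stage cascade. A template is a set of
# (dx, dy, orientation-bin) features; the score at a location is the fraction
# of features whose quantized gradient orientation in the image matches.
# All names, thresholds, and the dict-based orientation map are illustrative.

def score_template(orient_map, template, x, y, n_bins=8, tol=1):
    """Fraction of template features matched at position (x, y).

    orient_map: dict (x, y) -> quantized orientation bin in 0..n_bins-1
    template:   list of (dx, dy, bin) features
    tol:        allowed circular bin distance (small orientation tolerance)
    """
    hits = 0
    for dx, dy, b in template:
        ob = orient_map.get((x + dx, y + dy))
        if ob is None:
            continue  # no gradient at this pixel: feature not matched
        d = abs(ob - b) % n_bins
        if min(d, n_bins - d) <= tol:  # circular distance between bins
            hits += 1
    return hits / len(template)

def cascade_detect(orient_map, template, width, height,
                   stage1_frac=0.25, stage1_thresh=0.5, final_thresh=0.8):
    """Scan all positions; a cheap first stage scoring only a subset of
    features rejects most locations before the full template is evaluated."""
    stage1 = template[:max(1, int(len(template) * stage1_frac))]
    detections = []
    for x in range(width):
        for y in range(height):
            if score_template(orient_map, stage1, x, y) < stage1_thresh:
                continue  # early reject: this is where the speedup comes from
            s = score_template(orient_map, template, x, y)
            if s >= final_thresh:
                detections.append((x, y, s))
    return detections
```

Because per-object detection stays cheap, running many such cascades (one per object) remains tractable, which is the scalability argument made in the abstract.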
[1] H. Bay, A. Ess, T. Tuytelaars, and L. J. V. Gool. Speeded-up robust features (surf). Computer Vision and Image Understanding, 110(3):346–359, 2008. 2
[2] D. Damen, P. Bunnun, A. Calway, and W. Mayol-Cuevas. Real-time learning and detection of 3d texture-less objects: A scalable approach. In British Machine Vision Conference (BMVC). BMVA, September 2012. 1, 2, 3, 5, 6
[3] B. Drost, M. Ulrich, N. Navab, and S. Ilic. Model globally, match locally: Efficient and robust 3d object recognition. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 998–1005, 2010. 6
[4] D. Gavrila and V. Philomin. Real-time object detection for smart vehicles. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 87–93, 1999. 1
[5] S. Hinterstoisser, C. Cagniart, S. Ilic, P. Sturm, N. Navab, P. Fua, and V. Lepetit. Gradient response maps for real-time detection of texture-less objects. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2012. 1, 2, 3, 4, 5, 6
[6] S. Hinterstoisser, V. Lepetit, S. Ilic, P. Fua, and N. Navab. Dominant orientation templates for real-time detection of texture-less objects. In IEEE Computer Vision and Pattern Recognition (CVPR), pages 2257–2264, 2010. 2, 3, 4, 5
[7] S. Hinterstoisser, V. Lepetit, S. Ilic, S. Holzer, G. Bradski, K. Konolige, and N. Navab. Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. In Asian Conference on Computer Vision (ACCV), 2012. 1, 2, 3, 4, 5, 6, 7
[8] D. G. Lowe. Object recognition from local scale-invariant features. In IEEE International Conference on Computer Vision (ICCV), pages 1150–, 1999. 2
[9] T. Malisiewicz, A. Gupta, and A. A. Efros. Ensemble of exemplar-svms for object detection and beyond. In IEEE Int. Conf. on Computer Vision (ICCV), 2011. 3
[10] D. Mladenić, J. Brank, M. Grobelnik, and N. Milic-Frayling. Feature selection using linear classifier weights: interaction with classification models. In Conf. on Research and Development in Information Retrieval. ACM, 2004. 4
[11] D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006. 2
[12] R. Rios-Cabrera and T. Tuytelaars. Boosting binary masks and dominant orientation templates for efficient object detection. Under revision, CVIU. 3, 4
[13] C. Steger. Occlusion, clutter, and illumination invariant object recognition. In International Archives of Photogrammetry and Remote Sensing, volume XXXIV, part 3A, pages 345–350, 2002. 2, 3
[14] S. Yang, L. Bo, J. Wang, and L. G. Shapiro. Unsupervised template learning for fine-grained object recognition. In NIPS, pages 3131–3139. 3
[15] C. Zhang and P. A. Viola. Multiple-instance pruning for learning efficient cascade detectors. In NIPS, 2007. 5