231 cvpr-2013-Joint Detection, Tracking and Mapping by Semantic Bundle Adjustment


Source: pdf

Author: Nicola Fioraio, Luigi Di Stefano

Abstract: In this paper we propose a novel Semantic Bundle Adjustment framework whereby known rigid stationary objects are detected while tracking the camera and mapping the environment. The system builds on established tracking and mapping techniques to exploit incremental 3D reconstruction in order to validate hypotheses on the presence and pose of sought objects. Then, detected objects are explicitly taken into account for a global semantic optimization of both camera and object poses. Thus, unlike all systems proposed so far, our approach allows for solving jointly the detection and SLAM problems, so as to achieve object detection together with improved SLAM accuracy.


reference text

[1] A. Aldoma, F. Tombari, L. Di Stefano, and M. Vincze. A global hypothesis verification method for 3d object recognition. In Computer Vision (ECCV), IEEE European Conf. on, Florence, Italy, Oct 2012.

[2] K. S. Arun, T. S. Huang, and S. D. Blostein. Least-squares fitting of two 3-d point sets. Pattern Analysis and Machine Intelligence (PAMI), IEEE Trans. on, 9(5):698–700, Sep 1987.

[3] S. Y. Bao, M. Bagra, Y.-W. Chao, and S. Savarese. Semantic structure from motion with points, regions, and objects. In Computer Vision and Pattern Recognition (CVPR), IEEE Int’l Conf. on, 2012.

[4] S. Y. Bao and S. Savarese. Semantic structure from motion. In Computer Vision and Pattern Recognition (CVPR), IEEE Int’l Conf. on, 2011.

[5] P. J. Besl and H. D. McKay. A method for registration of 3-d shapes. Pattern Analysis and Machine Intelligence (PAMI), IEEE Trans. on, 14(2):239–256, 1992.

[6] R. O. Castle, G. Klein, and D. W. Murray. Combining monoslam with object recognition for scene augmentation using a wearable camera. Image and Vision Computing, 28(11):1548–1556, 2010.

[7] J. Civera, D. Gálvez-López, L. Riazuelo, J. D. Tardós, and J. M. M. Montiel. Towards semantic SLAM using a monocular camera. In Intelligent Robot Systems (IROS), IEEE/RSJ Int’l Conf. on, pages 1277–1284, 2011.

[8] J. Civera, O. G. Grasa, A. J. Davison, and J. M. M. Montiel. 1-point ransac for extended kalman filtering: Application to real-time structure from motion and visual odometry. Journal of Field Robotics, 27(5):609–631, Sep 2010.

[9] N. Cornelis, B. Leibe, K. Cornelis, and L. Van Gool. 3d urban scene modeling integrating recognition and reconstruction. International Journal of Computer Vision (IJCV), 78(2–3):121–141, July 2008.

[10] A. J. Davison. Real-time simultaneous localisation and mapping with a single camera. In Computer Vision (ICCV), IEEE Int’l Conf. on, page 1403, Washington, DC, USA, 2003.

[11] E. Eade and T. Drummond. Monocular SLAM as a graph of coalesced observations. In Computer Vision (ICCV), IEEE Int’l Conf. on, pages 1–8, Rio de Janeiro, Brasil, Oct 2007.

[12] S. Ekvall, P. Jensfelt, and D. Kragic. Integrating active mobile robot object recognition and slam in natural environments. In Intelligent Robots and Systems, IEEE/RSJ Int’l Conf. on, Oct 2006.

[13] F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cremers, and W. Burgard. An evaluation of the RGB-D SLAM system. In Robotics and Automation (ICRA), IEEE Int’l Conf. on, St. Paul, MA, USA, May 2012.

[14] P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox. Rgbd mapping: Using depth cameras for dense 3d modeling of indoor environments. In Experimental Robotics (ISER), Int’l Symp on, 2010.

[15] D. Hoiem, A. A. Efros, and M. Hebert. Putting objects in perspective. International Journal of Computer Vision (IJCV), 80(1):3–15, Oct 2008.

[16] A. Johnson. Spin-Images: A Representation for 3-D Surface Matching. PhD thesis, Robotics Institute, Carnegie Mellon University, Aug 1997.

[17] G. Klein and D. Murray. Parallel tracking and mapping for small ar workspaces. In Mixed and Augmented Reality (ISMAR), IEEE and ACM Int’l Symp. on, pages 225–234, Nov 2007.

[18] R. Kümmerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard. g2o: A general framework for graph optimization. In Robotics and Automation (ICRA), IEEE Int’l Conf. on, Shanghai, China, May 2011.

[19] L. Ladický, P. Sturgess, C. Russell, S. Sengupta, Y. Bastanlar, W. Clocksin, and P. H. Torr. Joint optimization for object class segmentation and dense stereo reconstruction. International Journal of Computer Vision (IJCV), 100:122–133, 2012.

[20] L.-J. Li, R. Socher, and L. Fei-Fei. Towards total scene understanding: Classification, annotation and segmentation in an automatic framework. In Computer Vision and Pattern Recognition (CVPR), IEEE Int’l Conf. on, 2009.

[21] D. G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV), 60(2):91–119, Jan, 5 2004.

[22] D. Meger, P.-E. Forssén, K. Lai, S. Helmer, S. McCann, T. Southey, M. Baumann, J. J. Little, and D. G. Lowe. Curious george: An attentive semantic robot. Robotics and Autonomous Systems, 56(6):503–511, June 2008.

[23] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon. Kinectfusion: Real-time dense surface mapping and tracking. In Mixed and Augmented Reality (ISMAR), IEEE and ACM Int’l Symp. on, pages 127–136, Washington, DC, USA, 2011.

[24] H. Strasdat, A. J. Davison, J. Montiel, and K. Konolige. Double window optimisation for constant time visual SLAM. In Computer Vision (ICCV), IEEE Int’l Conf. on, pages 2352–2359, Los Alamitos, CA, USA, 2011.

[25] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers. A benchmark for the evaluation of rgb-d slam systems. In Intelligent Robot Systems (IROS), IEEE/RSJ Int’l Conf. on, Oct 2012.

[26] F. Tombari, S. Salti, and L. Di Stefano. A combined texture-shape descriptor for enhanced 3D feature matching. In Image Processing (ICIP), IEEE Int’l Conf. on, pages 809–812, Sep 2011.

[27] B. Triggs, P. Mclauchlan, R. Hartley, and A. Fitzgibbon. Bundle adjustment – a modern synthesis. In Vision Algorithms: Theory and Practice, LNCS, pages 298–375. Springer Verlag, 2000.

[28] S. Vasudevan, S. Gächter, V. Nguyen, and R. Siegwart. Cognitive maps for mobile robots-an object based approach. Robotics and Autonomous Systems, 55(5):359–371, May 2007.

[29] Y. Zhong. Intrinsic shape signatures: A shape descriptor for 3d object recognition. In ICCV Workshop, pages 689–696, Oct 2009.