iccv iccv2013 iccv2013-132 iccv2013-132-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Olaf Kähler, Ian Reid
Abstract: We address the problem of 3D scene labeling in a structured learning framework. Unlike previous work which uses structured Support Vector Machines, we employ the recently described Decision Tree Field and Regression Tree Field frameworks, which learn the unary and binary terms of a Conditional Random Field from training data. We show this has significant advantages in terms of inference speed, while maintaining similar accuracy. We also demonstrate empirically the importance for overall labeling accuracy of features that make use of prior knowledge about the coarse scene layout such as the location of the ground plane. We show how this coarse layout can be estimated by our framework automatically, and that this information can be used to bootstrap improved accuracy in the detailed labeling.
[1] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk. SLIC superpixels compared to state-of-the-art superpixel methods. PAMI, 34(11):2274–2282, 2012.
[2] A. Anand, H. S. Koppula, T. Joachims, and A. Saxena. Contextually guided semantic labeling and search for three-dimensional point clouds. International Journal of Robotics Research, 32(1):19–34, 2013.
[3] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[4] N. Dalal and B. Triggs. Histograms of oriented gradients for human detection. In CVPR, volume 1, pages 886–893, 2005.
[5] A. Flint, D. Murray, and I. Reid. Manhattan scene understanding using monocular, stereo, and 3D features. In ICCV, pages 2228–2235, 2011.
[6] J. Jancsary, S. Nowozin, and C. Rother. Regression tree fields: an efficient, non-parametric approach to image labeling problems. In CVPR, pages 2376–2383, 2012.
[7] A. Johnson and M. Hebert. Using spin images for efficient object recognition in cluttered 3D scenes. PAMI, 21(5):433–449, 1999.
[8] G. Klein and D. Murray. Parallel tracking and mapping on a camera phone. In ISMAR, pages 83–86, 2009.
[9] R. Newcombe, S. Lovegrove, and A. Davison. DTAM: Dense tracking and mapping in real-time. In ICCV, pages 2320–2327, 2011.
[10] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. Fitzgibbon. KinectFusion: Real-time dense surface mapping and tracking. In ISMAR, pages 127–136, 2011.
[11] S. Nowozin, C. Rother, S. Bagon, T. Sharp, B. Yao, and P. Kohli. Decision tree fields. In ICCV, pages 1668–1675, 2011.
[12] I. Posner, M. Cummins, and P. Newman. A generative framework for fast urban labeling using spatial and temporal context. Autonomous Robots, 26(2):153–170, 2009.
[13] X. Ren, L. Bo, and D. Fox. RGB-(D) scene labeling: Features and algorithms. In CVPR, pages 2759–2766, 2012.
[14] M. Schmidt, E. Van Den Berg, M. Friedlander, and K. Murphy. Optimizing costly functions with simple constraints: A limited-memory projected quasi-Newton algorithm. In Conference on Artificial Intelligence and Statistics, pages 456–463, 2009.
[15] N. Silberman and R. Fergus. Indoor scene segmentation using a structured light sensor. In ICCV Workshops, pages 601–608, 2011.
[16] H. Wang, S. Gould, and D. Koller. Discriminative learning with latent variables for cluttered indoor scene understanding. In ECCV, pages 435–449, 2010.