
230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

Source: pdf

Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys

Abstract: Both image segmentation and dense 3D modeling from images represent an intrinsically ill-posed problem. Strong regularizers are therefore required to constrain the solutions from being ’too noisy’. Unfortunately, these priors generally yield overly smooth reconstructions and/or segmentations in certain regions whereas they fail in other areas to constrain the solution sufficiently. In this paper we argue that image segmentation and dense 3D reconstruction contribute valuable information to each other’s task. As a consequence, we propose a rigorous mathematical framework to formulate and solve a joint segmentation and dense reconstruction problem. Image segmentations provide geometric cues about which surface orientations are more likely to appear at a certain location in space whereas a dense 3D reconstruction yields a suitable regularization for the segmentation problem by lifting the labeling from 2D images to 3D space. We show how appearance-based cues and 3D surface orientation priors can be learned from training data and subsequently used for class-specific regularization. Experimental results on several real data sets highlight the advantages of our joint formulation.

reference text

[1] Y. Bao, M. Chandraker, Y. Lin, and S. Savarese. Dense object reconstruction with semantic priors. In Proc. CVPR, 2013.

[2] M. Bleyer, C. Rother, P. Kohli, D. Scharstein, and S. Sinha. Object stereo-joint stereo matching and object segmentation. In Proc. CVPR, pages 3081–3088, 2011.

[3] G. Brostow, J. Fauqueur, and R. Cipolla. Semantic object classes in video: A high-definition ground truth database. Pattern Recognition Letters, 30(2):88–97, 2009.

[4] A. Chambolle and T. Pock. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. J. Math. Imag. Vision, pages 1–26, 2010.

[5] A. Cohen, C. Zach, S. Sinha, and M. Pollefeys. Discovering and exploiting 3d symmetries in structure from motion. In Proc. CVPR, 2012.

[6] P. F. Felzenszwalb and O. Veksler. Tiered scene labeling with dynamic programming. In Proc. CVPR, pages 3097–3104, 2010.

[7] S. Gould, O. Russakovsky, I. Goodfellow, P. Baumstarck, A. Y. Ng, and D. Koller. The STAIR Vision Library. http://ai.stanford.edu/~sgould/svl, 2010.

[Figure: input images, example depth map, raw image labeling, our result, tv-flux fusion result. The class labels are depicted using the following color scheme: building → red, ground → dark gray, vegetation → green, clutter → light gray.]

[8] D. Hoiem, A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 75(1):151–172, 2007.

[9] M. Jancosek and T. Pajdla. Multi-view reconstruction preserving weakly-supported surfaces. In Proc. CVPR, 2011.

[10] K. Kolev, T. Pock, and D. Cremers. Anisotropic minimal surfaces integrating photoconsistency and normal information for multiview stereo. In Proc. ECCV, pages 538–551, 2010.

[11] L. Ladický, P. Sturgess, C. Russell, S. Sengupta, Y. Bastanlar, W. Clocksin, and P. Torr. Joint optimisation for object class segmentation and dense stereo reconstruction. In Proc. BMVC, pages 104.1–11, 2010.

[12] J. Lellmann and C. Schnörr. Continuous multiclass labeling approaches and algorithms. SIAM Journal on Imaging Sciences, 4(4):1049–1096, 2011.

[13] V. Lempitsky and Y. Boykov. Global optimization for shape fitting. In Proc. CVPR, 2007.

[14] S. Liu and D. Cooper. A complete statistical inverse ray tracing approach to multi-view stereo. In Proc. CVPR, pages 913–920, 2011.

[15] X. Liu, O. Veksler, and J. Samarabandu. Order-preserving moves for graph-cut-based optimization. IEEE Trans. Pattern Anal. Mach. Intell., 32:1182–1196, 2010.

[16] J. Melonakos, E. Pichon, S. Angenent, and A. Tannenbaum. Finsler active contours. IEEE Trans. Pattern Anal. Mach. Intell., 30(3):412–423, 2008.

[17] S. Osher and S. Esedoglu. Decomposition of images by the anisotropic Rudin-Osher-Fatemi model. Comm. Pure Appl. Math., 57:1609–1626, 2004.

[18] L. Quan, J. Wang, P. Tan, and L. Yuan. Image-based modeling by joint segmentation. IJCV, 75(1):135–150, 2007.

[19] A. Saxena, S. Chung, and A. Ng. 3-d depth reconstruction from a single still image. IJCV, 76(1):53–69, 2008.

[20] S. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski. A comparison and evaluation of multi-view stereo reconstruction algorithms. In Proc. CVPR, pages 519–526, 2006.

[21] C. Strecha, W. von Hansen, L. V. Gool, P. Fua, and U. Thoennessen. On benchmarking camera calibration and multi-view stereo for high resolution imagery. In Proc. CVPR, 2008.

[22] E. Strekalovskiy and D. Cremers. Generalized ordering constraints for multilabel optimization. In Proc. ICCV, 2011.

[23] C. Zach. Fast and high quality fusion of depth maps. In Proc. 3DPVT, 2008.

[24] C. Zach, C. Häne, and M. Pollefeys. What is optimized in convex relaxations for multi-label problems: Connecting discrete and continuously-inspired MAP inference. Technical report, MSR Cambridge, 2012.

[25] C. Zach, C. Häne, and M. Pollefeys. What is optimized in tight convex relaxations for multi-label problems? In Proc. CVPR, 2012.

[26] C. Zach, M. Klopschitz, and M. Pollefeys. Disambiguating visual relations using loop constraints. In Proc. CVPR, 2010.

[27] C. Zach, T. Pock, and H. Bischof. A globally optimal algorithm for robust TV-L1 range image integration. In Proc. ICCV, 2007.