cvpr cvpr2013 cvpr2013-115 cvpr2013-115-reference knowledge-graph by maker-knowledge-mining

115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D


Source: pdf

Author: unknown-author

Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of high-resolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of PatchMatch. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
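The abstract's central geometric step, relating a matched pair of 3D point patches by a 6 DoF rigid body motion, has a standard closed-form least-squares solution. The sketch below illustrates that step using the classical SVD-based (Kabsch) alignment. It is a minimal illustration of the kind of computation involved, not the authors' implementation; the function names and the residual-based matching cost are our own assumptions.

```python
import numpy as np

def rigid_motion_3d(P, Q):
    """Closed-form 6 DoF rigid motion (R, t) minimizing the squared
    distance between matched 3D point patches P and Q, each of shape
    (n, 3), via the SVD-based Kabsch solution."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)   # patch centroids
    H = (P - p_bar).T @ (Q - q_bar)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) sneaking into the rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                              # optimal rotation
    t = q_bar - R @ p_bar                           # optimal translation
    return R, t

def patch_distance(P, Q):
    """RMS residual after optimal rigid alignment -- one plausible
    matching cost for comparing 3D patches (an assumption on our part,
    not necessarily the cost used in the paper)."""
    R, t = rigid_motion_3d(P, Q)
    return np.sqrt(np.mean(np.sum((P @ R.T + t - Q) ** 2, axis=1)))
```

A PatchMatch-style search over such (R, t) hypotheses, propagated between neighboring patches and refined by random perturbation, is what the paper's 3D PatchMatch variant provides; those details are specific to the paper and not reproduced here.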


reference text

[1] C. Barnes, E. Shechtman, A. Finkelstein, and D. B. Goldman. PatchMatch: A randomized correspondence algorithm for structural image editing. SIGGRAPH, 2009.

[2] C. Barnes, E. Shechtman, D. B. Goldman, and A. Finkelstein. The generalized PatchMatch correspondence algorithm. In ECCV, 2010.

[3] F. Besse, C. Rother, A. Fitzgibbon, and J. Kautz. PMBP: PatchMatch belief propagation for correspondence field estimation. In BMVC, 2012.

[4] M. Bleyer, C. Rhemann, and C. Rother. PatchMatch Stereo - Stereo matching with slanted support windows. In BMVC, 2011.

[5] J. Diebel and S. Thrun. An application of Markov random fields to range sensing. In NIPS, 2005.

[6] D. Douglas and T. Peucker. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica, 10(2):112–122, 1973.

[7] W. T. Freeman and C. Liu. Markov random fields for superresolution and texture synthesis. In Advances in Markov Random Fields for Vision and Image Processing. MIT Press, 2011.

[8] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009.

[9] R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, second edition, 2004.

[10] S. Izadi, D. Kim, O. Hilliges, D. Molyneaux, R. Newcombe, P. Kohli, J. Shotton, S. Hodges, D. Freeman, A. Davison, and A. Fitzgibbon. KinectFusion: Real-time 3D reconstruction and interaction using a moving depth camera. In UIST, 2011.

[11] J. Kopf, M. F. Cohen, D. Lischinski, and M. Uyttendaele. Joint bilateral upsampling. SIGGRAPH, 2007.

[12] O. Mac Aodha, N. D. Campbell, A. Nair, and G. J. Brostow. Patch based synthesis for single depth image super-resolution. In ECCV, 2012.

[13] J. Park, H. Kim, Y.-W. Tai, M. Brown, and I. Kweon. High quality depth map upsampling for 3D-TOF cameras. In ICCV, 2011.

[14] S. Rusinkiewicz and M. Levoy. Efficient variants of the ICP algorithm. In 3DIM, 2001.

[15] M. Sambridge, J. Braun, and H. McQueen. Geophysical parametrization and interpolation of irregular data using natural neighbours. Geophysical Journal International, 122(3):837–857, 1995.

[16] D. Scharstein and R. Szeliski. High-accuracy stereo depth maps using structured light. In CVPR, 2003.

[17] S. Schuon, C. Theobalt, J. Davis, and S. Thrun. LidarBoost: Depth superresolution for ToF 3D shape scanning. In CVPR, 2009.

[18] J. Tian and K.-K. Ma. A survey on super-resolution imaging. Signal, Image and Video Processing, 5(3):329–342, 2011.

[19] J. van Ouwerkerk. Image super-resolution survey. Image and Vision Computing, 24(10):1039–1052, 2006.

[20] J. Yang, J. Wright, T. Huang, and Y. Ma. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 19(11):2861–2873, 2010.

[21] Q. Yang, R. Yang, J. Davis, and D. Nistér. Spatial-depth super resolution for range images. In CVPR, 2007.

Method                  |             2x                   |             4x                   |       Laser scans
                        | Cones   Teddy   Tsukuba  Venus   | Cones   Teddy   Tsukuba  Venus   | Scan 21  Scan 30  Scan 42
Nearest Neighbor        | 1.094   0.815   0.612    0.268   | 1.531   1.129   0.833    0.368   | 0.018    0.016    0.040
Diebel and Thrun [5]    | 0.740   0.527   0.401    0.170   | 1.141   0.801   0.549    0.243   | N/A      N/A      N/A
Yang et al. [21]        | 0.756   0.510   0.393    0.167   | 0.993   0.690   0.514    0.216   | N/A      N/A      N/A
Yang et al. [20]        | 2.027   1.420   0.705    0.992   | 2.214   1.572   0.840    1.012   | 0.030    0.035    0.054
Freeman and Liu [7]     | 1.447   0.969   0.617    0.332   | 1.536   1.110   0.869    0.367   | 0.019    0.017*   0.075
Glasner et al. [8]      | 0.867*  0.596*  0.482*   0.209*  | 1.483   1.065   0.832    0.394   | 1.851    1.865    1.764
Mac Aodha et al. [12]   | 1.127   0.825   0.601    0.276   | 1.504   1.026*  0.833    0.337*  | 0.017*   0.017*   0.045
Our Method              | 0.994   0.791   0.580    0.257   | 1.399*  1.196   0.727*   0.450   | 0.021    0.018    0.030*

Table 1. Root mean squared error (RMSE) scores. Yang et al. [20] and Freeman and Liu [7] are image SR methods, and Mac Aodha et al. [12] is a depth SR method; all three require an external database. Diebel and Thrun [5] and Yang et al. [21] are depth SR methods that use an image at the target resolution. Glasner et al. [8] is an image SR technique that uses patches from within the input image. For most data sets, our method is competitive with the top performer. Laser scan tests on the image-guided techniques were not possible for want of images at the target resolution. The best score among the example-based methods, which we consider our main competitors, is marked with an asterisk.

Method                  |             2x                   |             4x
                        | Cones   Teddy   Tsukuba  Venus   | Cones   Teddy   Tsukuba  Venus
Nearest Neighbor        | 1.713   1.548   1.240    0.328   | 3.121   3.358   2.197    0.609
Diebel and Thrun [5]    | 3.800   2.786   2.745    0.574   | 7.452   6.865   5.118    1.236
Yang et al. [21]        | 2.346   1.918   1.161    0.250   | 4.582   4.079   2.565    0.421
Yang et al. [20]        | 61.617  54.194  5.566    46.985  | 63.742  55.080  7.649    47.053
Freeman and Liu [7]     | 6.266   4.660   3.240    0.790   | 15.077  12.122  10.030   3.348
Glasner et al. [8]      | 4.697   3.137   3.234    0.940   | 8.790   6.806   6.454    1.770
Mac Aodha et al. [12]   | 2.935   2.311   2.235    0.536   | 6.541   5.309   4.780    0.856
Our Method              | 2.018   1.862   1.644    0.377   | 3.271   4.234   2.932    3.245

Table 2. Percent error scores. Our method is the top performer among the example-based methods and on a few occasions outperforms Diebel and Thrun [5] and Yang et al. [21]. Results provided for Yang et al. [20] suffer from incorrect absolute intensities.

Figure 6. 2x nearest neighbor upscaling (b) and SR (c-e) on a stereo data set of two similar egg cartons obtained using the method of Bleyer et al. [4]. Note that (e) was preprocessed using a bilateral filter (window size 5, spatial deviation 0.5, range deviation 0.001).

Figure 7. Panels: (a) nearest neighbor, (b) our result, (c) Mac Aodha et al. [12] (preprocessed), (d) Glasner et al. [8] (preprocessed, 32 bit), (e) Yang et al. [20] (preprocessed, 32 bit), (f) Freeman and Liu [7] (32 bit). Above, we provide zooms on a region of interest of the noisy PMD CamCube 2.0 ToF data set shown in Figure 2 for 4x nearest neighbor upscaling in (a) and 4x SR otherwise. A depth map zoom for Mac Aodha et al. was available only with bilateral preprocessing (window size 5, spatial deviation 3, range deviation 0.1). Below, we show shaded meshes for the preprocessed result of Mac Aodha et al. [12] and for our method with and without the same preprocessing ((h) is not aligned with the other meshes because we obtained the rendering from the authors). Note that although our result in (i) performs worse than (h) on the vase, it preserves fine detail better and does not introduce square patch artefacts.

Figure 8. Panels in the mesh row: (e) nearest neighbor (32 bit), (f) our result (32 bit), (g) our result (8 bit), (h) Glasner et al. [8] (8 bit), (i) Mac Aodha et al. [12] (8 bit). Above, zooms on a region of interest of the noiseless, though quantized, Middlebury Cones data set. 2x SR was carried out (in our case, using the parameters from the quantitative evaluation) on the 2x nearest neighbor downscaling of the original, depicted in (a). Our method produces the sharpest object boundaries. Below, the corresponding shaded meshes. We show our 8 bit quantized mesh in (g) for comparison. Our method performs the best smoothing even after quantization (particularly over the cones), although it lightly smooths away the nose for the parameters used, which were kept the same for all Middlebury tests. We provide additional results on our website.
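The captions of Figures 6 and 7 quote concrete bilateral prefiltering parameters (window size, spatial deviation, range deviation). For readers unfamiliar with that parameterization, the following is a minimal, generic single-channel bilateral filter for depth maps; it is a textbook formulation, not the authors' preprocessing code, and the function name and defaults are our own choices.

```python
import numpy as np

def bilateral_filter_depth(depth, window=5, sigma_s=0.5, sigma_r=0.001):
    """Generic bilateral filter for a single-channel depth map.

    Each output pixel is a weighted mean over a window x window
    neighborhood; weights decay both with pixel distance (sigma_s, the
    'spatial deviation') and with depth difference (sigma_r, the
    'range deviation'), smoothing noise while keeping depth
    discontinuities sharp.
    """
    depth = np.asarray(depth, dtype=np.float64)
    r = window // 2
    # Spatial Gaussian over the window, computed once.
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    spatial = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma_s ** 2))
    padded = np.pad(depth, r, mode='edge')
    out = np.empty_like(depth)
    h, w = depth.shape
    for y in range(h):
        for x in range(w):
            patch = padded[y:y + window, x:x + window]
            rng = np.exp(-(patch - depth[y, x]) ** 2 / (2.0 * sigma_r ** 2))
            weights = spatial * rng
            out[y, x] = (weights * patch).sum() / weights.sum()
    return out

# Example with the Figure 6 parameters: window size 5,
# spatial deviation 0.5, range deviation 0.001:
#   smoothed = bilateral_filter_depth(raw_depth, 5, 0.5, 0.001)
```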