Intrinsic Scene Properties from a Single RGB-D Image (CVPR 2013)


Source: pdf

Author: Jonathan T. Barron, Jitendra Malik

Abstract: In this paper we extend the “shape, illumination and reflectance from shading” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.
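
The abstract describes mixture components embedded in a "soft" segmentation. The sketch below is one possible reading of that idea, not the authors' implementation: each pixel gets a softmax weight per component, and the final map is the weighted sum of the component maps. The weighted-sum form, the function names (`softmax`, `blend_components`, `render_log_image`), and the toy data are all assumptions made for illustration. The last function shows the standard intrinsic-image assumption that an image is reflectance times shading, written in the log domain.

```python
import math


def softmax(logits):
    """Turn K raw scores into K non-negative weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def blend_components(components, logits):
    """Per-pixel soft mixture (an assumed form, not taken from the source).

    components: K lists of per-pixel values, e.g. K flattened depth maps.
    logits:     K lists of per-pixel scores; a softmax across the K scores
                at each pixel plays the role of the soft segmentation.
    """
    n_pixels = len(components[0])
    blended = []
    for p in range(n_pixels):
        weights = softmax([layer[p] for layer in logits])
        blended.append(sum(w * comp[p] for w, comp in zip(weights, components)))
    return blended


def render_log_image(log_reflectance, log_shading):
    """Intrinsic-image model in the log domain: log I = log R + log S."""
    return [r + s for r, s in zip(log_reflectance, log_shading)]


# Hypothetical toy data: two constant depth components over three pixels.
depth_components = [[1.0, 1.0, 1.0], [3.0, 3.0, 3.0]]
segmentation_logits = [[10.0, 0.0, -10.0], [-10.0, 0.0, 10.0]]
depth = blend_components(depth_components, segmentation_logits)
```

With these toy logits, the first pixel is dominated by the first component, the last pixel by the second, and the middle pixel is an even blend of the two. The same blending function could be applied to any per-pixel quantity, such as illumination parameters, under the same assumption.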


reference text

[1] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik. Contour detection and hierarchical image segmentation. TPAMI, 2011.

[2] J. T. Barron and J. Malik. High-frequency shape and albedo from shading using natural image statistics. CVPR, 2011.

[3] J. T. Barron and J. Malik. Color constancy, intrinsic images, and shape estimation. ECCV, 2012.

[4] J. T. Barron and J. Malik. Shape, albedo, and illumination from a single image of an unknown object. CVPR, 2012.

[5] H. Barrow and J. Tenenbaum. Recovering intrinsic scene characteristics from images. Computer Vision Systems, 1978.

[6] M. Belkin and P. Niyogi. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Computation, 2002.

[7] A. Blake, A. Zisserman, and G. Knowles. Surface descriptions from stereo and shading. IVC, 1986.

[8] F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.

[9] D. A. Forsyth. Variable-source shading analysis. IJCV, 2011.

[10] B. Freedman, A. Shpunt, M. Machline, and Y. Arieli. Depth mapping using projected patterns. US Patent, 2009.

[11] P. Gehler, C. Rother, M. Kiefel, L. Zhang, and B. Schoelkopf. Recovering intrinsic images with a global sparsity prior on reflectance. NIPS, 2011.

[12] R. Grosse, M. K. Johnson, E. H. Adelson, and W. T. Freeman. Ground-truth dataset and baseline evaluations for intrinsic image algorithms. ICCV, 2009.

[13] D. Hoiem, A. A. Efros, and M. Hebert. Recovering surface layout from an image. IJCV, 2007.

[14] B. K. P. Horn. Shape from shading: A method for obtaining the shape of a smooth opaque object from one view. Technical report, MIT, 1970.

[15] B. K. P. Horn. Determining lightness from an image. Computer Graphics and Image Processing, 1974.

[16] K. Karsch, V. Hedau, D. Forsyth, and D. Hoiem. Rendering synthetic objects into legacy photographs. SIGGRAPH Asia, 2011.

[17] J. Koenderink. What does the occluding contour tell us about solid shape? Perception, 1984.

[18] E. H. Land and J. J. McCann. Lightness and retinex theory. JOSA, 1971.

Figure 5. After a model is recovered, the camera can be moved and the input image (left) can be shown from a different viewpoint (right). Such a warping could be produced using just the smoothed Kinect depth maps provided in the NYU dataset (middle), but these images have jagged artifacts at surface and normal discontinuities. Both renderings, of course, contain artifacts in occluded regions.

... illuminations can be replaced (here we use randomly generated illuminations) and the input image (left) can be shown under a different illumination (right). The middle image is our attempt to produce similar re-lit images using only the inpainted depth maps in the NYU dataset, which look noticeably worse due to noise in the depth image and the fact that illumination and reflectance have not been decomposed.

[19] K. J. Lee, Q. Zhao, X. Tong, M. Gong, S. Izadi, S. U. Lee, P. Tan, and S. Lin. Estimation of intrinsic image sequences from image+depth video. ECCV, 2012.

[20] T. K. Leung and J. Malik. Contour continuity in region based image segmentation. ECCV, 1998.

[21] A. Levin, Y. Weiss, F. Durand, and W. Freeman. Understanding and evaluating blind deconvolution algorithms. CVPR, 2009.

Figure 7. Panels: (a) input, (b) NYU depth, (c) denoised depth, (d) our depth. One output of our model is a denoised depth map. In 7(a) we have the RGB-D input to our model, demonstrating how noisy and incomplete the raw Kinect depth map can be. 7(b) shows the inpainted normals and depth included in the NYU dataset [28], where holes have been inpainted but there is still a great deal of noise, and many fine-scale shape details are missing. 7(c) is from an ablation of our model in which we just denoise/inpaint the raw depth map (“model H” in our ablation study), and 7(d) is from our complete model. The NYU depth map is noisy at high frequencies and does not model depth discontinuities (hence the dark “slant” lines outlining each object), and our “denoising” model tends to oversmooth the scene, but our complete model has little noise while recovering much of the detail of the scene and correctly separating objects into different layers.

[22] M. Maire, P. Arbelaez, C. C. Fowlkes, and J. Malik. Using contours to detect and localize junctions in natural images. CVPR, 2008.

[23] S. Maji, N. Vishnoi, and J. Malik. Biased normalized cuts. CVPR, 2011.

[24] Y. Ostrovsky, P. Cavanagh, and P. Sinha. Perceiving illumination inconsistencies in scenes. Perception, 2005.

[25] R. Ramamoorthi and P. Hanrahan. An Efficient Representation for Irradiance Environment Maps. CGIT, 2001.

[26] A. Saxena, M. Sun, and A. Ng. Make3d: learning 3d scene structure from a single still image. TPAMI, 2008.

[27] J. Shi and J. Malik. Normalized cuts and image segmentation. TPAMI, 2000.

[28] N. Silberman, D. Hoiem, P. Kohli, and R. Fergus. Indoor segmentation and support inference from rgbd images. ECCV, 2012.

[29] M. F. Tappen, W. T. Freeman, and E. H. Adelson. Recovering intrinsic images from a single image. TPAMI, 2005.

[30] Y. Yu, P. Debevec, J. Malik, and T. Hawkins. Inverse global illumination: recovering reflectance models of real scenes from photographs. SIGGRAPH, 1999.