cvpr cvpr2013 cvpr2013-227 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jonathan T. Barron, Jitendra Malik
Abstract: In this paper we extend the “shape, illumination and reflectance from shading ” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract In this paper we extend the “shape, illumination and reflectance from shading” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. [sent-4, score-0.984]
2 We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. [sent-6, score-0.697]
3 We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. [sent-7, score-0.537]
4 Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. [sent-8, score-1.013]
5 Introduction One of the core problems of computer vision is inferring the properties of a scene (shape, surface normals, illumination, reflectance, etc) that together produced a single observed image. [sent-11, score-0.243]
6 Natural images, in contrast, contain many shapes which may occlude or support one another, as well as complicated, spatially-varying illumination in the form of shadows, attenuation, and interreflection. [sent-15, score-0.334]
7 In this paper, we address the problem of inferring a mixture of shapes and a mixture ofilluminations (and implicitly, a shading image and a reflectance image) which explain a natural scene. [sent-16, score-1.074]
8 But this is a classic “chicken-orthe-egg” problem, as we cannot reliably segment an image into its constituent shapes and illuminations without first in- ferring shape and illumination, and vice versa. [sent-18, score-0.397]
9 This is motivated by the observation that variation in shape and illumination tends to produce gradients and contours in the image, and so our mixtures of shapes and illuminations should be embedded in a space that respects such image variation. [sent-23, score-0.744]
10 Using shading cues to infer shape, as we are attempting, is understood to work poorly for recovering low-frequency (coarse) shape information [2, 7]. [sent-24, score-0.407]
11 Thankfully, depth data from sensors such as the Kinect [10] is becoming increasingly commonplace, and is complementary to shading: binocular disparity (the principle by which the Kinect computes depth) is accurate at coarse scales and inaccurate at fine scales. [sent-25, score-0.399]
12 We will therefore assume the input to our model is an RGB-D image, where “D” is the depth map produced by a sensor such as the Kinect. [sent-26, score-0.521]
13 This makes our problem easier, but in no way trivial: depth maps from sensors such as the Kinect are noisy and incomplete for many reasons. In Figure 1 we have the output of our model. [sent-27, score-0.479]
14 Depth maps are visualized with hue corresponding to depth and luminance corresponding to slant, and surface normals are visualized with hue corresponding to orientation, and saturation and luminance corresponding to slant. [sent-28, score-0.792]
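To make the visualization scheme in that caption concrete, here is a small numpy/matplotlib sketch. The specific scaling of depth to hue and of slant to saturation/luminance is an illustrative choice of ours, since the caption does not specify the exact mapping; `visualize_depth` and `visualize_normals` are hypothetical helper names.

```python
import numpy as np
from matplotlib.colors import hsv_to_rgb

def visualize_depth(Z, slant):
    """Depth -> hue, slant -> luminance (value). Z and slant are (H, W) arrays,
    with slant in radians (0 = fronto-parallel)."""
    hue = (Z - Z.min()) / (np.ptp(Z) + 1e-8)
    val = np.clip(1.0 - slant / (np.pi / 2), 0.0, 1.0)
    return hsv_to_rgb(np.stack([hue, np.ones_like(Z), val], axis=-1))

def visualize_normals(N):
    """In-plane orientation -> hue, slant -> saturation and luminance.
    N is an (H, W, 3) array of unit normals."""
    hue = (np.arctan2(N[..., 1], N[..., 0]) + np.pi) / (2 * np.pi)
    slant = np.arccos(np.clip(N[..., 2], -1.0, 1.0)) / (np.pi / 2)
    sat = np.clip(slant, 0.0, 1.0)
    val = np.clip(1.0 - 0.5 * slant, 0.0, 1.0)
    return hsv_to_rgb(np.stack([hue, sat, val], axis=-1))
```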
15 Illumination is visualized by rendering a coarse grid of spheres under the spatially-varying illumination, alongside the reflectance and shading images produced by the model. [sent-30, score-0.703]
16 Attempts to use raw depth maps from the Kinect for photometric applications therefore often fail badly. [sent-34, score-0.478]
17 See Figures 1, 5, 6, and 7 for demonstrations of how noisy these depth maps are compared to the depth maps that our model produces. [sent-35, score-0.853]
18 Forsyth [9] used a spatially-varying model of illumination to address complicated illumination and interreflection, but did not address reflectance or scene-like shape occlusion. [sent-41, score-0.833]
19 [16] have attempted to recover the reflectance and illumination of a scene, but assume known geometry and multiple images, or a user annotation of geometry and illumination, respectively. [sent-44, score-0.511]
20 [19] produces shading and reflectance images given RGB-D data, but requires a video and a fused depth map, and does not produce an illumination model or a refined shape. [sent-49, score-1.172]
21 Our paper is organized as follows: in Section 2 we review SIRFS, in Section 3 we introduce Scene-SIRFS, and in Section 4 we introduce the embedding used by our shape and illumination mixtures. [sent-50, score-0.401]
22 In Sections 5 and 6 we present our priors on shape and illumination (our shape prior incorporates the input depth map from the Kinect), and in Section 7 we show how we optimize the resulting inference problem. [sent-51, score-0.836]
23 SIRFS Our model builds upon the “shape, illumination, and reflectance from shading” (SIRFS) model [3, 4], which is a framework for recovering intrinsic scene properties from a single image of a segmented object. [sent-54, score-0.559]
24 SIRFS can be thought of as an extension of classic shape-from-shading models [14] in which reflectance and illumination are recovered in addition to shape. [sent-55, score-0.545]
25 g(R), f(Z), and h(L) are cost functions for reflectance, shape, and illumination respectively, and can be viewed (roughly) as negative log-likelihoods. [sent-58, score-0.234]
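For reference, the SIRFS problem of [3, 4] can be summarized as the following constrained optimization, stated here in condensed form: the observed log-image I is explained as a sum of log-reflectance R and the log-shading S(Z, L) rendered from shape Z under illumination L. The precise definitions of the three priors are given in those papers, not here.

```latex
\begin{aligned}
\underset{R,\,Z,\,L}{\text{minimize}} \quad & g(R) + f(Z) + h(L) \\
\text{subject to} \quad & I = R + S(Z, L)
\end{aligned}
```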
26 This limitation is due to several factors: 1) SIRFS considers shapes to be composed of a single smooth depth-map Z, and therefore cannot model depth discontinuities, occlusion, etc. [sent-62, score-0.459]
27 2) SIRFS has a single global model of illumination L, but natural scenes contain spatially-varying illumination due to attenuation, interreflection, cast and attached shadows, etc. [sent-63, score-0.5]
28 Here Z and L are sets of shapes and lights instead of the single shape and light in Equation 1, and we have introduced U and V, two sets of “images” that define distributions over shapes and lights, respectively. [sent-83, score-0.323]
29 Similarly, V is the “ownership” of each illumination in L, such that if the m-th ownership image in V equals 1 at pixel (i, j), then that pixel is entirely illuminated by Lm. [sent-86, score-0.263]
30 Our prior on shape is now a sum of priors over individual depth maps, where each Zn in Z is regularized independently (see Section 5). [sent-87, score-0.448]
31 In contrast, our prior on illumination is over the expected illumination of the entire scene, the per-pixel weighted combination of each illumination (see Section 6). [sent-88, score-0.731]
32 We use 8 shapes and illuminations in our mixtures for all experiments (|L| = |Z| = 8) though this is arbitrary. [sent-91, score-0.393]
33 For the purpose of optimization, we need to define the normal field N of this mixture of shapes. [sent-93, score-0.344]
34 We use the surface normals of the expected depth map, i.e. the per-pixel expectation of the depth maps in Z under the ownerships U. [sent-96, score-0.579]
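A minimal numpy sketch of this step, assuming the ownerships U are stored as one probability map per shape that sums to one at each pixel: the expected depth is the U-weighted combination of the component depth maps, and normals are taken from it by finite differences. The orthographic, unit-pixel-spacing normal computation is an illustrative simplification, not the paper's camera model.

```python
import numpy as np

def expected_depth(Z, U):
    """Z: (M, H, W) per-component depth maps; U: (M, H, W) soft ownerships
    summing to 1 over the M components at each pixel. Returns (H, W)."""
    return (U * Z).sum(axis=0)

def normals_from_depth(Zbar):
    """Unit surface normals of a depth map via finite differences
    (orthographic projection, unit pixel spacing: an illustrative choice)."""
    dz_dy, dz_dx = np.gradient(Zbar)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(Zbar)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

# Example with 8 components on a 480x640 grid.
M, H, W = 8, 480, 640
Z = np.random.rand(M, H, W)
U = np.random.rand(M, H, W)
U /= U.sum(axis=0, keepdims=True)   # make ownerships sum to 1 per pixel
N = normals_from_depth(expected_depth(Z, U))
```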
35 Let S(·) be our rendering engine for our mixtures, which computes the normal field of our mixture of shapes and renders it such that the spherical harmonic illumination at pixel (i, j) is a linear combination of all Lm, weighted by the corresponding ownerships in V. [sent-129, score-0.449]
36 Though the spatially varying illumination parametrized by {L, V} is capable of explaining away shadows, specularities, and interreflections, no attempt has been made to ensure that the illumination is globally consistent. [sent-133, score-0.28]
37 Though this may seem unsettling, the human visual system has similar properties: people tend not to notice inconsistent shadows or impossible illumination [24]. [sent-134, score-0.272]
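The following sketch illustrates the kind of spatially varying spherical-harmonic shading described above: the per-pixel illumination is a V-weighted combination of each component's 9x3 (27-dimensional) SH coefficients, which is then dotted with an SH basis evaluated at each normal. It is a simplified stand-in, not the paper's rendering engine S(·); the basis ordering follows the common second-order SH convention, and normalization constants are assumed to be folded into L.

```python
import numpy as np

def sh_basis(normals):
    """Second-order spherical-harmonic basis at unit normals.
    normals: (..., 3) -> (..., 9). Constants assumed folded into L."""
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack([np.ones_like(x), y, z, x,
                     x * y, y * z, 3.0 * z**2 - 1.0, x * z, x**2 - y**2],
                    axis=-1)

def render_shading(normals, L, V):
    """normals: (H, W, 3) unit normals of the mixture's expected depth.
    L: (M, 9, 3) per-component SH illumination (9 coefficients x RGB).
    V: (M, H, W) soft illumination ownerships summing to 1 per pixel.
    Returns an (H, W, 3) shading image."""
    L_bar = np.einsum('mhw,mkc->hwkc', V, L)      # per-pixel 9x3 illumination
    b = sh_basis(normals)                         # (H, W, 9)
    return np.einsum('hwk,hwkc->hwc', b, L_bar)   # (H, W, 3)
```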
38 Each shape’s and light’s “ownership” of the image is parametrized by a 17-dimensional vector, which is projected onto the eigenvector basis and passed through a softmax function to yield the probability of each pixel belonging to each mixture component. [sent-140, score-0.36]
39 Mixture Embedding Using a mixture of shapes and illuminations is necessary to model depth discontinuities and spatially varying illumination, both of which tend to produce variation in the image in the form of contours, intensity variation, texture gradients, etc. [sent-143, score-0.874]
40 It therefore follows that we should embed the shape and light mixtures in some space where the “ownership” of each mixture adheres to the segmentation of the scene. [sent-144, score-0.457]
41 We will instead embed each mixture component in a more “soft” embedding: the eigenvectors of the normalized Laplacian of a graph corresponding to the input RGB image [27]. [sent-147, score-0.28]
42 B is our embedding space, in that each mixture component is defined by a 17-dimensional vector, whose inner product with B defines how dominant that mixture component is at every pixel in the input image. [sent-151, score-0.506]
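A minimal sketch of this parametrization, assuming B is arranged as a (pixels x 17) matrix of per-pixel embedding features: each component's 17-dimensional weight vector is projected onto B, and a softmax across components yields ownership probabilities. The array shapes and function name are illustrative, not the paper's implementation.

```python
import numpy as np

def ownership_probabilities(B, W):
    """B: (P, D) embedding of each of the P pixels (D = 17 in the paper).
    W: (M, D) one weight vector per mixture component.
    Returns (M, P) ownership probabilities summing to 1 at each pixel."""
    logits = W @ B.T                                 # (M, P) per-pixel scores
    logits -= logits.max(axis=0, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=0, keepdims=True)

# Example: 8 components, 17-dimensional embedding, a 480x640 image.
P, D, M = 480 * 640, 17, 8
B = np.random.randn(P, D)
W = np.random.randn(M, D)
U = ownership_probabilities(B, W)   # reshape each row to (480, 640) as needed
```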
43 We do this because the depth images are often mis-aligned and noisy enough that it is challenging to construct a single accurate contour signal from both sources of information. [sent-155, score-0.44]
44 Using only the image to create an embedding circumvents the noise in the depth map and forces the reconstructed shape to be aligned with the image. [sent-156, score-0.596]
45 Our prior on reflectance g(·) is exactly the same as in [3]. [sent-157, score-0.306]
46 f(·) and h(·) are our priors on our shape and illumination mixtures, respectively. [sent-160, score-0.322]
47 We introduce fZˆ(Z, U), which encourages Z to be similar to the raw sensor depth map Zˆ if Z is thought to be “visible” according to U. [sent-165, score-0.526]
48 Crucially, we apply this prior to each individual depth map in our mixture rather than to some average depth map. [sent-166, score-0.956]
49 This encourages the scene’s constituent depth maps to be smooth while allowing the expected depth map implied by the mixture to vary abruptly, thereby allowing us to model depth discontinuities and occlusion. [sent-167, score-1.433]
50 We use version 2 of the NYU Depth Dataset [28], which consists of RGB images and aligned Kinect depth maps. [sent-168, score-0.331]
51 Because Kinect depth maps often have gaps, the dataset also provides inpainted depth maps. [sent-169, score-0.83]
52 We will use the raw depth maps rather than the inpainted ones, as our algorithm will implicitly denoise and inpaint depth during inference. [sent-170, score-0.904]
53 In addition to gaps, Kinect depth maps have different kinds of noise. [sent-171, score-0.404]
54 First, the depth and RGB images are often not well-aligned: not enough to matter for most recognition tasks, but enough to affect photometric or reconstruction tasks. [sent-172, score-0.331]
55 We must construct a loss function to encourage our recovered depth Z to resemble the raw sensor depth Zˆ. [sent-175, score-0.853]
56 The loss is proportional to Ui,j, which means that Zi,j need only resemble Zˆi,j if our model believes that this depth map is in the foreground at pixel (i, j). [sent-193, score-0.454]
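A sketch of a loss with the stated behavior: each component's depth map is pulled towards the sensor depth only where that component owns the pixel and where the sensor reported a valid value. The Charbonnier penalty and the validity-mask handling are placeholder choices; the paper's exact fZˆ is not reproduced here.

```python
import numpy as np

def depth_fidelity_loss(Z, Z_sensor, U, valid, eps=1e-2):
    """Z: (M, H, W) per-component depth maps; Z_sensor: (H, W) raw sensor depth;
    U: (M, H, W) soft ownerships; valid: (H, W) boolean mask of pixels where
    the sensor returned a measurement. Returns a scalar loss."""
    resid = Z - Z_sensor[None]                     # (M, H, W) residuals
    penalty = np.sqrt(resid**2 + eps**2) - eps     # robust per-pixel cost (illustrative)
    return float((U * penalty * valid[None]).sum())
```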
57 Illumination Priors Our prior on illumination is a simple extension of the illumination prior of [3] to a mixture model, in which we regularize the expectation of a set of illuminations instead of a single illumination. [sent-197, score-0.927]
58 Here L¯i,j is a 27-dimensional vector describing the effective illumination at pixel (i, j) in the image. [sent-204, score-0.263]
59 Our prior on illumination is the negative log-likelihood of a multivariate normal distribution, applied to each 27-dimensional “pixel” in L¯. [sent-205, score-0.308]
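A sketch of a Gaussian prior of this kind: the effective per-pixel illumination L¯ (9 SH coefficients x 3 color channels = 27 dimensions) is scored under a multivariate normal with parameters {μL, ΣL} and summed over pixels. Constant terms of the log-likelihood are dropped, and this is an illustrative form rather than the paper's exact h(·).

```python
import numpy as np

def illumination_prior(L_bar, mu_L, Sigma_L):
    """L_bar: (H, W, 27) effective illumination at each pixel.
    mu_L: (27,) mean; Sigma_L: (27, 27) covariance of the Gaussian prior.
    Returns the summed negative log-likelihood, up to an additive constant."""
    diff = L_bar.reshape(-1, 27) - mu_L            # (P, 27)
    sol = np.linalg.solve(Sigma_L, diff.T).T       # Sigma_L^{-1} (L_bar - mu_L)
    return 0.5 * float((diff * sol).sum())
```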
60 We initialize the depth maps in our shape mixture by fitting a mixture of Gaussians to the (x, y, z) coordinates of depth-map pixels, and then fitting a plane to each Gaussian. [sent-210, score-0.956]
61 3(a) shows the raw depth map, 3(b) shows the posterior probability of each pixel under each mixture component, and 3(c) shows the fitted planes composed into one depth map according to hard assignments under the mixture of Gaussians. [sent-211, score-1.229]
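A sketch of this initialization under stated assumptions: a Gaussian mixture is fit to the (x, y, z) point cloud of the sensor depth, a plane z = ax + by + c is fit to each component's points by least squares, and a composite depth map is assembled from hard assignments. The use of scikit-learn's GaussianMixture and this particular plane parametrization are illustrative choices, not the paper's code.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def init_depth_planes(Z_sensor, M=8):
    """Z_sensor: (H, W) raw depth. Returns per-pixel hard assignments (H, W)
    and a composite depth map built from one fitted plane per component."""
    H, W = Z_sensor.shape
    ys, xs = np.mgrid[0:H, 0:W]
    pts = np.stack([xs.ravel(), ys.ravel(), Z_sensor.ravel()], axis=1).astype(float)

    gmm = GaussianMixture(n_components=M).fit(pts)
    labels = gmm.predict(pts)

    Z_init = np.zeros(H * W)
    A = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    for m in range(M):
        mask = labels == m
        # Least-squares plane z = a*x + b*y + c for this component's points.
        coef, *_ = np.linalg.lstsq(A[mask], pts[mask, 2], rcond=None)
        Z_init[mask] = A[mask] @ coef
    return labels.reshape(H, W), Z_init.reshape(H, W)
```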
62 In optimization we internally represent each depth map Zn as a pyramid, and whiten each illumination Lm according to {μL, ΣL}. [sent-215, score-0.631]
63 Because the scenes in the NYU dataset are mostly composed of planar surfaces, we will initialize each depth map Zi in Z to a plane such that the scene is well-described by the set of planes. [sent-219, score-0.471]
64 We initialize each surface in our mixture to its corresponding plane in our mixture of Gaussians, by solving for z at every pixel. [sent-224, score-0.472]
65 Therefore, in our synthetic experiments we initialize the depth maps by doing K-means (with 50 random restarts) on just the z values in the scene, and then initializing each depth map to be a centroid, thereby constraining the initial depth-planes to be fronto-parallel. [sent-227, score-0.801]
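For the synthetic setting just described, a short sketch of the fronto-parallel initialization, using scikit-learn's KMeans with multiple restarts; the 8 components and 50 restarts follow the text, while everything else is an illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans

def init_frontoparallel(Z_sensor, M=8, restarts=50):
    """Cluster the z values of the sensor depth and initialize each of the
    M depth maps to a constant (fronto-parallel) plane at a cluster centroid."""
    z = Z_sensor.reshape(-1, 1)
    km = KMeans(n_clusters=M, n_init=restarts).fit(z)
    centroids = km.cluster_centers_.ravel()                     # (M,)
    H, W = Z_sensor.shape
    return np.stack([np.full((H, W), c) for c in centroids])    # (M, H, W)
```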
66 In 4(b) we have the depth map, surface normals, reflectance, shading, and spatially-varying illumination that our model produces, and the corresponding ground-truth scene properties on the bottom. [sent-229, score-0.709]
67 In 4(c) and 4(d) we show the shading and reflectance images produced by the best-performing intrinsic image algorithms. [sent-230, score-0.749]
68 However, it is extremely difficult to produce ground-truth shape, reflectance, shading, and illumination models for real-world natural scenes. (The accompanying results table reports, per algorithm, the Z-MAE, N-MAE, s-MSE, r-MSE, rs-MSE, L-MSE, and average metrics.)
69 Z-MAE measures shape errors, N-MAE measures surface-normal errors, s-MSE, r-MSE, and rs-MSE measure shading and reflectance errors, and L-MSE measures illumination errors. [sent-246, score-0.868]
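To make two of these metric names concrete, here is a heavily simplified sketch: Z-MAE as a plain mean absolute depth error and N-MAE as a mean angular error between unit normals. The paper's exact definitions (including any shift- or scale-invariance) are given in [4] and are not reproduced here; treat these as illustrative stand-ins only.

```python
import numpy as np

def z_mae(Z_est, Z_true):
    """Mean absolute depth error (the paper's Z-MAE may additionally be
    shift-invariant; this plain version is illustrative)."""
    return float(np.abs(Z_est - Z_true).mean())

def n_mae(N_est, N_true):
    """Mean angular error (radians) between estimated and true unit normals."""
    cos = np.clip((N_est * N_true).sum(axis=-1), -1.0, 1.0)
    return float(np.arccos(cos).mean())
```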
70 (1)-(3) are intrinsic image algorithms, which produce shading and reflectance images from an RGB image, where (3) is the current state-of-the-art. [sent-248, score-0.709]
71 (4) evaluates the error of the noisy Kinect-like depth maps we use as input. [sent-249, score-0.475]
72 (5) is the SIRFS model that we build upon, and is equivalent to our model without any mixture models or a Kinect depth map. [sent-250, score-0.53]
73 (G) is a shape-denoising algorithm in which we omit the RGB image and just optimize over shape with respect to our prior on shapes, and (H) is (G) with a single depth map instead of a mixture model. [sent-254, score-0.647]
74 Thankfully, using the MIT Intrinsic Images dataset [12] extended with the ground-truth depth maps produced by Berkeley [4] we can compose pseudo-synthetic scenes that emulate natural scenes. [sent-256, score-0.505]
75 We also generate noisy Kinect-like depth maps from ground-truth depth maps for use as input to our model. [sent-258, score-0.853]
76 The shading and reflectance images produced by our model beat or match the best intrinsic image algorithms. [sent-262, score-0.749]
77 The surface normals produced by our model have half of the error of the input, though for absolute depth error we do not improve. [sent-263, score-0.665]
78 This is consistent with the limits of shading, as shading directly informs the surface normal, but only implicitly informs absolute depth. [sent-264, score-0.407]
79 The degenerate case of our model, which only denoises the depth map and ignores the RGB image, performs surprisingly well in terms of error relative to the ground-truth shape and normal field. [sent-268, score-0.556]
80 Our shading and reflectance images generally look much better than those produced by the intrinsic image algorithms, and our recovered depth and surface normals look much better than the input Kinect image. [sent-275, score-1.296]
81 Our spatially varying illumination captures shadowing and interreflections, and looks reasonable. [sent-276, score-0.261]
82 In Figures 5 and 6 we use our output to re-render the input image under different camera viewpoints and under different illumination conditions. [sent-279, score-0.264]
83 Our renderings look significantly better than renderings produced with the inpainted Kinect depth map provided by the NYU dataset. [sent-280, score-0.655]
84 Changing the viewpoint with the raw Kinect depths creates jagged artifacts at the edges of shapes, while our depth (which is both denoised and better-aligned to the image) looks smooth and natural at object boundaries. [sent-281, score-0.573]
85 Relighting the raw Kinect depth produces terrible artifacts, as the surface normals of the raw depth are very inaccurate due to noise and quantization, while relighting our output looks reasonable, as the surface normals are cleaner and reflectance has been separated from shading. [sent-282, score-1.622]
86 In Figure 7 we see that the depth maps our model produces are less noisy than the NYU depth maps, and more detailed than the output of the shape-denoising ablation of our model, demonstrating the importance of the complete model. [sent-283, score-0.912]
87 We have done this by generalizing SIRFS into a mixture model of shapes and illuminations, and by embedding those mixtures into a soft segmentation of an image. [sent-286, score-0.497]
88 We additionally use the noisy depth maps in RGB-D data to improve low-frequency shape estimation. [sent-287, score-0.537]
89 Our model improves the initial depth map by removing noise, adding fine-scale shape detail, and aligning the depth to the RGB image, all of which presumably would be useful in any application involving RGB-D images. [sent-289, score-0.843]
90 Perhaps most importantly, our model takes an important step towards solving one of the grand challenges in vision: inferring all intrinsic scene properties from a single image. [sent-290, score-0.234]
91 High-frequency shape and albedo from shading using natural image statistics. [sent-305, score-0.357]
92 Shape, albedo, and illumination from a single image of an unknown object. [sent-317, score-0.234]
93 Such a warping could be produced using just the smoothed Kinect depth maps provided in the NYU dataset (middle), but these images have jagged artifacts at surface and normal discontinuities. [sent-410, score-0.665]
94 Illuminations can be replaced (here we use randomly generated illuminations) and the input image (left) can be shown under a different illumination (right). [sent-412, score-0.234]
95 The middle image is our attempt to produce similar re-lit images using only the inpainted depth maps in the NYU dataset, which look noticeably worse due to noise in the depth image and the fact that illumination and reflectance have not been decomposed. [sent-413, score-1.402]
96 (a) input (b) NYU depth (c) denoised depth (d) our depth Figure 7. [sent-440, score-1.033]
97 In 7(a) we have the RGB-D input to our model, demonstrating how noisy and incomplete the raw Kinect depth map can be. [sent-442, score-0.516]
98 7(b) shows the inpainted normals and depth included in the NYU dataset [28], where holes have been inpainted but there is still a great deal of noise, and many fine-scale shape details are missing. [sent-443, score-0.717]
99 7(c) is from an ablation of our model in which we just denoise/inpaint the raw depth map (“model H” in our ablation study), and 7(d) is from our complete model. [sent-444, score-0.611]
100 Inverse global illumination: recovering reflectance models of real scenes from photographs. [sent-503, score-0.359]
wordName wordTfidf (topN-words)
[('sirfs', 0.456), ('depth', 0.331), ('reflectance', 0.277), ('shading', 0.269), ('kinect', 0.24), ('illumination', 0.234), ('mixture', 0.199), ('illuminations', 0.175), ('nyu', 0.144), ('intrinsic', 0.134), ('normals', 0.108), ('zn', 0.102), ('shapes', 0.1), ('inpainted', 0.095), ('shape', 0.088), ('mixtures', 0.087), ('embedding', 0.079), ('surface', 0.074), ('raw', 0.074), ('maps', 0.073), ('ownership', 0.07), ('ablation', 0.07), ('produced', 0.069), ('map', 0.066), ('contour', 0.064), ('fz', 0.063), ('barron', 0.062), ('rgb', 0.058), ('slant', 0.057), ('sensor', 0.055), ('eigenvector', 0.054), ('light', 0.051), ('recovering', 0.05), ('relighting', 0.05), ('eigenvectors', 0.049), ('rendering', 0.048), ('retinex', 0.047), ('renderings', 0.047), ('parametrized', 0.046), ('noisy', 0.045), ('normal', 0.045), ('ablations', 0.043), ('thankfully', 0.043), ('scene', 0.042), ('denoised', 0.04), ('visualized', 0.04), ('discontinuities', 0.04), ('supplementary', 0.039), ('engine', 0.038), ('occluding', 0.038), ('un', 0.038), ('disparity', 0.038), ('laplacian', 0.038), ('mpb', 0.038), ('artifacts', 0.038), ('shadows', 0.038), ('hue', 0.037), ('isolation', 0.037), ('gaussians', 0.037), ('jagged', 0.035), ('interreflection', 0.035), ('attenuation', 0.035), ('karsch', 0.035), ('lights', 0.035), ('recovered', 0.034), ('constituent', 0.034), ('lightness', 0.033), ('lm', 0.033), ('metrics', 0.033), ('produces', 0.032), ('embed', 0.032), ('soft', 0.032), ('scenes', 0.032), ('noise', 0.032), ('informs', 0.032), ('intervening', 0.032), ('dozens', 0.032), ('softmax', 0.032), ('though', 0.031), ('contours', 0.031), ('binocular', 0.03), ('hurts', 0.03), ('inferring', 0.03), ('material', 0.03), ('output', 0.03), ('pb', 0.029), ('pixel', 0.029), ('produce', 0.029), ('prior', 0.029), ('properties', 0.028), ('conservative', 0.028), ('segmented', 0.028), ('smooth', 0.028), ('resemble', 0.028), ('presumably', 0.027), ('expectation', 0.027), ('looks', 0.027), ('error', 0.026), ('saxena', 0.026), ('luminance', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
Author: Jonathan T. Barron, Jitendra Malik
Abstract: In this paper we extend the “shape, illumination and reflectance from shading ” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.
2 0.3450506 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
Author: Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin
Abstract: We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading. In our framework, the partial depth information is used to overcome bas-relief ambiguity in normals estimation, as well as to assist in recovering relative albedos, which are needed to reliably estimate the lighting environment and to separate shading from albedo. This refinement of surface normals using a noisy depth map leads to high-quality 3D surfaces. The effectiveness of our algorithm is demonstrated through several challenging real-world examples.
3 0.29783767 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
Author: Ju Shen, Sen-Ching S. Cheung
Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms oth- er techniques in the literature.
4 0.27462256 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
Author: Chen Li, Shuochen Su, Yasuyuki Matsushita, Kun Zhou, Stephen Lin
Abstract: We present a method that enhances the performance of depth-from-defocus (DFD) through the use of shading information. DFD suffers from important limitations namely coarse shape reconstruction and poor accuracy on textureless surfaces that can be overcome with the help of shading. We integrate both forms of data within a Bayesian framework that capitalizes on their relative strengths. Shading data, however, is challenging to recover accurately from surfaces that contain texture. To address this issue, we propose an iterative technique that utilizes depth information to improve shading estimation, which in turn is used to elevate depth estimation in the presence of textures. With this approach, we demonstrate improvements over existing DFD techniques, as well as effective shape reconstruction of textureless surfaces. – –
5 0.26435864 305 cvpr-2013-Non-parametric Filtering for Geometric Detail Extraction and Material Representation
Author: Zicheng Liao, Jason Rock, Yang Wang, David Forsyth
Abstract: Geometric detail is a universal phenomenon in real world objects. It is an important component in object modeling, but not accounted for in current intrinsic image works. In this work, we explore using a non-parametric method to separate geometric detail from intrinsic image components. We further decompose an image as albedo ∗ (ccoomarpsoen-escnatsle. shading +e shading pdoestaeil a).n Oaugre decomposition offers quantitative improvement in albedo recovery and material classification.Our method also enables interesting image editing activities, including bump removal, geometric detail smoothing/enhancement and material transfer.
6 0.23917994 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
7 0.22546177 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
8 0.19153458 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
9 0.18328615 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction
10 0.16228303 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
11 0.15932348 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
12 0.15821646 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors
13 0.1578642 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
14 0.15220118 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
16 0.14471814 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
17 0.14455849 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
18 0.14282514 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
19 0.1397081 465 cvpr-2013-What Object Motion Reveals about Shape with Unknown BRDF and Lighting
20 0.13283595 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
topicId topicWeight
[(0, 0.209), (1, 0.287), (2, 0.032), (3, 0.104), (4, -0.027), (5, -0.109), (6, -0.133), (7, 0.18), (8, 0.055), (9, -0.047), (10, -0.093), (11, -0.221), (12, -0.087), (13, 0.136), (14, 0.099), (15, 0.04), (16, -0.121), (17, -0.059), (18, -0.094), (19, -0.094), (20, 0.016), (21, 0.024), (22, -0.02), (23, 0.001), (24, 0.095), (25, 0.058), (26, 0.025), (27, -0.029), (28, -0.006), (29, 0.019), (30, 0.123), (31, -0.06), (32, -0.016), (33, 0.083), (34, -0.027), (35, 0.049), (36, -0.048), (37, 0.017), (38, -0.052), (39, 0.011), (40, -0.135), (41, -0.014), (42, 0.012), (43, -0.084), (44, 0.04), (45, -0.014), (46, 0.017), (47, 0.013), (48, 0.009), (49, -0.014)]
simIndex simValue paperId paperTitle
same-paper 1 0.95026875 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
Author: Jonathan T. Barron, Jitendra Malik
Abstract: In this paper we extend the “shape, illumination and reflectance from shading ” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.
2 0.92847723 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
Author: Chen Li, Shuochen Su, Yasuyuki Matsushita, Kun Zhou, Stephen Lin
Abstract: We present a method that enhances the performance of depth-from-defocus (DFD) through the use of shading information. DFD suffers from important limitations namely coarse shape reconstruction and poor accuracy on textureless surfaces that can be overcome with the help of shading. We integrate both forms of data within a Bayesian framework that capitalizes on their relative strengths. Shading data, however, is challenging to recover accurately from surfaces that contain texture. To address this issue, we propose an iterative technique that utilizes depth information to improve shading estimation, which in turn is used to elevate depth estimation in the presence of textures. With this approach, we demonstrate improvements over existing DFD techniques, as well as effective shape reconstruction of textureless surfaces. – –
3 0.91476333 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
Author: Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin
Abstract: We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading. In our framework, the partial depth information is used to overcome bas-relief ambiguity in normals estimation, as well as to assist in recovering relative albedos, which are needed to reliably estimate the lighting environment and to separate shading from albedo. This refinement of surface normals using a noisy depth map leads to high-quality 3D surfaces. The effectiveness of our algorithm is demonstrated through several challenging real-world examples.
4 0.801718 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction
Author: Eno Töppe, Claudia Nieuwenhuis, Daniel Cremers
Abstract: We introduce the concept of relative volume constraints in order to account for insufficient information in the reconstruction of 3D objects from a single image. The key idea is to formulate a variational reconstruction approach with shape priors in form of relative depth profiles or volume ratios relating object parts. Such shape priors can easily be derived either from a user sketch or from the object’s shading profile in the image. They can handle textured or shadowed object regions by propagating information. We propose a convex relaxation of the constrained optimization problem which can be solved optimally in a few seconds on graphics hardware. In contrast to existing single view reconstruction algorithms, the proposed algorithm provides substantially more flexibility to recover shape details such as self-occlusions, dents and holes, which are not visible in the object silhouette.
5 0.80125517 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
Author: Kevin Karsch, Zicheng Liao, Jason Rock, Jonathan T. Barron, Derek Hoiem
Abstract: Early work in computer vision considered a host of geometric cues for both shape reconstruction [11] and recognition [14]. However, since then, the vision community has focused heavily on shading cues for reconstruction [1], and moved towards data-driven approaches for recognition [6]. In this paper, we reconsider these perhaps overlooked “boundary” cues (such as self occlusions and folds in a surface), as well as many other established constraints for shape reconstruction. In a variety of user studies and quantitative tasks, we evaluate how well these cues inform shape reconstruction (relative to each other) in terms of both shape quality and shape recognition. Our findings suggest many new directions for future research in shape reconstruction, such as automatic boundary cue detection and relaxing assumptions in shape from shading (e.g. orthographic projection, Lambertian surfaces).
6 0.74513119 305 cvpr-2013-Non-parametric Filtering for Geometric Detail Extraction and Material Representation
7 0.68574464 330 cvpr-2013-Photometric Ambient Occlusion
8 0.67318618 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
9 0.6449607 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
10 0.64038801 466 cvpr-2013-Whitened Expectation Propagation: Non-Lambertian Shape from Shading and Shadow
11 0.63475817 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
12 0.62879783 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
13 0.60729349 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
14 0.59250772 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
15 0.58615929 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
16 0.5859884 409 cvpr-2013-Spectral Modeling and Relighting of Reflective-Fluorescent Scenes
17 0.58524996 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
18 0.55211252 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
19 0.54904008 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
20 0.54787719 428 cvpr-2013-The Episolar Constraint: Monocular Shape from Shadow Correspondence
topicId topicWeight
[(10, 0.134), (16, 0.045), (26, 0.062), (33, 0.249), (66, 0.199), (67, 0.044), (69, 0.071), (87, 0.099)]
simIndex simValue paperId paperTitle
1 0.93792516 75 cvpr-2013-Calibrating Photometric Stereo by Holistic Reflectance Symmetry Analysis
Author: Zhe Wu, Ping Tan
Abstract: Under unknown directional lighting, the uncalibrated Lambertian photometric stereo algorithm recovers the shape of a smooth surface up to the generalized bas-relief (GBR) ambiguity. We resolve this ambiguity from the halfvector symmetry, which is observed in many isotropic materials. Under this symmetry, a 2D BRDF slice with low-rank structure can be obtained from an image, if the surface normals and light directions are correctly recovered. In general, this structure is destroyed by the GBR ambiguity. As a result, we can resolve the ambiguity by restoring this structure. We develop a simple algorithm of auto-calibration from separable homogeneous specular reflection of real images. Compared with previous methods, this method takes a holistic approach to exploiting reflectance symmetry and produces superior results.
2 0.91904771 127 cvpr-2013-Discovering the Structure of a Planar Mirror System from Multiple Observations of a Single Point
Author: Ilya Reshetouski, Alkhazur Manakov, Ayush Bandhari, Ramesh Raskar, Hans-Peter Seidel, Ivo Ihrke
Abstract: We investigate the problem of identifying the position of a viewer inside a room of planar mirrors with unknown geometry in conjunction with the room’s shape parameters. We consider the observations to consist of angularly resolved depth measurements of a single scene point that is being observed via many multi-bounce interactions with the specular room geometry. Applications of this problem statement include areas such as calibration, acoustic echo cancelation and time-of-flight imaging. We theoretically analyze the problem and derive sufficient conditions for a combination of convex room geometry, observer, and scene point to be reconstructable. The resulting constructive algorithm is exponential in nature and, therefore, not directly applicable to practical scenarios. To counter the situation, we propose theoretically devised geometric constraints that enable an efficient pruning of the solution space and develop a heuristic randomized search algorithm that uses these constraints to obtain an effective solution. We demonstrate the effectiveness of our algorithm on extensive simulations as well as in a challenging real-world calibration scenario.
3 0.88580894 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
Author: Zhenglong Zhou, Zhe Wu, Ping Tan
Abstract: We present a method to capture both 3D shape and spatially varying reflectance with a multi-view photometric stereo technique that works for general isotropic materials. Our data capture setup is simple, which consists of only a digital camera and a handheld light source. From a single viewpoint, we use a set of photometric stereo images to identify surface points with the same distance to the camera. We collect this information from multiple viewpoints and combine it with structure-from-motion to obtain a precise reconstruction of the complete 3D shape. The spatially varying isotropic bidirectional reflectance distributionfunction (BRDF) is captured by simultaneously inferring a set of basis BRDFs and their mixing weights at each surface point. According to our experiments, the captured shapes are accurate to 0.3 millimeters. The captured reflectance has relative root-mean-square error (RMSE) of 9%.
same-paper 4 0.8766799 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
Author: Jonathan T. Barron, Jitendra Malik
Abstract: In this paper we extend the “shape, illumination and reflectance from shading ” (SIRFS) model [3, 4], which recovers intrinsic scene properties from a single image. Though SIRFS performs well on images of segmented objects, it performs poorly on images of natural scenes, which contain occlusion and spatially-varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we have a mixture of shapes and a mixture of illuminations, and those mixture components are embedded in a “soft” segmentation of the input image. We additionally use the noisy depth maps provided by RGB-D sensors (in this case, the Kinect) to improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications, or for any application involving RGB-D images.
5 0.87664187 344 cvpr-2013-Radial Distortion Self-Calibration
Author: José Henrique Brito, Roland Angst, Kevin Köser, Marc Pollefeys
Abstract: In cameras with radial distortion, straight lines in space are in general mapped to curves in the image. Although epipolar geometry also gets distorted, there is a set of special epipolar lines that remain straight, namely those that go through the distortion center. By finding these straight epipolar lines in camera pairs we can obtain constraints on the distortion center(s) without any calibration object or plumbline assumptions in the scene. Although this holds for all radial distortion models we conceptually prove this idea using the division distortion model and the radial fundamental matrix which allow for a very simple closed form solution of the distortion center from two views (same distortion) or three views (different distortions). The non-iterative nature of our approach makes it immune to local minima and allows finding the distortion center also for cropped images or those where no good prior exists. Besides this, we give comprehensive relations between different undistortion models and discuss advantages and drawbacks.
6 0.8518284 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
7 0.84362346 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
8 0.83887649 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
9 0.83864063 465 cvpr-2013-What Object Motion Reveals about Shape with Unknown BRDF and Lighting
10 0.83628654 54 cvpr-2013-BRDF Slices: Accurate Adaptive Anisotropic Appearance Acquisition
11 0.83167851 330 cvpr-2013-Photometric Ambient Occlusion
12 0.82891029 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems
14 0.82870018 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
15 0.82753128 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
16 0.82573032 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
17 0.82530934 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
18 0.82440597 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
19 0.82322997 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
20 0.82266915 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments