cvpr cvpr2013 cvpr2013-115 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: unknown-author
Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of highresolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image ’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of Patch- Match. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
Reference: text
sentIndex sentText sentNum sentScore
1 In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of highresolution depth exemplars. [sent-2, score-0.846]
2 Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. [sent-3, score-1.1]
3 While the notion of ‘single-image ’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. [sent-4, score-0.544]
4 Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. [sent-5, score-0.554]
5 A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. [sent-7, score-0.467]
6 We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars. [sent-8, score-0.477]
7 Introduction With the advent of inexpensive 3D cameras like the Microsoft Kinect, depth measurements are becoming increasingly available for low-cost applications. [sent-10, score-0.335]
8 Zooms of a noiseless synthetic input depth map (top) and of the output of our algorithm (bottom), both by a factor of 3. [sent-15, score-0.468]
9 In our approach, fine details such as the penguin’s eyes, beak, and the subtle polygons across its body are mapped from corresponding patches at lesser depth, and boundaries appear more natural. [sent-16, score-0.216]
10 In depth maps recovered using stereo techniques, depth resolution decreases as a function of increasing depth from the camera. [sent-24, score-1.129]
11 Such ancillary data, however, is often unavailable or difficult to obtain. [sent-26, score-0.123]
12 In this work, we consider the question of how far one can push depth SR using no ancillary data, proceeding instead by identifying and merging patch correspondences from within the input depth map itself. [sent-27, score-1.109]
13 Our observation is that—even in the absence of object repetition of the sort exemplified in Figure 1—real-world scenes tend to exhibit patchwise ‘self-similarity’ such as repetition of geometric primitives (e. [sent-28, score-0.117]
14 Man-made scenes or objects... The 2D patch pairs depicted (left) are dissimilar with respect to depth values. [sent-31, score-0.335]
15 In the corresponding point cloud (right), the analogous 3D patch pairs are similar as point sets related by appropriate rigid body motions in 3D. [sent-32, score-0.324]
16 It is primarily this observation that we exploit in this paper, coupled with the fact that under perspective projection, an object patch at lesser depth with respect to the camera is acquired with a higher spatial resolution than a corresponding patch situated at greater depth. [sent-35, score-0.976]
17 The key contribution of our work is to proceed not by reasoning in terms of patches of 2D pixels, but rather in terms of patches of 3D points. [sent-36, score-0.195]
18 In addition, we introduce a new 3D variant of PatchMatch to obtain a dense correspondence field in reasonable time and a simple, yet effective patch upscaling and merging technique to generate the output SR depth map. [sent-38, score-0.829]
19 Their strategy is to search for corresponding 5 × 5 pixel patches across a discrete cascade of downscaled copies of the input image and to exploit sub-pixel shifts between correspondences. [sent-42, score-0.12]
20 However, this strategy faces serious problems for depth maps, as Figure 2 illustrates; our patches are instead sets of 3D points within a radius r of a center point, which we match with respect to 3D point similarity over 6 DoF rigid body motions in 3D. [sent-44, score-0.135]
21 [20], and Freeman and Liu [7] are image SR techniques against which we compare our algorithm in Section 3, by treating input depth maps as intensity images. [sent-52, score-0.407]
22 Previous work on depth SR can broadly be categorized into methods that (i) use a guiding color or intensity image at the target resolution, (ii) merge information contained in multiple aligned depth maps, or (iii) call on an external database of high-resolution depth exemplars. [sent-56, score-1.172]
23 We devote the remainder of this section to a discussion of representative or seminal techniques from the depth SR literature. [sent-57, score-0.335]
24 The most common depth SR strategy involves using an ancillary color or intensity image at the target resolution to guide the reconstruction of the SR depth map. [sent-59, score-0.972]
25 The underlying assumption is that changes in depth are colocated with edges in the guiding image. [sent-60, score-0.374]
26 [21] apply joint bilateral upscaling on a cost volume constructed from the low resolution input depth map, followed by Kopf et al. [sent-62, score-0.671]
27 [17] combines several depth maps acquired from slightly different viewpoints. [sent-69, score-0.37]
28 [10] produces outstanding results by fusing a sequence of depth maps generated by a tracked Kinect camera into a single 3D representation in real-time. [sent-71, score-0.37]
29 They propose to assemble the SR depth map from a collection of depth [sent-75, score-0.723]
30 The point Px ∈ R3 is the pre-image of the pixel x of the input depth map. [sent-77, score-0.375]
31 The center point Px′ = g(Px) of the closer patch is by design not required to be one of the 3D points of the input depth map, hence Px′ [sent-79, score-0.112]
32 Algorithm. Owing to the perspective projection that underlies image formation, object patches situated at a lesser depth with respect to the camera are imaged with a higher spatial resolution (i. [sent-85, score-0.644]
33 , a greater point density) than corresponding object patches at greater depth. [sent-87, score-0.148]
34 Our depth SR algorithm consists of two steps: (i) find, for each patch in the input depth map, a corresponding patch at lesser or equal depth with respect to the camera, and (ii) use the dense correspondence field to generate the SR output. [sent-88, score-1.545]
35 3D Point Patches. Let g = (R, t) ∈ SE(3) denote a 6 DoF rigid body motion in 3D, where R ∈ SO(3) and t ∈ R3. [sent-96, score-0.182]
36 2 is to find an optimal rigid body motion g for each pixel x, mapping the patch corresponding to x to a valid matching patch at lesser or equal depth with respect to the camera. [sent-100, score-1.007]
37 We shall understand the patch corresponding to x—the further1 patch, for brevity—to be the set Sx ⊂ R3 of 3D points within a radius r of the pre-image Px = Zx · K−1 x̃ [sent-101, score-0.189]
38 ∈ R3 of x, where Zx is the depth encoded at x in the input depth map and K is the 3 × 3 camera calibration matrix (cf. [sent-103, score-0.335]
39 The 3D points of the corresponding closer patch Sx′ [sent-106, score-0.292]
40 1We acknowledge that this is something of an abuse of terminology, since two points can be situated at equal depth with respect to the camera but be at different distances from it. [sent-107, score-0.44]
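To make the patch definition above concrete, here is a minimal sketch (not the authors' code) of back-projecting a depth map to its pre-images Px = Zx · K−1 x̃ and gathering the 3D points within a radius r of a given center; the helper names and the use of SciPy's cKDTree are our own assumptions.

```python
# Minimal sketch (not the authors' code) of the 3D patch construction above.
# Assumptions: 'depth' is an H x W array of Z values, 'K' the 3 x 3 calibration
# matrix; helper names are ours.
import numpy as np
from scipy.spatial import cKDTree

def backproject(depth, K):
    """Pre-images P_x = Z_x * K^-1 * x_tilde for every pixel, as an (H*W, 3) cloud."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # homogeneous pixels
    rays = np.linalg.inv(K) @ pix                                      # K^-1 * x_tilde
    return (rays * depth.reshape(1, -1)).T                             # scale each ray by Z_x

def patch_points(cloud, tree, center, r):
    """The 'further' patch S_x: all 3D points within radius r of the pre-image P_x."""
    return cloud[tree.query_ball_point(center, r)]

# Usage sketch:
# cloud = backproject(depth, K); tree = cKDTree(cloud)
# Sx = patch_points(cloud, tree, cloud[i], r)
```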
41 The function cb(x; g) evaluates normalized SSD over the points of the further patch Sx subject to each point’s respective nearest neighbor among the ‘backward’-transformed points g−1(Sx′). [sent-117, score-0.271]
42 Analogously, the function cf (x; g) evaluates normalized SSD over the points of the closer patch Sx′ [sent-126, score-0.259]
43 For g to be deemed valid at x, we require that the depth of the sphere center point of the matched patch be less than or equal to that of the pre-image of x. [sent-140, score-0.601]
44 Given a pixel x and a rigid body motion g, we compute the matching cost c(x; g) according to the convex combination c(x; g) = α · cb(x; g) + (1 − α) · cf(x; g), where α ∈ [0, 1]. [sent-143, score-0.175]
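Read literally, the cost could be computed as in the following hedged sketch; mean squared nearest-neighbor distance stands in for the paper's normalized SSD (whose exact normalization is not stated in this excerpt), and the validity test is left to the caller.

```python
# Hedged sketch of the matching cost reconstructed from the text above; the
# paper's exact SSD normalization may differ, and the validity test (matched
# center at equal or lesser depth) is assumed to be checked by the caller.
import numpy as np
from scipy.spatial import cKDTree

def apply_rigid(g, pts):
    R, t = g
    return pts @ R.T + t

def nn_ssd(src, dst):
    """Mean squared distance of src points to their nearest neighbors in dst."""
    d, _ = cKDTree(dst).query(src)
    return float(np.mean(d ** 2))

def matching_cost(Sx, Sx_prime, g, alpha=0.5):
    R, t = g
    g_inv = (R.T, -R.T @ t)                         # inverse rigid motion
    c_b = nn_ssd(Sx, apply_rigid(g_inv, Sx_prime))  # 'backward' term c_b
    c_f = nn_ssd(Sx_prime, apply_rigid(g, Sx))      # 'forward' term c_f
    return alpha * c_b + (1.0 - alpha) * c_f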
45 [1]) in the aim of assigning to each pixel x of the input depth map a 6 DoF rigid body motion in 3D, mapping Sx to a valid matching patch Sx′ [sent-151, score-0.04]
46 at equal or lesser depth with respect to the camera. [sent-152, score-0.454]
47 PatchMatch was first introduced as a method for obtaining dense approximate nearest neighbor fields between pairs of n × n pixel patches in 2D, assigning to each pixel x of an image A [sent-153, score-0.242]
48 Figure 4. A filled variant of the disparity map of the Middlebury Cones data set (left) as input and a visualization of projected 3D displacements of the output of our dense correspondence search using conventional optical flow coloring, both overlaid sparsely with arrows for greater clarity (right). [sent-154, score-0.237]
49 a displacement vector mapping the patch centered at x to a matching patch in an image B with the objective of reconstructing one image in terms of patches from the other. [sent-156, score-0.458]
50 of the input depth map such that the depth of Px′ [sent-167, score-0.335]
51 We then compute the rotation minimizing arc length between the patch normal vector at Px and that at Px′. [sent-169, score-0.189]
52 In the propagation step (cf. [1]), we traverse the pixels x of our input depth map in scanline order—upper left to lower right for even iterations, lower right to upper left for odd—and adopt the rigid body motion assigned to a neighboring pixel if doing so yields an equal or lower cost. [sent-177, score-0.605]
53 Immediately following propagation at a given pixel x, we independently carry out k iterations of additional initialization and of perturbation of the translational and rotational components of gx, adopting the initialization or perturbation if doing so yields an equal or lower cost. [sent-180, score-0.272]
54 Rotational perturbation, which we carry out in a range that decreases with every iteration k, consists of random rotation around the normal at Px (1 DoF) and of random perturbation of the remaining two degrees of freedom of the rotation. [sent-184, score-0.126]
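The propagation and perturbation scheme just described could be organized as in the following structural sketch; cost, random_motion, and perturb are assumed stand-ins for the paper's c(x; g), its random initialization, and its decaying translational/rotational perturbation.

```python
# Structural sketch of the 3D PatchMatch loop described above; 'cost',
# 'random_motion', and 'perturb' are assumed helpers, not the authors' API.
import numpy as np

def patchmatch_3d(H, W, cost, random_motion, perturb, iters=4, k=3):
    g = np.empty((H, W), dtype=object)          # per-pixel rigid motion g_x
    c = np.full((H, W), np.inf)                 # per-pixel matching cost
    for y in range(H):                          # random initialization
        for x in range(W):
            g[y, x] = random_motion(y, x)
            c[y, x] = cost(y, x, g[y, x])
    for it in range(iters):
        step = 1 if it % 2 == 0 else -1         # scanline order flips per iteration
        ys = range(H) if step == 1 else range(H - 1, -1, -1)
        xs = range(W) if step == 1 else range(W - 1, -1, -1)
        for y in ys:
            for x in xs:
                # propagation: adopt a neighbor's motion if it is no worse
                for ny, nx in ((y - step, x), (y, x - step)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cn = cost(y, x, g[ny, nx])
                        if cn <= c[y, x]:
                            g[y, x], c[y, x] = g[ny, nx], cn
                # k rounds of extra initialization and decaying perturbation
                for j in range(k):
                    for cand in (random_motion(y, x), perturb(g[y, x], 0.5 ** j)):
                        cc = cost(y, x, cand)
                        if cc <= c[y, x]:
                            g[y, x], c[y, x] = cand, cc
    return g
```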
55 Patch Upscaling and Merging. Having assigned a motion gx ∈ SE(3) to each pixel x of the input depth map, we generate an SR depth map by merging interpolated depth values of the ‘backward’-transformed points gx−1(Sx′) [sent-188, score-0.064]
56 We begin, for each x, by (i) determining—with the help of contour polygonalization—the spatial extent of Sx at the target resolution, giving an ‘overlay mask’ over which we then (ii) generate an ‘overlay patch’ by interpolating depth values from the points gx−1(Sx′) [sent-190, score-0.388]
57 Next, we (iii) populate the SR depth map by merging the interpolated depth values of overlapping overlay patches, with the influence of each valid overlay patch weighted as a function of patch similarity. [sent-192, score-1.711]
58 Finally, we (iv) clean the SR depth map in a postprocessing step, removing small holes that might have arisen at object boundaries as a consequence of polygonalization. [sent-193, score-0.43]
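Before the details of each step, here is a structural sketch of how steps (i) through (iv) could fit together; everything reached through the hypothetical helpers object is an assumed interface, not the authors' implementation.

```python
# Structural sketch of steps (i)-(iv) above; every method on 'helpers' is an
# assumed interface, not the authors' implementation.
import numpy as np

def upscale_and_merge(depth, motions, sr, helpers):
    H, W = depth.shape
    acc = np.zeros((H * sr, W * sr))            # weighted depth sums
    wsum = np.zeros_like(acc)                   # accumulated weights
    for (y, x), g in np.ndenumerate(motions):
        mask = helpers.overlay_mask(y, x, sr)            # (i) polygonalized extent
        patch = helpers.overlay_patch(y, x, g, mask)     # (ii) interpolated depths (SR-sized)
        w = helpers.patch_weight(y, x, g)                # (iii) similarity weight
        ok = mask & ~np.isnan(patch)
        acc[ok] += w * patch[ok]
        wsum[ok] += w
    sr_depth = np.where(wsum > 0, acc / np.maximum(wsum, 1e-12), np.nan)
    return helpers.fill_holes(sr_depth)                  # (iv) postprocess holes
```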
59 The 2D pixels x of the input depth map to which the 3D points of Sx project define the spatial extent of Sx at the input resolution (cf. [sent-195, score-0.477]
60 These are the only pixels that the ‘backward’-transformed points gx−1(Sx′) of the matched patch are allowed to influence, since it is over these pixels that we compute the matching cost. [sent-198, score-0.224]
61 Upscaling the mask by the SR factor using NN interpolation gives a mask at the target resolution, but introduces disturbing jagged edges. [sent-199, score-0.338]
62 We accordingly compute a polygon approximation (cf. Douglas and Peucker [6]) of this NN upscaled mask, constrained such that approximated contours be at a distance of at most the SR factor—corresponding to a single pixel at the input resolution—from the NN upscaled contours. [sent-201, score-0.186]
63 We ignore recovered polygonalized contours whose area is less than or equal to the square of the SR factor, thereby removing flying pixels. [sent-202, score-0.151]
64 This polygonalized mask—to which we refer as the overlay mask of x—consists of all SR pixels that fall into one of the remaining polygonalized contours but fall into no contour that is nested inside another, in order to handle holes like in the lamp in Figure 5. [sent-203, score-0.514]
65 Center: NN upscaling of the depth map and mask by a factor of 2. [sent-207, score-0.682]
66 Right: Corresponding polygon approximation of the NN upscaled mask, which we term the ‘overlay mask’ corresponding to x. [sent-208, score-0.127]
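The overlay-mask construction described before the figure caption could look as follows using OpenCV's Douglas-Peucker routine; a hedged sketch, assuming OpenCV version 4 or later, with the approximation tolerance set to the SR factor and tiny contours dropped as flying pixels.

```python
# Hedged sketch of the overlay-mask construction using OpenCV's Douglas-Peucker
# approximation (assumes OpenCV >= 4); the tolerance is the SR factor, tiny
# contours are dropped as flying pixels, and nested contours are kept as holes.
import cv2
import numpy as np

def overlay_mask(mask_lowres, sr):
    big = cv2.resize(mask_lowres.astype(np.uint8), None, fx=sr, fy=sr,
                     interpolation=cv2.INTER_NEAREST)          # NN upscaling
    contours, hierarchy = cv2.findContours(big, cv2.RETR_CCOMP,
                                           cv2.CHAIN_APPROX_SIMPLE)
    out = np.zeros_like(big)
    if hierarchy is None:
        return out.astype(bool)
    outers, holes = [], []
    for i, cnt in enumerate(contours):
        poly = cv2.approxPolyDP(cnt, sr, True)                 # <= sr px deviation
        if cv2.contourArea(poly) <= sr ** 2:                   # remove flying pixels
            continue
        (outers if hierarchy[0][i][3] == -1 else holes).append(poly)
    if outers:
        cv2.fillPoly(out, outers, 1)
    if holes:
        cv2.fillPoly(out, holes, 0)                            # carve nested holes
    return out.astype(bool)
```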
67 In the merging step, it is only the SR pixels x̂ of the overlay mask of x that the ‘backward’-transformed points gx−1(Sx′) [sent-209, score-0.242]
68 of the matched patch are allowed to influence. [sent-210, score-0.224]
69 We interpolate, for the SR pixels x̂ of the overlay mask corresponding to x, depth values from the ‘backward’-transformed points gx−1(Sx′) [sent-212, score-0.685]
70 Since points transformed according to a rigid body motion in 3D are not guaranteed to project to a regular grid in general, we interpolate over the depth values of these transformed points using barycentric coordinates on a Delaunay triangulation of their projections to image space (cf. [sent-214, score-0.47]
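The interpolation just described (barycentric coordinates on a Delaunay triangulation of the projected points) is exactly what SciPy's griddata performs with method='linear'; a hedged sketch follows, assuming K_sr denotes the calibration matrix at the target resolution.

```python
# The interpolation above (barycentric coordinates on a Delaunay triangulation)
# is what SciPy's griddata does with method='linear'; 'K_sr' is assumed to be
# the calibration matrix at the target resolution.
import numpy as np
from scipy.interpolate import griddata

def overlay_patch_depths(points_3d, K_sr, sr_pixels):
    """Depths of transformed 3D points, interpolated at (M, 2) SR pixel coords."""
    proj = (K_sr @ points_3d.T).T               # project to image space
    xy = proj[:, :2] / proj[:, 2:3]             # perspective division
    z = points_3d[:, 2]                         # depth values to interpolate
    # linear griddata = barycentric interpolation on a Delaunay triangulation;
    # pixels outside the convex hull come back as NaN (candidate holes)
    return griddata(xy, z, sr_pixels, method='linear')
```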
71 The SR depth map is computed by working out, for each SR pixel xˆ, a weighted average of the corresponding interpolated depth values from the overlapping overlay patches. [sent-218, score-1.057]
72 The weight ωx of the interpolated depth values of the overlay patch assigned to x is given by ωx = exp(−γ · cb(x; gx)), where γ ∈ R+ controls the falloff.
73 If cb(x; gx) > β, where β ∈ R+, we instead use the overlay patch at x given by the identity motion, ensuring that patches for which no good match was found do not undergo heavy degradation.
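A minimal sketch of this weighting and fallback; the values of gamma and beta are the paper's tuned constants and are not given in this excerpt.

```python
# Minimal sketch of the weighting and fallback just described; concrete values
# for gamma and beta are not stated here and are the paper's tuned constants.
import numpy as np

def merge_weight(c_b, gamma):
    return np.exp(-gamma * c_b)                 # w_x = exp(-gamma * c_b(x; g_x))

def choose_motion(c_b, g_x, beta):
    """Fall back to the identity motion when the match at x is too poor."""
    identity = (np.eye(3), np.zeros(3))
    return g_x if c_b <= beta else identity
```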
74 Since our polygon approximation guarantees only that the outlines of the polygon be within the SR factor of the outlines of the NN upscaled mask, it is possible that no overlay mask cover a given SR pixel. [sent-226, score-0.566]
75 Another possible cause for holes is if pixels within an overlay mask could not be interpolated owing to the spatial distribution of the projected points. [sent-228, score-0.491]
76 In that event, we dilate within the overlay mask with highest weight, again only over pixels identified as holes. [sent-229, score-0.35]
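As a simplified stand-in for this hole handling (the paper dilates within the highest-weight overlay mask, which requires the per-patch weights), the following sketch fills NaN holes from valid 4-neighbors until nothing fillable remains.

```python
# Hedged stand-in for the hole handling above: the paper dilates within the
# highest-weight overlay mask; this simplified sketch instead fills NaN holes
# from valid 4-neighbors until no fillable holes remain.
import numpy as np

def fill_holes(sr_depth, max_iters=10):
    d = sr_depth.copy()
    for _ in range(max_iters):
        holes = np.isnan(d)
        if not holes.any():
            break
        p = np.pad(d, 1, constant_values=np.nan)
        neigh = np.stack([p[:-2, 1:-1], p[2:, 1:-1], p[1:-1, :-2], p[1:-1, 2:]])
        valid = ~np.isnan(neigh)
        cnt = valid.sum(axis=0)
        s = np.where(valid, neigh, 0.0).sum(axis=0)
        ok = holes & (cnt > 0)
        if not ok.any():
            break
        d[ok] = s[ok] / cnt[ok]
    return d
```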
77 Evaluation. We evaluate our method using depth data from stereo, ToF, laser scans and structured light. [sent-232, score-0.426]
78 Setting appropriate parameters r, β, and γ is largely intuitive upon visualization of the input point cloud, and depends on the scale, density, and relative depth of point features one aims to capture. [sent-238, score-0.335]
79 1, all algorithm parameters were kept identical across Middlebury and laser scan tests, respectively. [sent-240, score-0.105]
80 —downscaled by NN interpolation by a factor of 2 and 4 and subsequently super resolved by the same factor, respectively, which we compare to ground truth. [sent-248, score-0.118]
81 Among depth SR methods that leverage a color or intensity image at the target resolution, we compare against Diebel and Thrun [5] and Yang et al. [sent-250, score-0.472]
82 We compare against NN upscaling to provide a rough baseline, although it introduces jagged edges and does nothing to improve the apparent depth measurement accuracy. [sent-256, score-0.52]
83 Table 1 also gives RMSE scores for three depth maps obtained from laser scans detailed in Mac Aodha et al. [sent-257, score-0.508]
84 , which we downscaled and subsequently super resolved by a factor of 4. [sent-258, score-0.154]
85 For the laser scans we compare to the original resolution since ground truth data was not available. [sent-259, score-0.18]
86 All RMSE and percent error scores were computed on 8-bit disparity maps. [sent-261, score-0.191]
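For reference, the two reported error measures could be computed as in the following sketch; the percent-error tolerance of 1 disparity level is our assumption, as the exact threshold is not stated in this excerpt.

```python
# Sketch of the two reported error measures on 8-bit disparity maps; the
# percent-error tolerance of 1 disparity level is an assumption.
import numpy as np

def rmse(pred, gt):
    e = pred.astype(np.float64) - gt.astype(np.float64)
    return float(np.sqrt(np.mean(e ** 2)))

def percent_error(pred, gt, tol=1.0):
    e = np.abs(pred.astype(np.float64) - gt.astype(np.float64))
    return 100.0 * float(np.mean(e > tol))
```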
87 In percent error, we are the top performer among example-based methods, and on a few occasions outperform the image-guided techniques. [sent-267, score-0.125]
88 Qualitative Evaluation. In Figure 6 we show results on a data set of two similar egg cartons situated at different depths, obtained using the stereo algorithm of Bleyer et al. [sent-270, score-0.221]
89 We see that although our depth map appears pleasing, it in fact remains gently noisy if shaded as a mesh, owing to the great deal of noise in the input. [sent-276, score-0.527]
90 do not succeed in removing visible noise in their depth map, and introduce halo artefacts at the boundaries. [sent-282, score-0.335]
91 [8] on singleimage super resolution for color and intensity images, we presented a tailored depth super resolution algorithm that makes use of only the information contained in the input depth map. [sent-288, score-1.051]
92 In our evaluation, we showed our results to be highly competitive with methods leveraging ancillary data. [sent-290, score-0.123]
93 Kinectfusion: real-time 3D reconstruction and interaction using a moving depth camera. [sent-369, score-0.335]
94 [12] a depth SR method, all of which require an external database. [sent-527, score-0.373]
95 [21] are depth SR methods that use an image at the target resolution. [sent-529, score-0.388]
96 Our method is the top performer among example-based methods and on a few occasions outperforms Diebel and Thrun [5] and Yang et al. [sent-605, score-0.115]
97 2x nearest neighbor upscaling (b) and SR (c-e) on a stereo data set of two similar egg cartons obtained using the method of Bleyer et al. [sent-610, score-0.391]
98 0 ToF data set shown in Figure 2 for 4x nearest neighbor upscaling in (a) and 4x SR otherwise. [sent-620, score-0.233]
99 Below, we show shaded meshes for the preprocessed result of Mac Aodha et al. [sent-624, score-0.141]
100 Note that although we in (i) perform worse than (h) on the vase, we preserve fine detail better and do not introduce square patch artefacts. [sent-626, score-0.189]
wordName wordTfidf (topN-words)
[('depth', 0.335), ('sx', 0.292), ('sr', 0.283), ('aodha', 0.252), ('overlay', 0.242), ('mac', 0.238), ('patch', 0.189), ('patchmatch', 0.184), ('glasner', 0.159), ('upscaling', 0.151), ('cones', 0.126), ('ancillary', 0.123), ('dof', 0.114), ('px', 0.112), ('rmse', 0.112), ('mask', 0.108), ('diebel', 0.09), ('resolution', 0.089), ('thrun', 0.084), ('super', 0.083), ('patches', 0.08), ('tof', 0.079), ('venus', 0.079), ('lesser', 0.077), ('rigid', 0.076), ('tsukuba', 0.076), ('merging', 0.074), ('middlebury', 0.073), ('upscaled', 0.073), ('teddy', 0.072), ('freeman', 0.07), ('gx', 0.064), ('barnes', 0.064), ('perturbation', 0.064), ('situated', 0.063), ('carry', 0.062), ('bit', 0.062), ('cb', 0.062), ('esult', 0.061), ('polygonalized', 0.061), ('backward', 0.061), ('vase', 0.06), ('body', 0.059), ('ssd', 0.058), ('nn', 0.057), ('laser', 0.057), ('percent', 0.057), ('yang', 0.055), ('polygon', 0.054), ('map', 0.053), ('target', 0.053), ('interpolated', 0.052), ('shaded', 0.051), ('bilateral', 0.049), ('scan', 0.048), ('flying', 0.048), ('et', 0.047), ('owing', 0.047), ('noiseless', 0.045), ('neighbor', 0.044), ('preprocessed', 0.043), ('correspondence', 0.043), ('holes', 0.042), ('equal', 0.042), ('cartons', 0.041), ('douglas', 0.041), ('gently', 0.041), ('lidarboost', 0.041), ('patchwise', 0.041), ('sambridge', 0.041), ('schuon', 0.041), ('zooms', 0.041), ('bleyer', 0.041), ('pixel', 0.04), ('guiding', 0.039), ('external', 0.038), ('repetition', 0.038), ('nearest', 0.038), ('superresolution', 0.037), ('cf', 0.037), ('intensity', 0.037), ('variant', 0.037), ('disparity', 0.037), ('downscaled', 0.036), ('geophysical', 0.036), ('stereo', 0.036), ('factor', 0.035), ('matched', 0.035), ('maps', 0.035), ('proceed', 0.035), ('greater', 0.034), ('scans', 0.034), ('jagged', 0.034), ('egg', 0.034), ('besse', 0.034), ('nng', 0.034), ('occasions', 0.034), ('performer', 0.034), ('closer', 0.033), ('filled', 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
Author: unknown-author
Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of highresolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image ’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of Patch- Match. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
2 0.29793793 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
Author: Ju Shen, Sen-Ching S. Cheung
Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms other techniques in the literature.
3 0.19108202 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
Author: Hee Seok Lee, Kuoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, thus we can avoid a huge computation time and obtain improved depth map in the accuracy and resolution as well as highresolution images with reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
4 0.18368454 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
Author: Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi
Abstract: We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image. Specifically, it computes depth for each pixel in the high resolution image using geodesic paths to the pixels whose depths are known from the low resolution one. Though this is closely related to the all-pairs shortest-path problem which has O(n2 log n) complexity, we develop a novel approximation algorithm whose complexity grows linearly with the image size and achieves real-time performance. We compare our algorithm with the state of the art on the benchmark dataset and show that our approach provides more accurate depth upsampling with fewer artifacts. In addition, we show that the proposed algorithm is well suited for upsampling depth images using binary edge maps, an important sensor fusion application.
5 0.17994674 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression
Author: Jianchao Yang, Zhe Lin, Scott Cohen
Abstract: We propose a fast regression model for practical single image super-resolution based on in-place examples, by leveraging two fundamental super-resolution approaches—learning from an external database and learning from self-examples. Our in-place self-similarity refines the recently proposed local self-similarity by proving that a patch in the upper scale image has good matches around its origin location in the lower scale image. Based on the in-place examples, a first-order approximation of the nonlinear mapping function from low- to high-resolution image patches is learned. Extensive experiments on benchmark and real-world images demonstrate that our algorithm can produce natural-looking results with sharp edges and preserved fine details, while the current state-of-the-art algorithms are prone to visual artifacts. Furthermore, our model can easily extend to deal with noise by combining the regression results on multiple in-place examples for robust estimation. The algorithm runs fast and is particularly useful for practical applications, where the input images typically contain diverse textures and they are potentially contaminated by noise or compression artifacts.
6 0.16367911 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
9 0.14455849 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
10 0.13574305 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
11 0.13195311 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors
12 0.12085134 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes
13 0.11955041 244 cvpr-2013-Large Displacement Optical Flow from Nearest Neighbor Fields
14 0.11834791 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
15 0.11657838 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
16 0.11302427 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
17 0.11192789 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera
18 0.10903331 424 cvpr-2013-Templateless Quasi-rigid Shape Modeling with Implicit Loop-Closure
19 0.10481998 393 cvpr-2013-Separating Signal from Noise Using Patch Recurrence across Scales
20 0.10259351 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
topicId topicWeight
[(0, 0.192), (1, 0.19), (2, 0.042), (3, 0.056), (4, -0.046), (5, -0.024), (6, -0.045), (7, 0.102), (8, -0.001), (9, -0.043), (10, -0.035), (11, -0.003), (12, 0.061), (13, 0.156), (14, 0.08), (15, -0.073), (16, -0.232), (17, -0.033), (18, 0.009), (19, -0.032), (20, 0.005), (21, 0.02), (22, -0.009), (23, -0.123), (24, 0.016), (25, 0.026), (26, -0.094), (27, -0.107), (28, 0.031), (29, -0.067), (30, -0.071), (31, -0.053), (32, 0.094), (33, -0.005), (34, -0.003), (35, -0.073), (36, 0.011), (37, -0.008), (38, -0.097), (39, 0.034), (40, 0.079), (41, -0.035), (42, 0.029), (43, 0.047), (44, -0.009), (45, 0.02), (46, -0.01), (47, -0.002), (48, -0.015), (49, 0.051)]
simIndex simValue paperId paperTitle
same-paper 1 0.97577995 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
Author: unknown-author
Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of highresolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image ’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of Patch- Match. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
2 0.84060794 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
Author: Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi
Abstract: We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image. Specifically, it computes depth for each pixel in the high resolution image using geodesic paths to the pixels whose depths are known from the low resolution one. Though this is closely related to the all-pairs shortest-path problem which has O(n2 log n) complexity, we develop a novel approximation algorithm whose complexity grows linearly with the image size and achieves real-time performance. We compare our algorithm with the state of the art on the benchmark dataset and show that our approach provides more accurate depth upsampling with fewer artifacts. In addition, we show that the proposed algorithm is well suited for upsampling depth images using binary edge maps, an important sensor fusion application.
3 0.78832918 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
Author: Ju Shen, Sen-Ching S. Cheung
Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms other techniques in the literature.
4 0.76341802 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, thus we can avoid a huge computation time and obtain an improved depth map in accuracy and resolution as well as high-resolution images in reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
5 0.73875457 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu
Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixel-based phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.
7 0.71513122 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
8 0.65539801 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera
9 0.65190756 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
10 0.6076929 219 cvpr-2013-In Defense of 3D-Label Stereo
11 0.60392189 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
12 0.59677589 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
13 0.5926466 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
14 0.58211559 195 cvpr-2013-HDR Deghosting: How to Deal with Saturation?
15 0.57742107 428 cvpr-2013-The Episolar Constraint: Monocular Shape from Shadow Correspondence
16 0.57463938 166 cvpr-2013-Fast Image Super-Resolution Based on In-Place Example Regression
17 0.55392611 169 cvpr-2013-Fast Patch-Based Denoising Using Approximated Patch Geodesic Paths
18 0.55174291 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation
19 0.54078555 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction
20 0.53208685 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
topicId topicWeight
[(10, 0.125), (16, 0.078), (26, 0.044), (28, 0.015), (33, 0.189), (67, 0.037), (69, 0.06), (87, 0.068), (92, 0.29)]
simIndex simValue paperId paperTitle
same-paper 1 0.75388235 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
Author: unkown-author
Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of highresolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image ’ super resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of Patch- Match. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
2 0.71272224 393 cvpr-2013-Separating Signal from Noise Using Patch Recurrence across Scales
Author: Maria Zontak, Inbar Mosseri, Michal Irani
Abstract: Recurrence of small clean image patches across different scales of a natural image has been successfully used for solving ill-posed problems in clean images (e.g., superresolution from a single image). In this paper we show how this multi-scale property can be extended to solve ill-posed problems under noisy conditions, such as image denoising. While clean patches are obscured by severe noise in the original scale of a noisy image, noise levels drop dramatically at coarser image scales. This allows for the unknown hidden clean patches to “naturally emerge” in some coarser scale of the noisy image. We further show that patch recurrence across scales is strengthened when using directional pyramids (that blur and subsample only in one direction). Our statistical experiments show that for almost any noisy image patch (more than 99%), there exists a “good” clean version of itself at the same relative image coordinates in some coarser scale of the image. This is a strong phenomenon of noise-contaminated natural images, which can serve as a strong prior for separating the signal from the noise. Finally, incorporating this multi-scale prior into a simple denoising algorithm yields state-of-the-art denoising results.
3 0.66026264 29 cvpr-2013-A Video Representation Using Temporal Superpixels
Author: Jason Chang, Donglai Wei, John W. Fisher_III
Abstract: We develop a generative probabilistic model for temporally consistent superpixels in video sequences. In contrast to supervoxel methods, object parts in different frames are tracked by the same temporal superpixel. We explicitly model flow between frames with a bilateral Gaussian process and use this information to propagate superpixels in an online fashion. We consider four novel metrics to quantify performance of a temporal superpixel representation and demonstrate superior performance when compared to supervoxel methods.
4 0.64167804 287 cvpr-2013-Modeling Actions through State Changes
Author: Alireza Fathi, James M. Rehg
Abstract: In this paper we present a model of action based on the change in the state of the environment. Many actions involve similar dynamics and hand-object relationships, but differ in their purpose and meaning. The key to differentiating these actions is the ability to identify how they change the state of objects and materials in the environment. We propose a weakly supervised method for learning the object and material states that are necessary for recognizing daily actions. Once these state detectors are learned, we can apply them to input videos and pool their outputs to detect actions. We further demonstrate that our method can be used to segment discrete actions from a continuous video of an activity. Our results outperform state-of-the-art action recognition and activity segmentation results.
5 0.63235044 27 cvpr-2013-A Theory of Refractive Photo-Light-Path Triangulation
Author: Visesh Chari, Peter Sturm
Abstract: 3D reconstruction of transparent refractive objects like a plastic bottle is challenging: they lack appearance related visual cues and merely reflect and refract light from the surrounding environment. Amongst several approaches to reconstruct such objects, the seminal work of Light-Path triangulation [17] is highly popular because of its general applicability and analysis of minimal scenarios. A lightpath is defined as the piece-wise linear path taken by a ray of light as it passes from source, through the object and into the camera. Transparent refractive objects not only affect the geometric configuration of light-paths but also their radiometric properties. In this paper, we describe a method that combines both geometric and radiometric information to do reconstruction. We show two major consequences of the addition of radiometric cues to the light-path setup. Firstly, we extend the case of scenarios in which reconstruction is plausible while reducing the minimal re- quirements for a unique reconstruction. This happens as a consequence of the fact that radiometric cues add an additional known variable to the already existing system of equations. Secondly, we present a simple algorithm for reconstruction, owing to the nature of the radiometric cue. We present several synthetic experiments to validate our theories, and show high quality reconstructions in challenging scenarios.
6 0.63078701 118 cvpr-2013-Detecting Pulse from Head Motions in Video
7 0.62959856 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
8 0.62958002 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
9 0.62792319 349 cvpr-2013-Reconstructing Gas Flows Using Light-Path Approximation
10 0.62761998 447 cvpr-2013-Underwater Camera Calibration Using Wavelength Triangulation
11 0.62722367 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
12 0.62644184 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems
13 0.62631476 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
14 0.62598324 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
15 0.6259678 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
16 0.6256972 363 cvpr-2013-Robust Multi-resolution Pedestrian Detection in Traffic Scenes
17 0.62566316 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
18 0.6254068 454 cvpr-2013-Video Enhancement of People Wearing Polarized Glasses: Darkening Reversal and Reflection Reduction
19 0.62518734 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
20 0.62510043 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation