cvpr cvpr2013 cvpr2013-397 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of a super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, so we avoid a huge computation time and obtain a depth map improved in both accuracy and resolution, as well as high-resolution images, within reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
Reference: text
sentIndex sentText sentNum sentScore
1 Simultaneous Super-Resolution of Depth and Images using a Single Camera Hee Seok Lee Kyoung Mu Lee Department of ECE, ASRI, Seoul National University, 151-742, Seoul, Korea ultra21@snu. [sent-1, score-0.091]
2 Abstract In this paper, we propose a convex optimization framework for simultaneous estimation of super-resolved depth map and images from a single moving camera. [sent-4, score-0.816]
3 The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. [sent-5, score-0.352]
4 In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. [sent-6, score-0.197]
5 Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. [sent-7, score-0.366]
6 In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. [sent-8, score-0.877]
7 Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, so we avoid a huge computation time and obtain a depth map improved in both accuracy and resolution, as well as high-resolution images, within reasonable time. [sent-9, score-0.872]
8 The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation. [sent-10, score-0.594]
9 Introduction In 3D reconstruction with a single camera, the accuracy of camera pose and scene structure estimation is highly affected by the conditions of input images such as noise, contrast, blur, and resolution. [sent-12, score-0.331]
10 In particular, image resolution is an important factor for achieving sufficient accuracy of various geometry-related computer vision algorithms including 3D reconstruction, since it influences feature detection, localization, and matching. [sent-13, score-0.145]
11 Note that a small measurement error does not bring large errors in object position and camera pose when an object is close to the camera, while it does significantly when the object is far from it. kyoungmu@snu. [sent-15, score-0.303]
12 Therefore, it is necessary to enhance the image resolution to reduce the sensitivity to the image measurement error and achieve reliable and accurate 3D reconstruction. [sent-21, score-0.208]
13 Image super-resolution, the method for enhancing image resolution, has two different approaches: the reconstruction-based approach and the learning-based approach. [sent-22, score-0.101]
14 Therefore, finding accurate pixel-wise correspondences is key to the success of reconstruction-based super-resolution. [sent-25, score-0.138]
15 For general scenes, these correspondences can be obtained up to sub-pixel accuracy using optical flow algorithms. [sent-26, score-0.18]
16 Some iterative methods [4, 7] alternately estimate a high-resolution image and pixel correspondences, and show better results. [sent-28, score-0.139]
17 Note that if we employ the information about the 3D scene geometry, the super-resolution problem can be solved more efficiently since we can directly use it for enhancing the accuracy of the correspondences. [sent-30, score-0.101]
18 That is, with estimated camera poses, the problem of finding pairwise pixel correspondences through an image sequence can be converted into estimating the depth value of corresponding pixels. [sent-31, score-0.798]
19 Although this converted problem has an error source related to the camera pose error, because it is cast in a much lower-dimensional solution space than the original pairwise correspondence problem, it can be solved much more easily and quickly. [sent-32, score-0.305]
20 Therefore, depth reconstruction and super-resolution problems are interrelated and boost each other’s accuracy. [sent-33, score-0.555]
21 So, in this work, we combine the depth estimation and the high-resolution image estimation in a unified framework, and propose a simultaneous solution to both problems. [sent-34, score-0.75]
22 In the proposed method, the depth estimation and image super-resolution are formulated with a single convex energy function, which consists of a data term and a regularization term. [sent-35, score-0.848]
23 The solution is estimated by convex optimization of the energy function. [sent-36, score-0.399]
24 Although both pixel correspondences (re-parameterized by depth) and the high-resolution image are estimated, the computational cost is not expensive compared to conventional high-resolution image estimation alone, because we do not use alternating methods such as EM. [sent-37, score-0.489]
25 Additionally, due to the simultaneous estimation of depth and high-resolution image, the results of the two problems are greatly enhanced. [sent-38, score-0.591]
26 Related works In this section, we review some works that are similar to ours in combining 3D reconstruction and super-resolution. [sent-40, score-0.126]
27 Then, we discuss the works on the primal-dual algorithm for 3D reconstruction or super-resolution. [sent-41, score-0.126]
28 3D reconstruction and image super-resolution In [1, 9, 14, 5], the close relationship between super-resolution and 3D scene structure is pointed out and their cooperative solution is studied. [sent-44, score-0.406]
29 Occlusions are effectively handled in their super-resolution method using depth information, but super-resolution does not contribute to depth map estimation in this method. [sent-46, score-1.017]
30 In [14], a method for increasing the accuracy of 3D video reconstruction using multiple static cameras is presented. [sent-47, score-0.126]
31 The 3D video is composed of texture images and 3D shapes, and increasing their accuracy is achieved by simultaneous super-resolution using MRF formulation and graph-cuts. [sent-48, score-0.216]
32 High-quality texture and 3D reconstruction are presented in [5], where the texture and shape of a 3D model are alternately estimated with a joint energy functional. [sent-49, score-0.468]
33 Compared to [5], our work has a more challenging setting in which neither accurate camera motions nor initial pixel correspondences are available. [sent-50, score-0.392]
34 The authors formulate a full-frame super-resolution problem combined with a depth map estimation problem, and attempt to enhance the results of both problems. [sent-52, score-0.75]
35 However, their solution is not fully simultaneous but follows an EM-style alternating method instead. [sent-53, score-0.254]
36 They fix the current high-resolution image for the estimation of the depth map, and vice versa. [sent-54, score-0.46]
37 Graph-cut and iterated conditional modes (ICM) are used for the depth and high-resolution image estimation, respectively, at each iteration, which results in an inevitably large computation cost. [sent-55, score-0.491]
38 In contrast, we search for the globally optimal solution directly with a single convex energy function and achieve a very fast optimization speed for dense real-time 3D reconstruction. [sent-56, score-0.421]
39 Primal-dual algorithm for 3D reconstruction and super-resolution The formulation of our algorithm is based on the variational approach, especially the primal-dual algorithm [2, 3, 6]. [sent-59, score-0.167]
40 The first-order primal-dual algorithm is a very effective tool for convex variational problems due to its parallelizable characteristics. [sent-60, score-0.164]
41 The first-order primal-dual algorithm has been applied recently for the 3D reconstruction and super-resolution problems. [sent-62, score-0.126]
42 In [10] and [13], a dense 3D reconstruction is studied and its real-time implementations are demonstrated. [sent-63, score-0.126]
43 They used conventional energy functions consisting of a photometric-consistency data term and an L1 or Huber-norm smoothness term, but achieved a breakthrough performance in computation time using the primal-dual algorithm combined with a GPGPU implementation. [sent-64, score-0.43]
44 The reconstruction-based super-resolution is formulated by image downsampling, blurring, and warping, and then the latent high-resolution image is estimated with the Huber norm regularization. [sent-66, score-0.179]
45 This method achieves fast computation of high-quality super-resolution comparable to other methods, but has certain limitations: highly accurate initial image warping is required, and no updating procedure is involved in estimating the super-resolution. [sent-67, score-0.15]
46 Our novel combined 3D reconstruction and super-resolution algorithm is also formulated in the first-order primal-dual framework. [sent-68, score-0.344]
47 However, unlike [10] and [13], the proposed super-resolution combined framework enables more accurate depth map estimation with respect to its resolution. [sent-69, score-0.599]
48 Our image super-resolution is also accelerated by finding pixel correspondences in a depth domain instead of optical flows between images with the help of camera geometry obtained from the 3D reconstruction. [sent-70, score-0.746]
49 Model In this work, we propose a new energy function for a simultaneous estimation of depth map and high-resolution image. [sent-72, score-0.819]
50 The inputs are an M × N low-resolution image sequence Ij ∈ R^{MN} and their corresponding camera poses Pj ∈ SE(3), with j ∈ {0, . . . , J}. [sent-73, score-0.125]
51 Let g ∈ R^{s²MN} be the latent super-resolution image in gray scale, and d ∈ R^{s²MN} be the latent inverse depth map, where s is the predefined upscale factor. [sent-77, score-0.58]
52 The solution of g and d is estimated with respect to the reference view P0. [sent-78, score-0.205]
53 The energy function to solve this problem is composed of the data cost Edata based on the photometric constancy and the regularization cost Ereg for smoothing undesirable artifacts. [sent-79, score-0.621]
54 [Figure: the low-resolution image sequence Ij and the super-resolution image g, induced by the depth map d. The photometric consistency should hold for Ij and the simulated low-resolution image D ∗ B ∗ g.] [sent-80, score-0.608]
55 With the parameter λ, which controls the degree of regularization, the energy function has the form E(g, d) = Ereg + λEdata. [sent-81, score-0.139]
56 Data cost We start with the relationship between the high-resolution image g for the reference image I0 and the low-resolution image Ij from an adjacent view. [sent-85, score-0.401]
57 With the camera internal parameter K, including the focal length and the principal point, the reprojected 3D position X of the pixel (x, y) in I0 with inverse depth d(x, y) by the reference camera P0 is given by X = (1/d(x, y)) K^{-1} (x, y, 1)^T, [sent-86, score-0.875]
58 and its projection to the adjacent view with Pj is calculated as h(K P_{j,0} (1/d(x, y)) K^{-1} (x, y, 1)^T), where h(·) denotes dehomogenization. [sent-87, score-0.089]
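As an aside (ours, not the paper's code), a minimal NumPy sketch of this projection chain, assuming K is the 3×3 intrinsic matrix and the relative pose P_{j,0} is given as a rotation R and translation t; the function name is hypothetical:

```python
import numpy as np

def project_to_adjacent(x, y, inv_depth, K, R, t):
    """Map pixel (x, y) of the reference view with inverse depth d(x, y)
    into the adjacent view: h(K P_{j,0} (1/d) K^{-1} (x, y, 1)^T)."""
    # Back-project to 3D: X = (1/d) K^{-1} (x, y, 1)^T
    X = np.linalg.inv(K) @ np.array([x, y, 1.0]) / inv_depth
    # Rigid transform into the adjacent camera frame, then project with K
    p = K @ (R @ X + t)
    return p[:2] / p[2]  # dehomogenization h(.)
```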
59 For notational simplicity, the non-bold characters g and d are used for the pixel-wise values g(x, y) and d(x, y), respectively; the same convention applies to their corresponding dual variables later. [sent-92, score-0.122]
60 We define the image warping W(Ij, d), which transforms the image Ij to the reference image I0, using the pixel projection and reprojection discussed above: W(Ij, d)(x, y) = Ij( h( K P_{j,0} (1/d) K^{-1} (x, y, 1)^T ) ). (1) [sent-93, score-0.224]
61 Then, by the photometric consistency between the reference image and the adjacent image, the equation I0(x, y) = W(Ij, d)(x, y) (2) holds. [sent-95, score-0.227]
62 By incorporating the image resolution degradation model, the equation (D ∗ B ∗ g)(x, y) = I0(x, y) = W(Ij, d)(x, y) (3) also holds for all j ∈ {0, . . . , J}. [sent-100, score-0.156]
63 Here, D and B are the downsampling and blurring operators, respectively. [sent-104, score-0.18]
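For illustration, a hedged sketch of the degradation D ∗ B ∗ g (Gaussian blur, then s-fold subsampling) and of the backward warp W(Ij, d); the blur width sigma and the bilinear sampler are our assumptions, not details fixed by this excerpt:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def degrade(g, s, sigma):
    """Simulate D * B * g: blur the high-resolution image, then downsample."""
    return gaussian_filter(g, sigma)[::s, ::s]

def warp(I_j, inv_depth, K, R, t):
    """W(I_j, d): sample I_j at the pixel each reference pixel projects to."""
    h, w = inv_depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1).astype(float)
    X = np.linalg.inv(K) @ pix / inv_depth.ravel()   # back-projection to 3D
    p = K @ (R @ X + t[:, None])
    u, v = p[0] / p[2], p[1] / p[2]                  # dehomogenization
    return map_coordinates(I_j, [v.reshape(h, w), u.reshape(h, w)], order=1)
```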
64 To find the optimized value of d through an iterative update, we apply the first-order Taylor expansion to W(Ij, d) to approximate the change in the image W(Ij, d) with respect to a small change of depth around the initial value d0: W(Ij, d) ≈ W(Ij, d0) + (d − d0) ∂W(Ij, d)/∂d |_{d=d0}. [sent-110, score-0.494]
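Displayed (the first line is the expansion the text describes; the residual on the second line is our reading of how it enters the data cost via Eq. (3), not a verbatim quote):

```latex
W(I_j, d) \approx W(I_j, d_0) + (d - d_0)\,
    \left.\frac{\partial W(I_j, d)}{\partial d}\right|_{d = d_0}, \qquad
\rho_j(g, d) = D * B * g - W(I_j, d_0) - (d - d_0)\,
    \left.\frac{\partial W(I_j, d)}{\partial d}\right|_{d = d_0}.
```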
65 The blur kernel B is predefined as a simple Gaussian blur model, with a standard deviation and kernel size determined by the upscale factor s. [sent-120, score-0.176]
66 Figure 2 shows an example of the convexity of the data cost ρ(g, d) for different image points. [sent-129, score-0.116]
67 The cost function is clearly convex, but its shape varies from image point to image point according to the image gradient. [sent-130, score-0.116]
68 In a low-texture region, the data cost is dominated by the high-resolution intensity g rather than by the depth d. [sent-131, score-0.576]
69 Therefore, regularization is required to get a more plausible solution for depth d. [sent-132, score-0.567]
70 Regularization For the image intensity g and the inverse depth d, we use a Huber-norm-based regularization to obtain a smoothed yet discontinuity-preserving result. [sent-141, score-0.665]
71 By combining the data cost (8) and the regularization (9), we get our objective energy function E(g, d): E(g, d) = |∇g|_{εg} + αd |∇d|_{εd} + λ Σj |ρj(g, d)|, where | · |_ε denotes the Huber norm and αd weights the depth regularization. (10) [sent-150, score-0.363]
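A minimal sketch of evaluating an energy of this form; the Huber thresholds eps_g, eps_d and the callables warps (one per image, each mapping d to the linearized W(Ij, d)) and degrade (the D ∗ B ∗ operator) are placeholders of ours:

```python
import numpy as np

def huber(x, eps):
    """Huber penalty: quadratic below eps, linear above."""
    a = np.abs(x)
    return np.where(a <= eps, a * a / (2 * eps), a - eps / 2)

def grad(u):
    """Forward-difference gradient with a replicated border."""
    gx = np.diff(u, axis=1, append=u[:, -1:])
    gy = np.diff(u, axis=0, append=u[-1:, :])
    return gx, gy

def energy(g, d, warps, degrade, lam, alpha_d, eps_g, eps_d):
    """E(g, d) = E_reg + lam * E_data, cf. Eq. (10)."""
    gx, gy = grad(g)
    dx, dy = grad(d)
    E_reg = (huber(np.hypot(gx, gy), eps_g).sum()
             + alpha_d * huber(np.hypot(dx, dy), eps_d).sum())
    E_data = sum(np.abs(degrade(g) - W_j(d)).sum() for W_j in warps)
    return E_reg + lam * E_data
```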
72 In the next section, we describe the solution of this energy function. [sent-156, score-0.218]
73 Initial depth estimation In the data cost (8), the first-order Taylor expansion, which can only handle a small update for g and d, is applied. [sent-160, score-0.576]
74 The initial value of g can be easily obtained by upscaling the input image at the reference view using simple bicubic interpolation. [sent-162, score-0.319]
75 However, the initial value of d should be estimated using the low-resolution input sequence. [sent-163, score-0.113]
76 The cost function for initial depth estimation is easily obtained from Eq. [sent-164, score-0.644]
77 (8) and (10) by replacing B ∗ g and Îj with the low-resolution images I0 and Ij, respectively, and removing the regularization on g. [sent-165, score-0.108]
78 The resulting energy function for the low-resolution depth map ď is E(ď) = αd |∇ď|_{εd} + λ Σj |I0 − W(Ij, ď)|. (11) [sent-166, score-0.139]
79 Equation (11) is actually a conventional formulation for depth map estimation. [sent-173, score-0.519]
80 The optimization of this energy function is almost identical to the optimization of Eq. [sent-174, score-0.249]
81 (10), which will be explained below, so the optimization of (11) is skipped here. [sent-175, score-0.101]
82 Thus, a coarse-to-fine approach is used to approach the global optimum of d gradually, starting from an arbitrary initial solution. [sent-178, score-0.135]
83 The depth result obtained at the finest level is upscaled using bicubic interpolation and is fed to the optimization of (10) as an initial value. [sent-182, score-0.723]
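Schematically, this coarse-to-fine initialization might look as follows; n_levels, upscale, and the per-level solver solve_depth (a stand-in for the primal-dual minimizer of Eq. (11)) are placeholders:

```python
import numpy as np
from scipy.ndimage import zoom

def resize_to(u, shape, order=1):
    """Resample u to the given shape (order=3 gives cubic interpolation)."""
    return zoom(u, (shape[0] / u.shape[0], shape[1] / u.shape[1]), order=order)

def coarse_to_fine_depth(images, poses, n_levels, upscale, solve_depth):
    """Refine an inverse depth map from the coarsest pyramid level upward,
    starting from an arbitrary constant initialization."""
    d = None
    for level in reversed(range(n_levels)):               # coarsest first
        lvl = [zoom(I, 0.5 ** level, order=1) for I in images]
        d = np.ones_like(lvl[0]) if d is None else resize_to(d, lvl[0].shape)
        d = solve_depth(lvl, poses, d)
    # Cubic upscaling of the finest-level result initializes Eq. (10)
    return resize_to(d, (d.shape[0] * upscale, d.shape[1] * upscale), order=3)
```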
84 High-resolution image and depth estimation Now we will describe a solution of Eq. [sent-185, score-0.539]
85 By interpreting our objective function (10) as a primal-dual formulation, we can rewrite it as a generic saddle-point problem with the dual variables p and q, which correspond to g and d, respectively: min_{g,d} max_{p,q} ⟨∇g, p⟩ + ⟨∇d, q⟩ − δP(p) − δQ(q) + λ Edata(g, d). (12) [sent-187, score-0.122]
86 where the operator ∇∗, the conjugate of ∇ given by ∇∗ = −div, computes the divergence [2], and ḡ and d̄ are intermediate variables introduced for the convergence of the algorithm. [sent-214, score-0.155]
87 The operators R_{p,q} and R_{g,d} are the resolvent operators that search for lower energy values using subgradients. [sent-217, score-0.529]
88 The resolvent operators will be discussed in more detail. [sent-219, score-0.313]
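The overall iteration then follows the familiar first-order primal-dual pattern of [2]; in this schematic sketch of ours, the step sizes sigma and tau and the resolvents R_pq, R_gd are supplied externally, div stands for −∇∗, and R_pq is applied to each dual field separately:

```python
def primal_dual(g, d, grad, div, R_pq, R_gd, sigma, tau, n_iters):
    """Generic saddle-point iteration: dual ascent, primal descent,
    then over-relaxation of the intermediate variables g_bar, d_bar."""
    p = tuple(0.0 * c for c in grad(g))
    q = tuple(0.0 * c for c in grad(d))
    g_bar, d_bar = g.copy(), d.copy()
    for _ in range(n_iters):
        # Dual step on the gradients of the extrapolated primals
        p = R_pq(*(pi + sigma * gi for pi, gi in zip(p, grad(g_bar))))
        q = R_pq(*(qi + sigma * gi for qi, gi in zip(q, grad(d_bar))))
        # Primal step using the divergence of the dual variables
        g_new, d_new = R_gd(g + tau * div(*p), d + tau * div(*q))
        # Over-relaxation (theta = 1) for convergence
        g_bar, d_bar = 2 * g_new - g, 2 * d_new - d
        g, d = g_new, d_new
    return g, d
```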
89 Our regularization term in (10) is a typical form used in [2]. [sent-220, score-0.108]
90 Thus, the resolvent operator of the dual variables is a pixelwise projection, R_{p,q}(p, q) = ( p / max(1, |p|), q / max(1, |q|) ). (14) [sent-221, score-0.469]
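Written as code, one common form of this projection; the Huber case rescales by 1/(1 + σε) before projecting, following [2] (the exact constants are not given in this excerpt):

```python
import numpy as np

def project_dual(px, py, sigma=0.0, eps=0.0):
    """Resolvent of one dual field: Huber rescaling, then p <- p / max(1, |p|)."""
    px, py = px / (1.0 + sigma * eps), py / (1.0 + sigma * eps)
    n = np.maximum(1.0, np.hypot(px, py))
    return px / n, py / n
```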
91 On the other hand, the data cost differs from the standard form in previous applications of the primal-dual algorithm. [sent-224, score-0.116]
92 This difference comes from the summation of absolute values in the data cost over the image sequence. [sent-225, score-0.116]
93 Since we use an L1 norm for the difference between two images, there are some critical (non-differentiable) points in their summation. [sent-226, score-0.177]
94 Therefore, this non-differentiability should be handled in the optimization procedure. [sent-227, score-0.101]
95 The minimization of a similar cost function is introduced in [13], but the solution space of [13] covers the depth map only, so the minimization can be achieved efficiently by evaluating and sorting all critical points. [sent-228, score-0.922]
96 On the other hand, the solution space of our problem is composed of the depth map and the image intensity, so there are J² critical points. [sent-229, score-0.696]
97 Searching them is not straightforward, and thus optimization by evaluating and sorting critical points is inefficient. [sent-230, score-0.217]
98 Instead, general gradient descent and critical-point searching are combined to accelerate the minimization procedure. [sent-231, score-0.201]
99 We divide the domain of the resolvent operator based on the cost ρ and the magnitude of its gradient ‖∇ρ‖², [sent-248, score-0.463]
100 and apply either the gradient descent search or the critical point search: R_{g,d}(g, d) = (g, d) − τλ∇ρ if ρ > τλ‖∇ρ‖², (g, d) + τλ∇ρ if ρ < −τλ‖∇ρ‖², and the jump to the critical point otherwise. [sent-250, score-0.103]
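The piecewise definition above is garbled in this extraction; its structure matches the standard thresholding step for an L1 data term (cf. [13]), which we sketch here for a single residual ρ with partial derivatives (ρ_g, ρ_d). Handling the sum over all J images is where the combined gradient-descent/critical-point strategy of the text comes in:

```python
import numpy as np

def primal_resolvent(g, d, rho, rho_g, rho_d, tau, lam):
    """Take a fixed gradient step while the residual is large; otherwise
    jump to the critical (non-differentiable) point rho = 0."""
    g2 = rho_g ** 2 + rho_d ** 2
    step = np.where(rho > tau * lam * g2, -tau * lam,
           np.where(rho < -tau * lam * g2, tau * lam,
                    -rho / np.maximum(g2, 1e-12)))
    return g + step * rho_g, d + step * rho_d
```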
wordName wordTfidf (topN-words)
[('depth', 0.38), ('ij', 0.254), ('ijd', 0.236), ('resolvent', 0.236), ('huber', 0.18), ('energy', 0.139), ('correspondences', 0.138), ('simultaneous', 0.131), ('reconstruction', 0.126), ('bicubic', 0.126), ('camera', 0.125), ('downsampling', 0.121), ('cost', 0.116), ('operator', 0.111), ('highresolution', 0.111), ('regularization', 0.108), ('jd', 0.108), ('superresolution', 0.108), ('gpgpu', 0.105), ('critical', 0.103), ('enhancing', 0.101), ('upscale', 0.097), ('photometric', 0.097), ('resolution', 0.093), ('rmn', 0.092), ('ereg', 0.092), ('map', 0.089), ('edata', 0.087), ('seoul', 0.087), ('warping', 0.082), ('reference', 0.081), ('convex', 0.081), ('estimation', 0.08), ('solution', 0.079), ('alternately', 0.078), ('dual', 0.078), ('operators', 0.077), ('norm', 0.074), ('measurement', 0.072), ('initial', 0.068), ('blur', 0.068), ('optimum', 0.067), ('sgn', 0.066), ('holds', 0.063), ('inverse', 0.063), ('pixel', 0.061), ('formulated', 0.06), ('sorting', 0.059), ('blurring', 0.059), ('dw', 0.058), ('nu', 0.057), ('optimization', 0.055), ('lps', 0.052), ('casted', 0.052), ('dehomogenization', 0.052), ('seok', 0.052), ('upscaled', 0.052), ('ofvarious', 0.052), ('primaldual', 0.052), ('signum', 0.052), ('combined', 0.05), ('conventional', 0.05), ('adjacent', 0.049), ('converted', 0.049), ('ult', 0.049), ('kyoungmu', 0.049), ('interrelated', 0.049), ('minimization', 0.048), ('expansion', 0.046), ('handled', 0.046), ('pj', 0.046), ('taylor', 0.046), ('skipped', 0.046), ('composed', 0.045), ('estimated', 0.045), ('variables', 0.044), ('alternating', 0.044), ('lowresolution', 0.044), ('asri', 0.044), ('kyoung', 0.044), ('upscaling', 0.044), ('enhance', 0.043), ('derivatives', 0.043), ('optical', 0.042), ('quence', 0.042), ('breakthrough', 0.042), ('tively', 0.042), ('untextured', 0.042), ('snu', 0.042), ('odometry', 0.042), ('parallelizable', 0.042), ('interpolation', 0.042), ('variational', 0.041), ('predefined', 0.04), ('calculated', 0.04), ('reprojected', 0.04), ('cv', 0.04), ('intensity', 0.04), ('texture', 0.04), ('icm', 0.039)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999964 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of a super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, so we avoid a huge computation time and obtain a depth map improved in both accuracy and resolution, as well as high-resolution images, within reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
2 0.32427394 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
Author: Ju Shen, Sen-Ching S. Cheung
Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms other techniques in the literature.
3 0.2879453 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: Motion blur frequently occurs in dense 3D reconstruction using a single moving camera, and it degrades the quality of the 3D reconstruction. To handle motion blur caused by rapid camera shakes, we propose a blur-aware depth reconstruction method, which utilizes a pixel correspondence that is obtained by considering the effect of motion blur. Motion blur is dependent on 3D geometry, thus parameterizing blurred appearance of images with scene depth given camera motion is possible and a depth map can be accurately estimated from the blur-considered pixel correspondence. The estimated depth is then converted intopixel-wise blur kernels, and non-uniform motion blur is easily removed with low computational cost. The obtained blur kernel is depth-dependent, thus it effectively addresses scene-depth variation, which is a challenging problem in conventional non-uniform deblurring methods.
4 0.24573748 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation
Author: Christian Häne, Christopher Zach, Andrea Cohen, Roland Angst, Marc Pollefeys
Abstract: Both image segmentation and dense 3D modeling from images represent an intrinsically ill-posed problem. Strong regularizers are therefore required to constrain the solutions from being ’too noisy’. Unfortunately, these priors generally yield overly smooth reconstructions and/or segmentations in certain regions whereas they fail in other areas to constrain the solution sufficiently. In this paper we argue that image segmentation and dense 3D reconstruction contribute valuable information to each other’s task. As a consequence, we propose a rigorous mathematical framework to formulate and solve a joint segmentation and dense reconstruction problem. Image segmentations provide geometric cues about which surface orientations are more likely to appear at a certain location in space whereas a dense 3D reconstruction yields a suitable regularization for the segmentation problem by lifting the labeling from 2D images to 3D space. We show how appearance-based cues and 3D surface orientation priors can be learned from training data and subsequently used for class-specific regularization. Experimental results on several real data sets highlight the advantages of our joint formulation.
5 0.21874438 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors
Author: Amaury Dame, Victor A. Prisacariu, Carl Y. Ren, Ian Reid
Abstract: We propose a formulation of monocular SLAM which combines live dense reconstruction with shape priors-based 3D tracking and reconstruction. Current live dense SLAM approaches are limited to the reconstruction of visible surfaces. Moreover, most of them are based on the minimisation of a photo-consistency error, which usually makes them sensitive to specularities. In the 3D pose recovery literature, problems caused by imperfect and ambiguous image information have been dealt with by using prior shape knowledge. At the same time, the success of depth sensors has shown that combining joint image and depth information drastically increases the robustness of the classical monocular 3D tracking and 3D reconstruction approaches. In this work we link dense SLAM to 3D object pose and shape recovery. More specifically, we automatically augment our SLAM system with object specific identity, together with 6D pose and additional shape degrees of freedom for the object(s) of known class in the scene, combining image data and depth information for the pose and shape recovery. This leads to a system that allows for full scaled 3D reconstruction with the known object(s) segmented from the scene. The segmentation enhances the clarity, accuracy and completeness of the maps built by the dense SLAM system, while the dense 3D data aids the segmentation process, yielding faster and more reliable convergence than when using 2D image data alone.
7 0.19108202 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
8 0.18223678 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
9 0.16619553 307 cvpr-2013-Non-uniform Motion Deblurring for Bilayer Scenes
10 0.16493264 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration
11 0.16163173 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
12 0.1578642 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
13 0.14581124 380 cvpr-2013-Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
14 0.14530461 113 cvpr-2013-Dense Variational Reconstruction of Non-rigid Surfaces from Monocular Video
15 0.13873656 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
16 0.13833885 205 cvpr-2013-Hollywood 3D: Recognizing Actions in 3D Natural Scenes
17 0.13477936 24 cvpr-2013-A Principled Deep Random Field Model for Image Segmentation
18 0.12959962 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
19 0.12616992 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera
20 0.12546501 357 cvpr-2013-Revisiting Depth Layers from Occlusions
topicId topicWeight
[(0, 0.244), (1, 0.287), (2, 0.022), (3, 0.081), (4, -0.06), (5, 0.017), (6, -0.037), (7, 0.097), (8, 0.018), (9, -0.007), (10, -0.024), (11, 0.012), (12, -0.01), (13, 0.164), (14, 0.02), (15, -0.079), (16, -0.203), (17, 0.089), (18, -0.037), (19, -0.06), (20, -0.051), (21, -0.103), (22, -0.06), (23, -0.055), (24, 0.089), (25, 0.046), (26, -0.089), (27, 0.062), (28, 0.004), (29, -0.08), (30, -0.013), (31, 0.026), (32, 0.026), (33, -0.077), (34, -0.024), (35, -0.061), (36, 0.048), (37, 0.03), (38, -0.047), (39, -0.013), (40, 0.104), (41, -0.046), (42, -0.061), (43, 0.005), (44, -0.091), (45, -0.058), (46, -0.018), (47, -0.001), (48, 0.008), (49, 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 0.9845584 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of a super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, so we avoid a huge computation time and obtain a depth map improved in both accuracy and resolution, as well as high-resolution images, within reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
2 0.86586285 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
Author: Ju Shen, Sen-Ching S. Cheung
Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms other techniques in the literature.
3 0.83890957 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images
Author: Ming-Yu Liu, Oncel Tuzel, Yuichi Taguchi
Abstract: We propose an algorithm utilizing geodesic distances to upsample a low resolution depth image using a registered high resolution color image. Specifically, it computes depth for each pixel in the high resolution image using geodesic paths to the pixels whose depths are known from the low resolution one. Though this is closely related to the all-pairs shortest-path problem which has O(n² log n) complexity, we develop a novel approximation algorithm whose complexity grows linearly with the image size and achieve realtime performance. We compare our algorithm with the state of the art on the benchmark dataset and show that our approach provides more accurate depth upsampling with fewer artifacts. In addition, we show that the proposed algorithm is well suited for upsampling depth images using binary edge maps, an important sensor fusion application.
4 0.83121526 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
Author: unknown-author
Abstract: We tackle the problem of jointly increasing the spatial resolution and apparent measurement accuracy of an input low-resolution, noisy, and perhaps heavily quantized depth map. In stark contrast to earlier work, we make no use of ancillary data like a color image at the target resolution, multiple aligned depth maps, or a database of high-resolution depth exemplars. Instead, we proceed by identifying and merging patch correspondences within the input depth map itself, exploiting patchwise scene self-similarity across depth such as repetition of geometric primitives or object symmetry. While the notion of ‘single-image’ super-resolution has successfully been applied in the context of color and intensity images, we are to our knowledge the first to present a tailored analogue for depth images. Rather than reason in terms of patches of 2D pixels as others have before us, our key contribution is to proceed by reasoning in terms of patches of 3D points, with matched patch pairs related by a respective 6 DoF rigid body motion in 3D. In support of obtaining a dense correspondence field in reasonable time, we introduce a new 3D variant of PatchMatch. A third contribution is a simple, yet effective patch upscaling and merging technique, which predicts sharp object boundaries at the target resolution. We show that our results are highly competitive with those of alternative techniques leveraging even a color image at the target resolution or a database of high-resolution depth exemplars.
5 0.82770681 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
Author: Zhe Yang, Zhiwei Xiong, Yueyi Zhang, Jiao Wang, Feng Wu
Abstract: This paper proposes novel density modulated binary patterns for depth acquisition. Similar to Kinect, the illumination patterns do not need a projector for generation and can be emitted by infrared lasers and diffraction gratings. Our key idea is to use the density of light spots in the patterns to carry phase information. Two technical problems are addressed here. First, we propose an algorithm to design the patterns to carry more phase information without compromising the depth reconstruction from a single captured image as with Kinect. Second, since the carried phase is not strictly sinusoidal, the depth reconstructed from the phase contains a systematic error. We further propose a pixel-based phase matching algorithm to reduce the error. Experimental results show that the depth quality can be greatly improved using the phase carried by the density of light spots. Furthermore, our scheme can achieve 20 fps depth reconstruction with GPU assistance.
7 0.74109185 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
8 0.7334975 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
9 0.72315687 428 cvpr-2013-The Episolar Constraint: Monocular Shape from Shadow Correspondence
10 0.72220576 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors
11 0.71900523 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation
12 0.70585793 307 cvpr-2013-Non-uniform Motion Deblurring for Bilayer Scenes
13 0.702362 354 cvpr-2013-Relative Volume Constraints for Single View 3D Reconstruction
14 0.69968319 423 cvpr-2013-Template-Based Isometric Deformable 3D Reconstruction with Sampling-Based Focal Length Self-Calibration
15 0.69884765 407 cvpr-2013-Spatio-temporal Depth Cuboid Similarity Feature for Activity Recognition Using Depth Camera
16 0.69446337 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
17 0.6809265 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
18 0.661955 219 cvpr-2013-In Defense of 3D-Label Stereo
19 0.65970713 196 cvpr-2013-HON4D: Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences
20 0.63116574 41 cvpr-2013-An Iterated L1 Algorithm for Non-smooth Non-convex Optimization in Computer Vision
topicId topicWeight
[(10, 0.13), (16, 0.054), (26, 0.051), (33, 0.317), (39, 0.219), (67, 0.04), (69, 0.024), (87, 0.1)]
simIndex simValue paperId paperTitle
1 0.94625133 381 cvpr-2013-Scene Parsing by Integrating Function, Geometry and Appearance Models
Author: Yibiao Zhao, Song-Chun Zhu
Abstract: Indoor functional objects exhibit large view and appearance variations, thus are difficult to be recognized by the traditional appearance-based classification paradigm. In this paper, we present an algorithm to parse indoor images based on two observations: i) The functionality is the most essential property to define an indoor object, e.g. “a chair to sit on”; ii) The geometry (3D shape) of an object is designed to serve its function. We formulate the nature of the object function into a stochastic grammar model. This model characterizes a joint distribution over the function-geometry-appearance (FGA) hierarchy. The hierarchical structure includes a scene category, functional groups, functional objects, functional parts and 3D geometric shapes. We use a simulated annealing MCMC algorithm to find the maximum a posteriori (MAP) solution, i.e. a parse tree. We design four data-driven steps to accelerate the search in the FGA space: i) group the line segments into 3D primitive shapes, ii) assign functional labels to these 3D primitive shapes, iii) fill in missing objects/parts according to the functional labels, and iv) synthesize 2D segmentation maps and verify the current parse tree by the Metropolis-Hastings acceptance probability. The experimental results on several challenging indoor datasets demonstrate the proposed approach not only significantly widens the scope of indoor scene parsing algorithms from segmentation and 3D recovery to functional object recognition, but also yields improved overall performance.
2 0.93456334 240 cvpr-2013-Keypoints from Symmetries by Wave Propagation
Author: Samuele Salti, Alessandro Lanza, Luigi Di_Stefano
Abstract: The paper conjectures and demonstrates that repeatable keypoints based on salient symmetries at different scales can be detected by a novel analysis grounded on the wave equation rather than the heat equation underlying traditional Gaussian scale–space theory. While the image structures found by most state-of-the-art detectors, such as blobs and corners, occur typically on planar highly textured surfaces, salient symmetries are widespread in diverse kinds of images, including those related to untextured objects, which are hardly dealt with by current feature-based recognition pipelines. We provide experimental results on standard datasets and also contribute with a new dataset focused on untextured objects. Based on the positive experimental results, we hope to foster further research on the promising topic ofscale invariant analysis through the wave equation.
3 0.93236661 136 cvpr-2013-Discriminatively Trained And-Or Tree Models for Object Detection
Author: Xi Song, Tianfu Wu, Yunde Jia, Song-Chun Zhu
Abstract: This paper presents a method of learning reconfigurable And-Or Tree (AOT) models discriminatively from weakly annotated data for object detection. To explore the appearance and geometry space of latent structures effectively, we first quantize the image lattice using an overcomplete set of shape primitives, and then organize them into a directed acyclic And-Or Graph (AOG) by exploiting their compositional relations. We allow overlaps between child nodes when combining them into a parent node, which is equivalent to introducing an appearance Or-node implicitly for the overlapped portion. The learning of an AOT model consists of three components: (i) Unsupervised sub-category learning (i.e., branches of an object Or-node) with the latent structures in AOG being integrated out. (ii) Weakly-supervised part configuration learning (i.e., seeking the globally optimal parse trees in AOG for each sub-category). To search the globally optimal parse tree in AOG efficiently, we propose a dynamic programming (DP) algorithm. (iii) Joint appearance and structural parameters training under the latent structural SVM framework. In experiments, our method is tested on the PASCAL VOC 2007 and 2010 detection benchmarks of 20 object classes and outperforms comparable state-of-the-art methods.
4 0.90265101 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
Author: Byung-soo Kim, Shili Xu, Silvio Savarese
Abstract: In this paper we focus on the problem of detecting objects in 3D from RGB-D images. We propose a novel framework that explores the compatibility between segmentation hypotheses of the object in the image and the corresponding 3D map. Our framework allows to discover the optimal location of the object using a generalization of the structural latent SVM formulation in 3D as well as the definition of a new loss function defined over the 3D space in training. We evaluate our method using two existing RGB-D datasets. Extensive quantitative and qualitative experimental results show that our proposed approach outperforms state-of-the-art methods as well as a number of baseline approaches for both 3D and 2D object recognition tasks.
same-paper 5 0.89271641 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
Author: Hee Seok Lee, Kyoung Mu Lee
Abstract: In this paper, we propose a convex optimization framework for simultaneous estimation of a super-resolved depth map and images from a single moving camera. The pixel measurement error in 3D reconstruction is directly related to the resolution of the images at hand. In turn, even a small measurement error can cause significant errors in reconstructing 3D scene structure or camera pose. Therefore, enhancing image resolution can be an effective solution for securing the accuracy as well as the resolution of 3D reconstruction. In the proposed method, depth map estimation and image super-resolution are formulated in a single energy minimization framework with a convex function and solved efficiently by a first-order primal-dual algorithm. Explicit inter-frame pixel correspondences are not required for our super-resolution procedure, so we avoid a huge computation time and obtain a depth map improved in both accuracy and resolution, as well as high-resolution images, within reasonable time. The superiority of our algorithm is demonstrated by presenting the improved depth map accuracy, image super-resolution results, and camera pose estimation.
6 0.882029 359 cvpr-2013-Robust Discriminative Response Map Fitting with Constrained Local Models
8 0.86724013 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
9 0.86013913 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
10 0.85590529 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.85378015 461 cvpr-2013-Weakly Supervised Learning for Attribute Localization in Outdoor Scenes
12 0.8508383 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
13 0.84951365 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
14 0.84876746 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
15 0.84835881 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
16 0.84823453 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
17 0.84799588 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
18 0.84781551 290 cvpr-2013-Motion Estimation for Self-Driving Cars with a Generalized Camera
19 0.84762543 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation
20 0.84735477 286 cvpr-2013-Mirror Surface Reconstruction from a Single Image