cvpr cvpr2013 cvpr2013-181 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato
Abstract: In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. One of the problems of passive stereo method is the difficulty of finding correct correspondence between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. Two types of depth cues, defocus and disparity, are naturally integrated by the magnification and phase shift of a single point spread function (PSF) per camera. In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. We also show the outstanding performance of our method which has both advantages of two depth cues through simulation and actual experiments.
Reference: text
sentIndex sentText sentNum sentScore
1 Fusing Depth from Defocus and Stereo with Coded Apertures Yuichi Takeda Osaka University Shinsaku Hiura Hiroshima City University Kosuke Sato Osaka University Abstract In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. [sent-1, score-0.573]
2 Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. [sent-4, score-0.883]
3 In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. [sent-6, score-0.833]
4 There are, however, passive techniques to measure depth besides stereo imaging, such as depth from defocus (DFD), that utilize changes in the imaging characteristics of lenses depending on the distance to an object. [sent-16, score-0.962]
5 A problem with these techniques, however, is lower accuracy for distant objects, because the size of defocus (the diameter of the circle of confusion) is limited by the effective diameter of the lens. [sent-18, score-0.71]
6 Therefore, in this paper, we propose a novel depth measurement technique that combines stereo imaging and DFD. [sent-19, score-0.577]
7 In addition, coded apertures are incorporated to optimize the blurring phenomenon of lenses. [sent-22, score-0.535]
8 Related work Stereo imaging is a typical passive depth measurement technique that has often been studied and used. [sent-26, score-0.435]
9 In other words, defocus is regarded as an undesired phenomenon and the aperture of the lens should be stopped down for sufficiently deep depth of field. [sent-31, score-0.971]
10 On the other hand, techniques using the blurring caused by lenses have been proposed as a passive depth measurement method. [sent-32, score-0.481]
11 While depth from focusing physically performs focusing, DFD analyzes the amount of blurring in images taken using a fixed lens and has been actively studied[3, 5, 7]. [sent-33, score-0.481]
12 A technique that changes aperture geometry[4, 6, 12, 15] cannot image a dynamic scene. [sent-43, score-0.586]
13 Techniques to change the geometry of the lens aperture are closely related to a technique involving what are known as coded apertures[3, 5, 14, 16]. [sent-44, score-0.959]
14 proposed a technique to improve the accuracy of depth estimation in shallow depth of field by combining coded apertures with stereo imaging[13]. [sent-46, score-1.006]
15 In contrast, we propose a technique combining stereo imaging and cues of depth estimation based on the same principle as DFD. [sent-48, score-0.529]
16 Therefore, by their method, it is impossible to handle 2 images with both disparity and different focus distance. [sent-53, score-0.476]
17 Fusing DFD and stereo This section describes the proposed technique to create a depth map of a scene and blur-free (extended depth-of-field) images. [sent-59, score-0.454]
18 This technique combines the 3 elements of stereo imaging, DFD, and coded apertures via a single point spread function (PSF). [sent-60, score-0.639]
19 Like normal stereo imaging, 2 cameras are positioned in parallel, but the distance between the lens and image sensor is changed so that they differ for the 2 cameras. [sent-64, score-0.442]
20 In addition, each lens is equipped with a coded aperture mask that optimizes the geometry of blurred images of a point light source, i. [sent-65, score-1.02]
21 The geometry of the coded aperture mask used is described in Section 3. [sent-68, score-0.746]
22 Normal stereo imaging calculates a binocular disparity by comparing templates sliding on the image. [sent-71, score-0.896]
23 In contrast, our technique prepares a PSF, which includes both blurring and binocular disparity as values corresponding to the depth. [sent-72, score-0.833]
24 Consequently, the depth of a scene can be determined using information on both binocular disparity and defocus. [sent-77, score-0.846]
25 noted [11], a larger lens aperture diameter yields a more accurate depth estimate, just as a longer baseline does for a stereo camera. [sent-79, score-1.247]
26 However, increases in the numerical aperture are limited by lens design. [sent-80, score-0.625]
27 Focal length is determined by the size of the image sensor and angle of view of the scene being imaged, so the diameter of the lens aperture itself is limited by the size of the image sensor. [sent-81, score-0.925]
28 However, depth cannot be estimated with stereo imaging when there are no changes in image information in the direction of an epipolar line. [sent-83, score-0.541]
29 Typically, the accuracy of reconstruction of the original image increases with less blurring, regardless of whether or not a coded aperture is used. [sent-87, score-0.744]
30 This should increase the accuracy of reconstruction of the original image in comparison to stereo imaging with coded apertures[13]. [sent-90, score-0.549]
31 Expression of binocular disparity with a PSF Blurred images can be represented by a convolution of blur-free original image and PSF. [sent-93, score-0.62]
32 Images will also have the binocular disparity corresponding to the distance to an object. [sent-94, score-0.647]
33 $p_i^d(u, v)$ and $d$ represent the PSF and disparity of each camera with respect to depth. [sent-97, score-0.486]
34 When the origin of the disparity is set at the left image, the disparity is expressed by $d = d_R$ alone, with $d_L = 0$. [sent-98, score-0.892]
35 In other words, a reconstructed image when d is assumed to be the disparity will be estimated as $\hat{X}_d(\omega) = \dots$ [sent-121, score-0.432]
36 recreated based on this reconstructed image, and the disparity is determined in comparison to the input image, $\hat{d} = \arg\min_d \sum_i \dots$ [sent-125, score-0.486]
37 If the assumed disparity d differs from the actual disparity, reconstruction of the original image in Eq. [sent-133, score-0.472]
38 Moreover, error will appear in both the size of the PSF and disparity of the blurred image, so the residual in Eq. [sent-135, score-0.511]
39 As a result, the extent of disparity and blurring is handled in an integrated manner and the depth of the scene is estimated. [sent-137, score-0.789]
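The joint handling of disparity and blur described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a Wiener-style joint deconvolution of the two images, a single disparity for the whole image rather than a per-pixel map, circular boundary conditions, and a constant noise-to-signal term. The names `shifted_psf_otf`, `estimate_disparity`, `psf_bank`, and `noise_to_signal` are hypothetical.

```python
import numpy as np

def shifted_psf_otf(psf, shift, shape):
    """OTF of `psf` zero-padded to `shape`, centered at the origin and
    circularly shifted by `shift` pixels along the horizontal axis."""
    pad = np.zeros(shape)
    h, w = psf.shape
    pad[:h, :w] = psf
    pad = np.roll(pad, (-(h // 2), -(w // 2) + shift), axis=(0, 1))
    return np.fft.fft2(pad)

def estimate_disparity(y_left, y_right, psf_bank, noise_to_signal=1e-3):
    """Pick the candidate disparity whose PSF pair best explains both images.

    psf_bank maps each candidate disparity d to (psf_left, psf_right).
    The left image is the disparity origin, so only the right PSF is shifted.
    """
    Y = [np.fft.fft2(y_left), np.fft.fft2(y_right)]
    best_d, best_residual = None, np.inf
    for d, (p_left, p_right) in psf_bank.items():
        F = [shifted_psf_otf(p_left, 0, y_left.shape),
             shifted_psf_otf(p_right, d, y_right.shape)]
        # Wiener-style estimate of the sharp image under hypothesis d (assumed form)
        numerator = sum(Yi * np.conj(Fi) for Yi, Fi in zip(Y, F))
        denominator = sum(np.abs(Fi) ** 2 for Fi in F) + noise_to_signal
        X_hat = numerator / denominator
        # Re-blur the estimate and compare with both inputs
        residual = sum(np.sum(np.abs(Yi - X_hat * Fi) ** 2) for Yi, Fi in zip(Y, F))
        if residual < best_residual:
            best_d, best_residual = d, residual
    return best_d
```

A wrong hypothesis leaves an error in both the PSF size and the shift, so the residual grows on both counts. A per-pixel depth map would need the same comparison evaluated over local windows.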
40 An example of the relationship between disparity and circle of confusion. [sent-140, score-0.555]
41 Relationship between binocular disparity and the diameter of the circle of confusion As described in Section 3. [sent-146, score-1.027]
42 In contrast, our study determined depth using both the extent of blurring and disparity in stereo vision, so the relationship between the two must be identified by calibration. [sent-149, score-1.002]
43 Here, the relationship between the diameter of the circle of confusion and binocular disparity was derived and it was found to be linear even if the focused distances of 2 cameras are different. [sent-150, score-0.995]
44 Figure 3 depicts lenses with the same aperture diameter D and parallel optical axes in a stereo camera consisting of 2 cameras with differing focus distances. [sent-152, score-1.223]
45 Here, disparity is assumed to be 0 when the distance to the object is infinite. [sent-157, score-0.459]
46 The difference d between the disparity d1 of the in-focus point P1 and the disparity d2 of the out-of-focus point P2 can be determined by: $d = d_2 - d_1 = \frac{b_2 - b_1}{b_2}\, l$ [sent-158, score-1.326]
47 When point P2 on the object plane p2 is imaged by the image sensor on plane q1, the diameter of the circle of confusion c is determined by $c = \frac{b_2 - b_1}{b_2} D$ (9) using lens aperture diameter D. [sent-161, score-1.346]
48 (7), the ratio of disparity d and the diameter of the circle of confusion c when blurring occurs during focusing is: $\frac{d}{c} = \frac{\frac{b_2 - b_1}{b_2}\, l}{\frac{b_2 - b_1}{b_2}\, D} = \frac{l}{D}$ [sent-163, score-1.013]
49 Accordingly, the diameter of the circle of confusion c and disparity d are proportional. [sent-166, score-0.839]
50 The ratio of the two is the ratio of the lens aperture diameter D to the baseline length l. [sent-167, score-0.942]
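The proportionality can be checked numerically. The sketch below assumes a thin-lens model (1/a + 1/b = 1/f), a sensor placed at the image distance b1 of the in-focus depth a1, and disparity measured on that sensor with zero disparity at infinity. The function names and the example values are hypothetical.

```python
def image_distance(f, a):
    """Thin-lens law 1/a + 1/b = 1/f, solved for the image distance b."""
    return 1.0 / (1.0 / f - 1.0 / a)

def disparity_and_blur(f, D, l, a1, a2):
    """Disparity difference d and circle-of-confusion diameter c for a point
    at depth a2 when the camera is focused at depth a1.

    f: focal length, D: aperture diameter, l: baseline (all in the same unit).
    """
    b1 = image_distance(f, a1)  # sensor position (focused on a1)
    b2 = image_distance(f, a2)  # where the point at a2 would come into focus
    d = l * b1 / a2 - l * b1 / a1  # d2 - d1, with zero disparity at infinity
    c = (b2 - b1) / b2 * D         # signed diameter of the circle of confusion
    return d, c
```

For example, with f = 0.05, D = 0.025, l = 0.075, a1 = 2 and a2 = 1 (metres), the ratio d / c equals l / D = 3 up to rounding, independent of the depths chosen.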
51 Typically, in a normal stereo camera with parallel optical axes, the disparity on a plane perpendicular to the optical axis is constant. [sent-170, score-0.888]
52 As a result, the disparity is no longer constant. [sent-172, score-0.432]
53 Therefore, image IL provides an image $\frac{b_2}{b_1}$-fold larger than image IR when blurring due to the finite lens aperture diameter D is ignored. [sent-175, score-0.976]
54 This step results in disparity with respect to a plane perpendicular to the optical axis that is constant, regardless of the distance to the object. [sent-178, score-0.606]
55 Even if the object is not on the optical axis of the left camera IL, the aforementioned linear relationship between disparity and the diameter of the circle of confusion is kept. [sent-179, score-1.012]
56 Also apparent is the fact that the linear relationship between disparity and the diameter of the circle of confusion is retained for the scaled image from the left camera IL as well. [sent-180, score-0.951]
57 Conversion from the binocular disparity to the diameter of the circle of confusion is rather easy using the relationship described thus far. [sent-182, score-1.057]
58 The binocular disparity and size of the image of the spot are both determined, and then the relationship between binocular disparity and the diameter of the circle of confusion can be determined at other distances using linear interpolation. [sent-184, score-1.753]
59 Measurements of the binocular disparity and diameter of the circle of confusion when the bright spot was actually placed at various distances are shown in Figure 4. [sent-186, score-1.091]
60 The lens is almost focused to infinity, so when binocular disparity is 0 the diameter of the circle of confusion will also be 0. [sent-189, score-1.208]
61 The graph shows that the binocular disparity and diameter of the circle of confusion can easily be converted back and forth. [sent-190, score-1.046]
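A calibration of this kind might be coded as follows. This is only a sketch: it swaps the two-measurement linear interpolation for a least-squares line fit over several bright-spot measurements, and it assumes the blur diameter is recorded as a signed value so that the relation stays linear through zero. All function names are hypothetical.

```python
import numpy as np

def fit_disparity_to_blur(disparities, diameters):
    """Least-squares line through (disparity, blur diameter) measurements."""
    slope, intercept = np.polyfit(disparities, diameters, 1)
    return slope, intercept

def blur_from_disparity(d, slope, intercept):
    """Convert a disparity to a circle-of-confusion diameter."""
    return slope * d + intercept

def disparity_from_blur(c, slope, intercept):
    """Convert a circle-of-confusion diameter back to a disparity."""
    return (c - intercept) / slope
```

With a lens focused near infinity the intercept should come out close to 0, and the slope should approximate the ratio of aperture diameter to baseline.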
62 Shape of the aperture The PSF for an image taken by a camera with coded aperture represents scaling of the aperture geometry in accordance with the depth. [sent-193, score-1.739]
63 As indicated in a previous section, the scale of the PSF changes with binocular disparity in a linear fashion. [sent-194, score-0.654]
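Because the PSF is a scaled copy of the aperture pattern, a bank of candidate PSFs can be generated directly from the mask. The sketch below assumes a binary mask, nearest-neighbour scaling by integer factors, and a known blur diameter (in pixels) per pixel of disparity. The names `psf_from_mask`, `build_psf_bank`, and `blur_per_disparity` are hypothetical.

```python
import numpy as np

def psf_from_mask(mask, scale):
    """Upscale a binary aperture mask by an integer factor and normalize
    it to unit sum so that it preserves image brightness."""
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    psf = np.kron(mask, np.ones((scale, scale)))
    return psf / psf.sum()

def build_psf_bank(mask, disparities, blur_per_disparity):
    """Map each candidate disparity to a PSF whose size grows linearly
    with the disparity."""
    bank = {}
    for d in disparities:
        diameter = abs(blur_per_disparity * d)  # blur diameter in pixels
        # Clamp to the mask's native size; a simplification near perfect focus
        scale = max(1, int(round(diameter / mask.shape[0])))
        bank[d] = psf_from_mask(mask, scale)
    return bank
```

A finer bank would need sub-pixel resampling of the mask, which this sketch leaves out.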
64 Simulation experiment The depth estimation technique proposed in this paper combines coded stereo imaging and DFD. [sent-207, score-0.782]
65 This section describes the results of a simulation that compared this technique to stereo method with coded apertures[13], DFD with different focus distance and a single coded aperture[3], and DFD with coded aperture pairs[15]. [sent-208, score-1.599]
66 The horizontal striped pattern had only edges parallel to the epipolar line of the stereo rig, but it was deliberately used to assess the performance of the defocus depth cue. [sent-218, score-0.704]
67 Thus, the shape of the lens aperture was first specified and then images seen from various viewpoints in the aperture were created. [sent-224, score-1.121]
68 Next, images multiplied by the transmittance {0, 1} of the mask at individual points on the aperture were averaged to provide a blurred image. [sent-225, score-0.572]
69 Images were translated so that the disparity at the focus distance would be 0, allowing adjustment of the focus distance. [sent-226, score-0.547]
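The averaging step can be sketched for the simplest case of a single fronto-parallel plane, where each viewpoint inside the aperture sees the same image translated in proportion to its offset from the aperture center. This is a simplified stand-in for rendering each viewpoint, with circular shifts at the image border. The names `render_defocus` and `shift_per_cell` are hypothetical.

```python
import numpy as np

def render_defocus(image, mask, shift_per_cell):
    """Average translated copies of `image`, one per open cell of the
    binary aperture `mask` (transmittance 0 or 1).

    shift_per_cell is the image shift in pixels between neighbouring
    aperture cells; 0 means the plane is in focus.
    """
    rows, cols = mask.shape
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0
    acc = np.zeros(image.shape, dtype=float)
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] == 0:
                continue  # blocked part of the aperture contributes nothing
            dy = int(round((r - cy) * shift_per_cell))
            dx = int(round((c - cx) * shift_per_cell))
            acc += np.roll(image, (dy, dx), axis=(0, 1))
    return acc / mask.sum()
```

Adding a common horizontal offset to every shift would play the role of the translation that sets the focus distance.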
70 (c) Two aperture shapes used in Technique 4 : Coded Aperture Pair. [sent-238, score-0.471]
71 Plane A is chosen to minimize the amount of defocus both in front of and behind the plane, because the depth of field behind a focused plane is deeper than in front of it. [sent-244, score-0.481]
72 Detailed settings for each measurement method are as follows: Technique 1: Proposed method The same coded aperture was used for the left and right cameras, but the focus distance for each lens differed. [sent-246, score-1.045]
73 To incorporate the disparity depth cue, the viewpoint was shifted horizontally by the baseline length. [sent-248, score-0.637]
74 Technique 2: Stereo imaging with coded apertures[13] The same coded aperture was used for the left and right cameras, and the focus distance was also the same; both were focused on plane A. [sent-250, score-1.287]
75 The shape of the coded aperture and the viewpoint were the same for the 2 images, and the focus distance was set to plane B for input image 1 and to plane A for input image 2, just as in Technique 1. [sent-254, score-0.973]
76 Technique 4: Coded aperture pair[15] Distance measurement according to Zhou et al. [sent-255, score-0.519]
77 Disparity ranged up to 60 pixels because of the baseline length specified, and this coincides with the disparity search range. [sent-261, score-0.492]
78 The aperture diameter was 1/3 of the baseline length. [sent-262, score-0.723]
79 Results Two input images and an output disparity map (cropped area within the red box) for each technique are shown in Figures 8-11 when the scene texture was a horizontal stripe pattern. [sent-265, score-0.591]
80 Although DFD and aperture pair do not have disparity, we converted the estimated depth to a corresponding disparity value for comparison. [sent-266, score-1.094]
81 The averages of absolute error of disparity using 4 methods and 3 textures are summarized in Figure 12. [sent-268, score-0.455]
82 With the checkerboard pattern and dappled pattern, depth was estimated with higher accuracy using the proposed technique and coded aperture stereo than when using DFD and aperture pair. [sent-270, score-1.755]
83 This is because a long baseline in stereo imaging helps to increase the resolution of depth estimation. [sent-271, score-0.481]
84 With a horizontal striped pattern, the aperture pair technique of [Figure 8: (a) Left input, (b) Right input, (c) Disparity; further panels: (a) Input 1, (b) Input 2, (c) Disparity] [sent-275, score-0.617]
85 This is because the aperture shapes (Figure 7(c)) are decentered to the top or bottom of the aperture, so the images obtained are similar to images captured at the center of gravity of the aperture opening. [sent-287, score-0.942]
86 The shape of the coded aperture was Zhou’s code of σ = 0. [sent-307, score-0.748]
87 The setting of aperture blade of the lens was F=2. [sent-309, score-0.625]
88 0 (fully open) to prevent vignetting, and we confirmed that no vignetting of the coded aperture occurs at the edges of the image. [sent-310, score-0.769]
89 The relationship between the extent of disparity and PSF was calibrated using the method in Section 3. [sent-311, score-0.491]
90 The coded aperture in place on the lens and the system that was actually used are both shown in Figure 13. [sent-316, score-0.878]
91 The disparity map and an image that has been deblurred (blur-free) are shown in Figure 15. [sent-320, score-0.46]
92 The disparity map shows that depth was estimated correctly for most regions with evident texture. [sent-322, score-0.604]
93 Conclusion This paper has proposed a technique combining stereo imaging and DFD with coded apertures. [sent-328, score-0.61]
94 This is achieved by utilizing 2 cameras with different focus distances like a stereo camera and by expressing binocular disparity and defocus as a single PSF. [sent-329, score-1.177]
95 Experimental results indicated that the technique was able to determine the distance to an object with a texture parallel to an epipolar line, which is something stereo imaging alone could not accomplish. [sent-330, score-0.484]
96 Other topics for the future are utilizing different aperture shapes for both lenses in the device proposed in this manuscript and performing experiments using 3 or more cameras. [sent-336, score-0.547]
97 A new approach for estimating depth by fusing stereo and defocus information. [sent-345, score-0.53]
98 Image and depth from a conventional camera with a coded aperture. [sent-372, score-0.479]
99 Coded aperture stereo - for extension of depth of field and refocusing. [sent-434, score-0.82]
100 Dappled photography: mask enhanced cameras for heterodyned light fields and coded aperture refocusing. [sent-444, score-0.871]
wordName wordTfidf (topN-words)
[('aperture', 0.471), ('disparity', 0.432), ('dfd', 0.318), ('coded', 0.253), ('diameter', 0.219), ('binocular', 0.188), ('stereo', 0.177), ('depth', 0.172), ('lens', 0.154), ('defocus', 0.152), ('psf', 0.14), ('blurring', 0.132), ('apertures', 0.128), ('imaging', 0.099), ('confusion', 0.095), ('circle', 0.093), ('pid', 0.09), ('cameras', 0.084), ('technique', 0.081), ('blurred', 0.079), ('lenses', 0.076), ('plane', 0.065), ('dappled', 0.062), ('epipolar', 0.059), ('fid', 0.055), ('wiener', 0.055), ('camera', 0.054), ('zhou', 0.051), ('checkerboard', 0.049), ('measurement', 0.048), ('distances', 0.046), ('focus', 0.044), ('takeda', 0.042), ('hiura', 0.042), ('light', 0.041), ('parallel', 0.041), ('simulation', 0.04), ('optical', 0.037), ('magnification', 0.036), ('deblurring', 0.035), ('passive', 0.035), ('striped', 0.035), ('changes', 0.034), ('fourier', 0.033), ('baseline', 0.033), ('customization', 0.031), ('gheta', 0.031), ('osaka', 0.031), ('determined', 0.03), ('relationship', 0.03), ('focal', 0.03), ('horizontal', 0.03), ('fusing', 0.029), ('extent', 0.029), ('il', 0.029), ('box', 0.028), ('left', 0.028), ('deblurred', 0.028), ('otf', 0.028), ('equipment', 0.028), ('opengl', 0.028), ('distance', 0.027), ('length', 0.027), ('distant', 0.027), ('focused', 0.027), ('photography', 0.026), ('vignetting', 0.026), ('rajagopalan', 0.026), ('deconvolution', 0.025), ('viewpoints', 0.025), ('mp', 0.025), ('axis', 0.024), ('scene', 0.024), ('code', 0.024), ('schechner', 0.024), ('input', 0.024), ('creation', 0.024), ('repetitive', 0.024), ('focusing', 0.023), ('shallow', 0.023), ('textures', 0.023), ('phenomenon', 0.022), ('mask', 0.022), ('perpendicular', 0.021), ('dl', 0.021), ('featured', 0.021), ('reconstruction', 0.02), ('actual', 0.02), ('right', 0.02), ('axes', 0.02), ('ir', 0.02), ('converted', 0.019), ('ratio', 0.019), ('accordance', 0.019), ('pattern', 0.019), ('edges', 0.019), ('planes', 0.019), ('correspondence', 0.018), ('techniques', 0.018), ('bright', 
0.018)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
2 0.34905493 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
Author: Chen Li, Shuochen Su, Yasuyuki Matsushita, Kun Zhou, Stephen Lin
Abstract: We present a method that enhances the performance of depth-from-defocus (DFD) through the use of shading information. DFD suffers from important limitations, namely coarse shape reconstruction and poor accuracy on textureless surfaces, that can be overcome with the help of shading. We integrate both forms of data within a Bayesian framework that capitalizes on their relative strengths. Shading data, however, is challenging to recover accurately from surfaces that contain texture. To address this issue, we propose an iterative technique that utilizes depth information to improve shading estimation, which in turn is used to elevate depth estimation in the presence of textures. With this approach, we demonstrate improvements over existing DFD techniques, as well as effective shape reconstruction of textureless surfaces.
3 0.32160082 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields
Author: Bastian Goldluecke, Sven Wanner
Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.
4 0.27977824 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim to improve accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census dataterm, which is the best performing realtime capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
5 0.20858274 155 cvpr-2013-Exploiting the Power of Stereo Confidences
Author: David Pfeiffer, Stefan Gehrig, Nicolai Schneider
Abstract: Applications based on stereo vision are becoming increasingly common, ranging from gaming over robotics to driver assistance. While stereo algorithms have been investigated heavily both on the pixel and the application level, far less attention has been dedicated to the use of stereo confidence cues. Mostly, a threshold is applied to the confidence values for further processing, which is essentially a sparsified disparity map. This is straightforward but it does not take full advantage of the available information. In this paper, we make full use of the stereo confidence cues by propagating all confidence values along with the measured disparities in a Bayesian manner. Before using this information, a mapping from confidence values to disparity outlier probability rate is performed based on gathered disparity statistics from labeled video data. We present an extension of the so called Stixel World, a generic 3D intermediate representation that can serve as input for many of the applications mentioned above. This scheme is modified to directly exploit stereo confidence cues in the underlying sensor model during a maximum a posteriori estimation process. The effectiveness of this step is verified in an in-depth evaluation on a large real-world traffic data base of which parts are made publicly available. We show that using stereo confidence cues allows both reducing the number of false object detections by a factor of six while keeping the detection rate at a near constant level.
6 0.19704217 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching
7 0.16658995 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras
8 0.15993051 219 cvpr-2013-In Defense of 3D-Label Stereo
9 0.1583268 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
10 0.13617021 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
11 0.1349574 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display
12 0.13062534 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
13 0.11562056 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation
15 0.11014871 108 cvpr-2013-Dense 3D Reconstruction from Severely Blurred Images Using a Single Moving Camera
16 0.10959827 307 cvpr-2013-Non-uniform Motion Deblurring for Bilayer Scenes
17 0.10778977 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
19 0.10012787 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
20 0.099748522 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
topicId topicWeight
[(0, 0.139), (1, 0.248), (2, 0.026), (3, 0.073), (4, -0.044), (5, -0.001), (6, -0.064), (7, 0.078), (8, 0.035), (9, 0.063), (10, -0.053), (11, 0.026), (12, 0.205), (13, 0.075), (14, -0.188), (15, 0.095), (16, -0.187), (17, -0.101), (18, 0.042), (19, -0.033), (20, -0.078), (21, 0.089), (22, 0.188), (23, 0.1), (24, -0.024), (25, 0.004), (26, 0.021), (27, -0.063), (28, 0.072), (29, 0.017), (30, 0.096), (31, 0.016), (32, -0.089), (33, 0.012), (34, -0.012), (35, -0.016), (36, -0.04), (37, 0.034), (38, -0.03), (39, -0.005), (40, 0.055), (41, -0.024), (42, 0.042), (43, 0.011), (44, -0.052), (45, -0.056), (46, 0.03), (47, -0.006), (48, -0.028), (49, -0.073)]
simIndex simValue paperId paperTitle
same-paper 1 0.93932074 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
2 0.78761506 155 cvpr-2013-Exploiting the Power of Stereo Confidences
Author: David Pfeiffer, Stefan Gehrig, Nicolai Schneider
Abstract: Applications based on stereo vision are becoming increasingly common, ranging from gaming over robotics to driver assistance. While stereo algorithms have been investigated heavily both on the pixel and the application level, far less attention has been dedicated to the use of stereo confidence cues. Mostly, a threshold is applied to the confidence values for further processing, which is essentially a sparsified disparity map. This is straightforward but it does not take full advantage of the available information. In this paper, we make full use of the stereo confidence cues by propagating all confidence values along with the measured disparities in a Bayesian manner. Before using this information, a mapping from confidence values to disparity outlier probability rate is performed based on gathered disparity statistics from labeled video data. We present an extension of the so called Stixel World, a generic 3D intermediate representation that can serve as input for many of the applications mentioned above. This scheme is modified to directly exploit stereo confidence cues in the underlying sensor model during a maximum a posteriori estimation process. The effectiveness of this step is verified in an in-depth evaluation on a large real-world traffic data base of which parts are made publicly available. We show that using stereo confidence cues allows both reducing the number of false object detections by a factor of six while keeping the detection rate at a near constant level.
3 0.77090144 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields
Author: Bastian Goldluecke, Sven Wanner
Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.
4 0.7679587 219 cvpr-2013-In Defense of 3D-Label Stereo
Author: Carl Olsson, Johannes Ulén, Yuri Boykov
Abstract: It is commonly believed that higher order smoothness should be modeled using higher order interactions. For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e.g. tangent planes. This general paradigm has been criticized due to perceived computational complexity of optimization in higher-dimensional label space. Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. Our theoretical and experimental results demonstrate advantages over state-of-the-art methods for 2nd order smoothness stereo.
5 0.76486182 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim to improve accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census dataterm, which is the best performing realtime capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
6 0.69588393 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching
7 0.63671213 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
9 0.49888501 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns
10 0.49795738 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints
11 0.47260672 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
12 0.44935149 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation
13 0.44260836 283 cvpr-2013-Megastereo: Constructing High-Resolution Stereo Panoramas
14 0.42622757 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
15 0.42478168 279 cvpr-2013-Manhattan Scene Understanding via XSlit Imaging
16 0.41232988 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display
17 0.40761068 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image
18 0.40392199 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs
19 0.40336782 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
20 0.40069616 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera
topicId topicWeight
[(10, 0.145), (16, 0.029), (26, 0.035), (33, 0.186), (67, 0.04), (69, 0.068), (87, 0.1), (96, 0.016), (99, 0.281)]
simIndex simValue paperId paperTitle
same-paper 1 0.75114387 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato
Abstract: In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. One of the problems of passive stereo method is the difficulty of finding correct correspondence between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. Two types of depth cues, defocus and disparity, are naturally integrated by the magnification and phase shift of a single point spread function (PSF) per camera. In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. We also show the outstanding performance of our method which has both advantages of two depth cues through simulation and actual experiments.
Author: Won Hwa Kim, Moo K. Chung, Vikas Singh
Abstract: The analysis of 3-D shape meshes is a fundamental problem in computer vision, graphics, and medical imaging. Frequently, the needs of the application require that our analysis take a multi-resolution view of the shape’s local and global topology, and that the solution is consistent across multiple scales. Unfortunately, the preferred mathematical construct which offers this behavior in classical image/signal processing, Wavelets, is no longer applicable in this general setting (data with non-uniform topology). In particular, the traditional definition does not allow writing out an expansion for graphs that do not correspond to the uniformly sampled lattice (e.g., images). In this paper, we adapt recent results in harmonic analysis, to derive Non-Euclidean Wavelets based algorithms for a range of shape analysis problems in vision and medical imaging. We show how descriptors derived from the dual domain representation offer native multi-resolution behavior for characterizing local/global topology around vertices. With only minor modifications, the framework yields a method for extracting interest/key points from shapes, a surprisingly simple algorithm for 3-D shape segmentation (competitive with state of the art), and a method for surface alignment (without landmarks). We give an extensive set of comparison results on a large shape segmentation benchmark and derive a uniqueness theorem for the surface alignment problem.
3 0.70128512 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
Author: Zhong Zhang, Chunheng Wang, Baihua Xiao, Wen Zhou, Shuang Liu, Cunzhao Shi
Abstract: In this paper, we propose a novel method for cross-view action recognition via a continuous virtual path which connects the source view and the target view. Each point on this virtual path is a virtual view which is obtained by a linear transformation of the action descriptor. All the virtual views are concatenated into an infinite-dimensional feature to characterize continuous changes from the source to the target view. However, these infinite-dimensional features cannot be used directly. Thus, we propose a virtual view kernel to compute the value of similarity between two infinite-dimensional features, which can be readily used to construct any kernelized classifiers. In addition, there are a lot of unlabeled samples from the target view, which can be utilized to improve the performance of classifiers. Thus, we present a constraint strategy to explore the information contained in the unlabeled samples. The rationale behind the constraint is that any action video belongs to only one class. Our method is verified on the IXMAS dataset, and the experimental results demonstrate that our method achieves better performance than the state-of-the-art methods.
4 0.68975753 73 cvpr-2013-Bringing Semantics into Focus Using Visual Abstraction
Author: C. Lawrence Zitnick, Devi Parikh
Abstract: Relating visual information to its linguistic semantic meaning remains an open and challenging area of research. The semantic meaning of images depends on the presence of objects, their attributes and their relations to other objects. But precisely characterizing this dependence requires extracting complex visual information from an image, which is in general a difficult and yet unsolved problem. In this paper, we propose studying semantic information in abstract images created from collections of clip art. Abstract images provide several advantages. They allow for the direct study of how to infer high-level semantic information, since they remove the reliance on noisy low-level object, attribute and relation detectors, or the tedious hand-labeling of images. Importantly, abstract images also allow the ability to generate sets of semantically similar scenes. Finding analogous sets of semantically similar real images would be nearly impossible. We create 1,002 sets of 10 semantically similar abstract scenes with corresponding written descriptions. We thoroughly analyze this dataset to discover semantically important features, the relations of words to visual features and methods for measuring semantic similarity.
5 0.67435247 130 cvpr-2013-Discriminative Color Descriptors
Author: Rahat Khan, Joost van de Weijer, Fahad Shahbaz Khan, Damien Muselet, Christophe Ducottet, Cecile Barat
Abstract: Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.
6 0.66000003 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
7 0.65746933 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
8 0.65385765 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems
9 0.65363079 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
11 0.65262145 126 cvpr-2013-Diffusion Processes for Retrieval Revisited
12 0.65220773 298 cvpr-2013-Multi-scale Curve Detection on Surfaces
13 0.65203679 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
14 0.65182805 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments
15 0.65042388 414 cvpr-2013-Structure Preserving Object Tracking
16 0.64946699 279 cvpr-2013-Manhattan Scene Understanding via XSlit Imaging
17 0.6493929 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking
18 0.64738333 393 cvpr-2013-Separating Signal from Noise Using Patch Recurrence across Scales
19 0.64711189 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning
20 0.64710379 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models