cvpr cvpr2013 cvpr2013-219 knowledge-graph by maker-knowledge-mining

219 cvpr-2013-In Defense of 3D-Label Stereo

Source: pdf

Author: Carl Olsson, Johannes Ulén, Yuri Boykov

Abstract: It is commonly believed that higher order smoothness should be modeled using higher order interactions. For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e.g. tangent planes. This general paradigm has been criticized due to perceived computational complexity of optimization in higher-dimensional label space. Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. Our theoretical and experimental results demonstrate advantages over state-of-the-art methods for 2nd order smoothness stereo.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 It is commonly believed that higher order smoothness should be modeled using higher order interactions. [sent-5, score-0.16]

2 For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. [sent-6, score-0.161]

3 Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. [sent-7, score-0.557]

4 In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e. [sent-8, score-0.399]

5 Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. [sent-12, score-0.429]

6 One reason for their popularity is that when applying move-making algorithms such as α-expansion [4] or fusion moves [8] they often result in submodular moves, allowing efficient computation using min-cut/max-flow algorithms [4]. [sent-18, score-0.333]

7 Many basic optimization methods for stereo use scalar (1D) disparity labels. [sent-19, score-0.369]

8 regularization potentials assign higher cost to surfaces with larger tilt [4]. [sent-27, score-0.165]

9 To address more general scenes our paper follows the popular trend of using 2nd derivative surface regularization for stereo [9, 15]. [sent-29, score-0.308]

10 Woodford et al. [15] retain the scalar disparity labels while using triple-cliques to penalize 2nd derivatives of the reconstructed surface. [sent-32, score-0.384]

11 In contrast, Li and Zucker [9] use 3D-labels corresponding to tangent planes to encode 2nd order disparity map smoothness as pairwise interactions. [sent-35, score-0.776]

12 The first term computes the difference of the disparity assignment, and the disparity predicted by the neighboring tangent plane. [sent-38, score-0.817]

13 ) The second term penalizes the (squared) angular difference between neighboring tangent plane normals. [sent-40, score-0.459]

14 [15] that only use a disparity estimate at each pixel, the approach by Li and Zucker [9] requires discretization of a much larger label space. [sent-44, score-0.243]

15 As shown in [15], this specific approach results in disparity maps that are inferior to those of Woodford et al. [sent-46, score-0.243]

16 The discussed limitations of [9] may have helped to promote the general perception of triple interactions of scalar disparity labels as a superior approach for modeling 2nd order smoothness. [sent-47, score-0.443]

17 This makes it possible to precompute and penalize 2nd order smoothness via a unary term. [sent-50, score-0.203]

18 There is however no smoothness interaction between models and therefore it is not possible to combine local models into global ones. [sent-51, score-0.15]

19 In this paper we propose a new 3D-label stereo algorithm encoding 2nd order smoothness of the disparity map with pairwise interactions. [sent-53, score-0.529]

20 We show how to properly measure 2nd derivative of the reconstructed surface using pairwise cliques when the labels are tangent planes. [sent-55, score-0.498]

21 Instead of using a fixed set of locally precomputed tangents, we adaptively generate new surface proposals based on the current surface estimate. [sent-56, score-0.316]

22 [15] we show that our formulation is submodular when using planar proposals, and verify experimentally that Roof duality [11] labels much more pixels for general proposals with our formulation. [sent-59, score-0.378]

23 We show that the use of even higher order labels (that encode higher order derivatives) further extends the class of submodular functions. [sent-61, score-0.222]

24 In addition we present a version of our method that works with depth rather than disparity and, therefore, does not require rectified cameras. [sent-62, score-0.43]

25 The possibility to decrease the energy for each fusion move is an attractive feature, however there is no guarantee on how many variables will be labeled in each fusion move. [sent-91, score-0.317]

26 For submodular fusion moves we are guaranteed to label all variables. [sent-95, score-0.333]

27 A Second Order Smoothness Prior for Multi-View Stereo. In this section we present a second order smoothness prior for dense multi-view stereo reconstruction. [sent-98, score-0.223]

28 To each viewing ray we will assign a plane that locally represents the surface geometry close to the ray. [sent-101, score-0.334]

29 The intersection between the ray and the plane will be the estimated 3D point. [sent-102, score-0.165]

30 By interpreting the planes as tangents of the viewed 3D surface we can encourage smooth solutions by penalizing neighboring 3D-points that deviate largely from neighboring tangents. [sent-103, score-0.485]

31 Rectified Cameras and Disparity Maps. We will start by assuming that the cameras have been rectified, since this allows us to work in disparity space. [sent-106, score-0.279]

32 The goal is to estimate the function D : I → R that gives a disparity value for each pixel in the image. [sent-111, score-0.264]

33 To each pixel p we will assign a tangent plane that locally approximates this function. [sent-112, score-0.401]

34 We can think of these tangents as samples of the disparity function and its derivatives. [sent-113, score-0.368]

35 By the function TpD : I → R we will mean the tangent as a function [sent-114, score-0.243]

36 on the whole image, that is TpD(x) = D(p) + ∇D(p)^T (x − p), (4) where D(p) and ∇D(p) are the assigned disparity and disparity gradient (with respect to the image coordinate system) at pixel p. [sent-116, score-0.507]

37 We define a pairwise interaction between neighboring pixels as Vpq = |TpD(q) − D(q)|. (5) That is, Vpq measures the curve’s deviation from the tangent plane, see Figure 1. [sent-117, score-0.384]
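The tangent function and the pairwise interaction above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the example surfaces, and the pixel coordinates are assumptions chosen for the example.

```python
import numpy as np

def tangent_value(d_p, grad_p, p, x):
    """Evaluate T_p D(x) = D(p) + grad D(p)^T (x - p)."""
    offset = np.asarray(x, dtype=float) - np.asarray(p, dtype=float)
    return d_p + float(np.dot(grad_p, offset))

def pairwise_interaction(d_p, grad_p, p, d_q, q):
    """V_pq = |T_p D(q) - D(q)|: deviation of D(q) from the tangent at p."""
    return abs(tangent_value(d_p, grad_p, p, q) - d_q)

p, q = (3, 4), (4, 4)  # hypothetical neighboring pixels

# Hypothetical planar disparity D(x, y) = 2 + 0.5 x - 0.25 y
plane = lambda x, y: 2.0 + 0.5 * x - 0.25 * y
v_plane = pairwise_interaction(plane(*p), np.array([0.5, -0.25]), p, plane(*q), q)

# Hypothetical curved disparity D(x, y) = 0.1 x^2 with gradient (0.2 x, 0)
curved = lambda x, y: 0.1 * x ** 2
grad_curved = np.array([0.2 * p[0], 0.0])
v_curved = pairwise_interaction(curved(*p), grad_curved, p, curved(*q), q)
```

For the planar surface the interaction vanishes, while the curved surface is penalized, which is the behavior one would expect from a second order prior.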

38 Intuitively, if the surface is smooth then ... for parallel viewing rays. [sent-118, score-0.194]

39 (7) That is, Vpq measures the second derivative at p in direction q − p of the underlying disparity function. [sent-123, score-0.332]
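The link between the tangent deviation and the second derivative can be made explicit with a standard second order Taylor expansion. The derivation below assumes D is twice differentiable around p and is offered as a sketch of the reasoning; it is not necessarily the exact form of the paper's equation (7).

```latex
\begin{aligned}
D(q) &= D(p) + \nabla D(p)^T (q-p)
        + \tfrac{1}{2}\,(q-p)^T \nabla^2 D(p)\,(q-p) + O(\|q-p\|^3),\\
V_{pq} &= \bigl|T_p D(q) - D(q)\bigr|
        \approx \tfrac{1}{2}\,\bigl|(q-p)^T \nabla^2 D(p)\,(q-p)\bigr| .
\end{aligned}
```

The zeroth and first order terms cancel against the tangent, so what remains is, to leading order, the second directional derivative along q − p.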

40 However directly penalizing 2nd derivative of the depth function is not a good idea. [sent-128, score-0.228]

41 In general the projection of a plane will not yield a linear depth function unless the camera is affine (which can be seen from (11) below). [sent-129, score-0.213]
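The claim about non-linear depth can be checked numerically. The sketch below assumes a perspective camera with normalized image coordinates, so that a 3D point on the ray through (x, y) is z·(x, y, 1); the plane parameters are hypothetical. It compares second differences of depth and of inverse depth (to which disparity is proportional for rectified cameras) along a scanline.

```python
import numpy as np

def plane_depth(n, a, x, y):
    """Depth z of the plane n^T X + a = 0 along the ray X = z * (x, y, 1)."""
    return -a / (n[0] * x + n[1] * y + n[2])

n = np.array([0.3, 0.1, 1.0])   # hypothetical plane normal (not normalized)
a = -5.0                        # hypothetical plane offset
xs = np.linspace(-0.5, 0.5, 11) # normalized x coordinates along one scanline

depth = plane_depth(n, a, xs, 0.0)
inv_depth = 1.0 / depth         # proportional to disparity for rectified cameras

def second_diff(f):
    """Discrete second derivative along the scanline."""
    return f[2:] - 2.0 * f[1:-1] + f[:-2]

depth_curvature = second_diff(depth)          # non-zero: depth is not linear
inv_depth_curvature = second_diff(inv_depth)  # numerically zero: linear in x
```

Under these assumptions a plane has vanishing second differences in inverse depth but not in depth, which is why a naive 2nd derivative penalty on depth would penalize planes.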

42 Therefore we will instead measure the deviation from the tangent plane along the viewing ray. [sent-131, score-0.417]

43 = 1 and ap ∈ R we denote the tangent plane at p given by the equation np^T x + ap = 0. [sent-138, score-0.349]

44 between the viewing ray at q and the tangent plane at p. [sent-140, score-0.476]

45 We let Tpd : I → R+ be the depth function of the tangent plane at point p, that is, [sent-141, score-0.456]

46 We can calculate the tangent function using Tpd(q) = −ap / (np^T qh). [sent-143, score-0.243]

47 (11) (Here we are assuming that the viewing ray is not completely contained in the tangent plane. [sent-144, score-0.37]

48 In contrast disparity is inversely proportional to depth and will therefore be linear. [sent-146, score-0.35]

49 Given the estimated tangent plane at p and the depth at q the interaction computes the distance between the estimated 3D point and the tangent plane along the viewing ray. [sent-154, score-0.853]
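A minimal sketch of this depth-based interaction follows. It assumes the viewing ray at q is parameterized as x = d·qh, with qh the homogeneous (ray direction) coordinates of q, and that the tangent plane at p satisfies np^T x + ap = 0, so the ray meets the plane at d = −ap / (np^T qh). The function names and numbers are illustrative assumptions.

```python
import numpy as np

def tangent_depth(n_p, a_p, q_h):
    """Depth d at which the ray x = d * q_h meets the plane n_p^T x + a_p = 0."""
    denom = float(np.dot(n_p, q_h))
    if abs(denom) < 1e-12:
        # The ray lies in (or is parallel to) the plane: no unique intersection.
        raise ValueError("viewing ray is parallel to the tangent plane")
    return -a_p / denom

def ray_interaction(n_p, a_p, q_h, d_q):
    """|T_p d(q) - d(q)|: gap between plane and estimated point along the ray.

    The gap is measured in the ray parameter d; multiply by ||q_h|| to get a
    Euclidean distance when q_h is not a unit vector.
    """
    return abs(tangent_depth(n_p, a_p, q_h) - d_q)

n_p = np.array([0.0, 0.0, 1.0])  # hypothetical tangent plane z = 4 at pixel p
a_p = -4.0
q_h = np.array([0.1, 0.0, 1.0])  # hypothetical ray through neighboring pixel q

d_plane = tangent_depth(n_p, a_p, q_h)
v_pq = ray_interaction(n_p, a_p, q_h, d_q=4.5)
```

The explicit check on the denominator mirrors the assumption that the viewing ray is not contained in the tangent plane.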

50 The smoothness term will penalize deviations from planes and thereby encourage solutions with small second derivatives. [sent-155, score-0.29]

51 Submodularity of Fusion Moves. In this section we will show that fusion moves [8] with our interactions are often submodular. [sent-158, score-0.226]

52 Given a current disparity function D and a proposal function P, the fusion move allows pixels to change their labels from the tangents of D to the tangents of P. [sent-159, score-0.705]

53 In what follows we will use Vpq(D, P) = |TpD(q) − P(q)|, (13) to mean the penalty for assigning p the tangent plane from D and q the tangent plane from P. [sent-160, score-0.723]
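To make the fusion move concrete, here is a toy sketch on a single 1D scanline. Each label is a (value, slope) tangent, and every pixel either keeps its label from D or takes the one from P. The binary problem is solved by brute force purely for illustration; this is a stand-in for the graph cut or roof duality solvers mentioned in the text, and the unary term, weight, and data are hypothetical.

```python
import itertools

def tangent(label, p, x):
    """Label = (value, slope) stored at pixel p; evaluate its tangent at x."""
    value, slope = label
    return value + slope * (x - p)

def energy(labels, targets, weight=1.0):
    """Toy energy: |value - target| unary plus V_pq and V_qp on neighbors."""
    e = sum(abs(label[0] - t) for label, t in zip(labels, targets))
    for p in range(len(labels) - 1):
        q = p + 1
        e += weight * abs(tangent(labels[p], p, q) - labels[q][0])  # V_pq
        e += weight * abs(tangent(labels[q], q, p) - labels[p][0])  # V_qp
    return e

def fuse_brute_force(D, P, targets):
    """Try every binary choice (0 keeps D, 1 takes P); only viable for tiny n."""
    best_energy, best_labels = None, None
    for choice in itertools.product((0, 1), repeat=len(D)):
        labels = [P[i] if c else D[i] for i, c in enumerate(choice)]
        e = energy(labels, targets)
        if best_energy is None or e < best_energy:
            best_energy, best_labels = e, labels
    return best_energy, best_labels

targets = [0.0, 1.0, 2.0, 3.0, 3.0, 3.0]   # hypothetical data term targets
D = [(0.0, 0.0)] * 6                        # current solution: flat at 0
P = [(float(x), 1.0) for x in range(6)]     # proposal: line with slope 1
e_fused, fused = fuse_brute_force(D, P, targets)
```

Because the all-D and all-P assignments are among the candidates, the fused energy can never exceed the energy of either input, which is the attractive property of fusion moves.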

54 Candidate Planes. We first show that fusion moves where the candidate function P is a plane result in submodular terms. [sent-165, score-0.439]

55 1 If the proposal P is a plane then the fusion with any function D is a submodular move. [sent-168, score-0.293]
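The planar case can be checked numerically. With Vpq(A, B) = |TpA(q) − B(q)|, submodularity of the binary fusion term requires Vpq(D, D) + Vpq(P, P) ≤ Vpq(D, P) + Vpq(P, D). When P is a plane it coincides with its own tangent, so Vpq(P, P) = 0 and the inequality reduces to a triangle inequality. The sketch below samples random tangents and random planes; all names and the sampling setup are assumptions made for the test.

```python
import numpy as np

rng = np.random.default_rng(0)

def tangent_at(value, grad, p, x):
    """First order expansion around p evaluated at x."""
    return value + grad @ (x - p)

def submodular_gap(D_p, gradD_p, D_q, plane, p, q):
    """V(D,P) + V(P,D) - V(D,D) - V(P,P); non-negative means submodular."""
    c, g = plane                       # planar proposal P(x) = c + g^T x
    P = lambda x: c + g @ x
    t_d = tangent_at(D_p, gradD_p, p, q)
    t_p = tangent_at(P(p), g, p, q)    # equals P(q): a plane is its own tangent
    v_dd = abs(t_d - D_q)
    v_pp = abs(t_p - P(q))
    v_dp = abs(t_d - P(q))
    v_pd = abs(t_p - D_q)
    return v_dp + v_pd - v_dd - v_pp

p, q = np.array([2.0, 3.0]), np.array([3.0, 3.0])
gaps = [
    submodular_gap(rng.normal(), rng.normal(size=2), rng.normal(),
                   (rng.normal(), rng.normal(size=2)), p, q)
    for _ in range(1000)
]
```

Every sampled gap is non-negative up to rounding, regardless of how D is chosen.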

56 General Candidates. Next we derive some more general sufficient conditions for submodularity of the fusion move. [sent-173, score-0.183]

57 2 If both D and P are convex (or concave) between p and q then the interactions Vpq and Vqp are submodular for the fusion move. [sent-176, score-0.263]

58 To see this we first note that if both D and P are convex then they are both bounded from below by their tangent planes. [sent-177, score-0.243]
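The convexity argument can also be illustrated numerically. A convex function lies above its tangents, so Vpq(D, D) = D(q) − TpD(q) and Vpq(P, P) = P(q) − TpP(q), and their sum is bounded by the mixed terms. The sketch below uses two hypothetical convex 1D profiles along a scanline and evaluates the submodularity gap for all neighboring pairs.

```python
import numpy as np

def interactions(f, df, g, dg, p, q):
    """Return (V_ff, V_gg, V_fg, V_gf) with V_ab = |T_p a(q) - b(q)| in 1D."""
    t_f = f(p) + df(p) * (q - p)
    t_g = g(p) + dg(p) * (q - p)
    return abs(t_f - f(q)), abs(t_g - g(q)), abs(t_f - g(q)), abs(t_g - f(q))

# Two hypothetical convex disparity profiles and their derivatives
D = lambda x: 0.05 * (x - 3.0) ** 2
dD = lambda x: 0.1 * (x - 3.0)
P = lambda x: np.cosh(0.2 * x)
dP = lambda x: 0.2 * np.sinh(0.2 * x)

gaps = []
for p in range(10):
    for q in (p - 1, p + 1):
        v_dd, v_pp, v_dp, v_pd = interactions(D, dD, P, dP, float(p), float(q))
        gaps.append(v_dp + v_pd - v_dd - v_pp)
```

All gaps come out non-negative, consistent with the stated sufficient condition; the example does not cover mixed convex and concave pairs, where the condition need not hold.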

59 (26) In the case of a plane proposal P we see that min(Vpq(D, D), t) ≤ (27) min ? [sent-191, score-0.203]

60 Epq(D, D) + Epq(P, P) ≤ Epq(D, P) + Epq(P, D), (30) showing that planar proposals also generate submodular interactions with this energy. [sent-201, score-0.391]

61 General Order Smoothness Priors. In Section 2 we used tangent planes to create our smoothness prior. [sent-210, score-0.441]

62 For example, if we assign to each pixel a quadratic function instead of a tangent, we have an interaction that penalizes 3rd derivatives. [sent-215, score-0.365]
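A 1D sketch of such a higher order label: each pixel stores a value, a first derivative, and a second derivative, and the interaction measures the deviation of the neighbor's value from this quadratic expansion. The names and test functions are illustrative assumptions.

```python
def taylor2(value, d1, d2, p, x):
    """Second order label at p evaluated at x: value + d1*h + 0.5*d2*h^2."""
    h = x - p
    return value + d1 * h + 0.5 * d2 * h * h

def interaction(label_p, p, value_q, q):
    """Deviation of the value at q from the quadratic expansion around p."""
    return abs(taylor2(*label_p, p, q) - value_q)

p, q = 2.0, 3.0

# Quadratic f(x) = 1 + 2x + 3x^2 with f'(x) = 2 + 6x and f''(x) = 6
quad = lambda x: 1 + 2 * x + 3 * x * x
v_quad = interaction((quad(p), 2 + 6 * p, 6.0), p, quad(q), q)

# Cubic f(x) = x^3 with f'(x) = 3x^2, f''(x) = 6x and f'''(x) = 6
cubic = lambda x: x ** 3
v_cubic = interaction((cubic(p), 3 * p * p, 6 * p), p, cubic(q), q)
```

The quadratic is reproduced exactly, while the cubic leaves a residual of f'''/6 · h³ = 1 for a unit step, so only third order variation is penalized.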

63 1 it is easy to see that if our proposals fulfill ApP(q) = P(q), (32) then the fusion move will be submodular. [sent-218, score-0.301]

64 For example if we only use the zero order expansion (constant functions/ fronto-parallel planes) then we find that fusion moves with constant depth proposals are submodular. [sent-220, score-0.482]

65 We will use both the disparity version (Section 2. [sent-234, score-0.243]

66 Depth Depth, 1st derivative Depth, 1st, 2nd derivative . [sent-242, score-0.178]

67 Constant functions Constant 1st derivative Constant 2nd derivative . [sent-245, score-0.178]

68 Characterization of Pairwise interactions, unary terms and submodular proposals for different types of labels. [sent-248, score-0.336]

69 (a) - Image, (b) - depth map using only the data term, (c) - depth map computed with regularization. [sent-251, score-0.258]

70 The data term Ep is a unary term that depends on the tangent plane at p. [sent-255, score-0.718]

71 For each depth we use a planar homography to project one of the neighboring images into the center image. [sent-257, score-0.218]
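A plane-induced homography of this kind can be sketched as follows. The sketch assumes a fronto-parallel plane z = depth in the center camera frame, a pinhole model with intrinsics K, and the convention X_nbr = R X_ref + t; the paper's exact parameterization is not given in this extract, so the names and numbers are hypothetical.

```python
import numpy as np

def plane_homography(K_ref, K_nbr, R, t, depth):
    """Homography from reference to neighbor pixels for the plane z = depth.

    For points X on the plane, n^T X / depth = 1 with n = (0, 0, 1), so
    R X + t = (R + t n^T / depth) X.
    """
    n = np.array([0.0, 0.0, 1.0])
    return K_nbr @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])        # hypothetical shared intrinsics
R = np.eye(3)
t = np.array([-0.1, 0.0, 0.0])         # hypothetical baseline
depth = 2.0

X = np.array([0.3, -0.2, depth])       # a 3D point on the plane
x_ref = K @ X
x_ref = x_ref / x_ref[2]               # pixel in the center image
x_nbr = K @ (R @ X + t)
x_nbr = x_nbr / x_nbr[2]               # pixel in the neighboring image

x_map = plane_homography(K, K, R, t, depth) @ x_ref
x_map = x_map / x_map[2]               # homography prediction
```

To warp the neighboring image into the center view, one would sample the neighbor at the mapped location for every center pixel and then compare patches, for example with normalized cross correlation.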

72 In principle we could make the NCC depend on the tilt of the tangent as well, however storing the samples of such a function would require lots of memory. [sent-261, score-0.281]

73 We also add an extra cost to assignments of planes which are roughly parallel to the viewing rays. [sent-265, score-0.224]

74 We use the extra cost (1 − npTvp)2k, (34) where np is the normal of the plane assigned to p and vp is the direction of the viewing ray in p (in the 3D space the viewing ray direction will be p/ ? [sent-267, score-0.399]

75 Proposal Generation. To generate proposals we use heuristics similar to those of [15]. [sent-275, score-0.155]

76 • To generate planar proposals we randomly select a point and a small neighborhood. [sent-276, score-0.214]

77 Using the best local maximum of the normalized cross correlation for each viewing ray we create a 3D point cloud for the neighborhood and fit a plane using RANSAC. [sent-277, score-0.262]

78 • We use a filtering process that takes the current assignment, computes the corresponding 3D points, and for each pixel fits a plane to its neighboring 3D points. [sent-279, score-0.179]

79 Finally we have a proposal that just increases/decreases the depth/disparity of all proposals with a small random step size. [sent-280, score-0.221]
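The plane fitting step used for planar proposals can be sketched with a basic RANSAC loop. This is a generic version under stated assumptions, not the authors' code: the iteration count, the inlier threshold, and the synthetic point cloud are all hypothetical.

```python
import numpy as np

def fit_plane(points):
    """Least squares plane: unit normal n and offset a with n^T x + a = 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    n = vt[-1]                       # direction of smallest variance
    return n, -float(n @ centroid)

def ransac_plane(points, n_iters=200, threshold=0.01, seed=0):
    """Fit a plane robustly; returns ((n, a), inlier_mask)."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue                 # degenerate (collinear) sample
        n = n / norm
        a = -float(n @ sample[0])
        inliers = np.abs(points @ n + a) < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best hypothesis
    return fit_plane(points[best_inliers]), best_inliers

# Synthetic cloud on the plane z = 2 + 0.1 x with 20 gross outliers
data_rng = np.random.default_rng(1)
xy = data_rng.uniform(-1.0, 1.0, size=(100, 2))
pts = np.column_stack([xy, 2.0 + 0.1 * xy[:, 0]])
pts[:20, 2] += data_rng.uniform(0.5, 1.5, size=20)

(n, a), inliers = ransac_plane(pts)
```

The recovered normal is only defined up to sign, so downstream code would typically orient it towards the camera before using it as a tangent plane label.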

80 For their data sets we computed a depth map for the middle image (nr 4) and used the remaining 6 images to compute the cross correlations needed for the data term. [sent-284, score-0.158]

81 The effects of the regularization term can be seen by comparing the surface generated from the data term without regularization (b) and the one with regularization (c) (the data term is particularly weak in the bowling data set because of the large textureless region). [sent-287, score-0.394]

82 GlobalStereo also penalizes second derivatives but uses triple cliques with scalar disparity labels. [sent-292, score-0.512]

83 The comparison is performed on the Middlebury data set consisting of stereo pairs of rectified images. [sent-294, score-0.172]

84 In the fusion moves we use "improve" [11] after running RD to label all unlabelled variables. [sent-319, score-0.236]

85 First we consider the SegPln proposals, which are 14 piecewise planar proposals generated from a segmentation (see [15]). [sent-321, score-0.253]

86 In Figure 4 we started from the same randomized disparity function with tangents parallel to the image plane. [sent-322, score-0.389]

87 We kept track of how many variables were unlabeled after RD for both methods and present the numbers in Table 2 and the resulting disparity maps in Figure 4. [sent-324, score-0.269]

88 Note that the fusion move for our method is only submodular if we fuse one planar function at a time. [sent-325, score-0.347]

89 The SegPln proposals are piecewise planar and the regularization at transitions between planes may not be submodular. [sent-326, score-0.406]

90 We also test our regularization on the full pipeline of GlobalStereo which uses all three types of proposals (SegPln, SameUni and Smooth). [sent-327, score-0.212]

91 (b-d) are estimated disparity maps after fusing the 14 SegPln proposals. [sent-331, score-0.271]

92 In (f-h) we present the unlabelled variables summed over all 14 proposals scaled 0–14. [sent-332, score-0.226]

93 Data sets: Tsukuba, Venus, Teddy, Cones, Average; columns per data set: Non occ, All, Disc. Our 4. [sent-334, score-0.236]

94 The classes are non occluded regions, all pixels, and regions near depth discontinuities. [sent-364, score-0.153]
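An error measure per pixel class of this kind is commonly computed as the percentage of pixels whose disparity error exceeds a threshold, restricted to a class mask. The sketch below assumes a threshold of one disparity level and uses made-up arrays; neither the threshold nor the masks are taken from this extract.

```python
import numpy as np

def bad_pixel_percentage(disp, gt, mask, threshold=1.0):
    """Percentage of masked pixels with |disp - gt| above the threshold."""
    mask = np.asarray(mask, dtype=bool)
    bad = np.abs(disp - gt) > threshold
    return 100.0 * bad[mask].mean()

gt = np.zeros((4, 5))           # hypothetical ground truth disparities
est = gt.copy()
est[0, :2] = 3.0                # two gross errors
est[3, 4] = 0.5                 # small error, below the threshold

all_mask = np.ones_like(gt, dtype=bool)   # "All" class
nonocc_mask = all_mask.copy()
nonocc_mask[0, 0] = False                 # pretend one bad pixel is occluded

err_all = bad_pixel_percentage(est, gt, all_mask)
err_nonocc = bad_pixel_percentage(est, gt, nonocc_mask)
```

The same function applies to a mask of pixels near depth discontinuities, which would give the third class.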

95 Conclusions In this paper we advocated a largely overlooked approach to stereo with 2nd order smoothness regularization. [sent-384, score-0.253]

96 In contrast to popular approaches where triple cliques are used for representing 2nd order surface derivatives, we proposed to use pairwise interactions with 3D-labels. [sent-385, score-0.288]

97 We showed that this leads to simpler optimization problems and in many cases (nearly) submodular fusion moves. [sent-386, score-0.286]

98 (a) - Image, (b) - depth map using only the data term, (c) - depth map computed with regularization. [sent-406, score-0.258]

99 (a) - Image, (b) - depth map using only the data term, (c) - depth map computed with regularization. [sent-408, score-0.258]

100 Global stereo reconstruction under second order smoothness priors. [sent-499, score-0.223]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('vpq', 0.624), ('tpd', 0.282), ('tangent', 0.243), ('disparity', 0.243), ('globalstereo', 0.181), ('proposals', 0.155), ('submodular', 0.142), ('epq', 0.132), ('tangents', 0.125), ('fusion', 0.121), ('tpp', 0.121), ('depth', 0.107), ('plane', 0.106), ('smoothness', 0.102), ('segpln', 0.101), ('zucker', 0.101), ('planes', 0.096), ('stereo', 0.092), ('derivative', 0.089), ('rectified', 0.08), ('triple', 0.08), ('woodford', 0.078), ('surface', 0.07), ('moves', 0.07), ('viewing', 0.068), ('proposal', 0.066), ('submodularity', 0.062), ('ncc', 0.062), ('middlebury', 0.06), ('occ', 0.059), ('ray', 0.059), ('planar', 0.059), ('regularization', 0.057), ('disc', 0.053), ('neighboring', 0.052), ('derivatives', 0.052), ('interaction', 0.048), ('non', 0.046), ('bowling', 0.045), ('unlabelled', 0.045), ('teddy', 0.043), ('pairwise', 0.041), ('apd', 0.04), ('maths', 0.04), ('np', 0.039), ('unary', 0.039), ('surfaces', 0.039), ('piecewise', 0.039), ('assignments', 0.039), ('tilt', 0.038), ('cameras', 0.036), ('term', 0.036), ('vqp', 0.036), ('smooth', 0.035), ('interactions', 0.035), ('scalar', 0.034), ('swedish', 0.033), ('penalize', 0.033), ('cliques', 0.033), ('penalizing', 0.032), ('olsson', 0.031), ('canadian', 0.031), ('yuri', 0.031), ('venus', 0.031), ('assign', 0.031), ('min', 0.031), ('overlooked', 0.03), ('tsukuba', 0.03), ('qh', 0.03), ('conf', 0.03), ('cross', 0.029), ('order', 0.029), ('proposition', 0.029), ('fusing', 0.028), ('cones', 0.028), ('cloth', 0.028), ('ionf', 0.028), ('bleyer', 0.027), ('lempitsky', 0.026), ('variables', 0.026), ('move', 0.025), ('regular', 0.025), ('penalty', 0.025), ('energy', 0.024), ('discontinuity', 0.023), ('simpler', 0.023), ('encourage', 0.023), ('labels', 0.022), ('penalizes', 0.022), ('pattern', 0.022), ('map', 0.022), ('rd', 0.022), ('precomputed', 0.021), ('itn', 0.021), ('parallel', 0.021), ('couple', 0.021), ('roof', 0.021), ('pixel', 0.021), ('ph', 0.021), ('rother', 0.02)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999988 219 cvpr-2013-In Defense of 3D-Label Stereo

Author: Carl Olsson, Johannes Ulén, Yuri Boykov

2 0.20934606 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields

Author: Bastian Goldluecke, Sven Wanner

Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.

3 0.16966406 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision

Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann

Abstract: With the aim to improve accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census data term, which is the best performing realtime capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.

4 0.15993051 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures

Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato

Abstract: In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. One of the problems of passive stereo method is the difficulty of finding correct correspondence between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. Two types of depth cues, defocus and disparity, are naturally integrated by the magnification and phase shift of a single point spread function (PSF) per camera. In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. We also show the outstanding performance of our method which has both advantages of two depth cues through simulation and actual experiments.

5 0.14015688 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

Author: Sven Wanner, Christoph Straehle, Bastian Goldluecke

Abstract: We present the first variational framework for multi-label segmentation on the ray space of 4D light fields. For traditional segmentation of single images, features need to be extracted from the 2D projection of a three-dimensional scene. The associated loss of geometry information can cause severe problems, for example if different objects have a very similar visual appearance. In this work, we show that using a light field instead of an image not only enables to train classifiers which can overcome many of these problems, but also provides an optimal data structure for label optimization by implicitly providing scene geometry information. It is thus possible to consistently optimize label assignment over all views simultaneously. As a further contribution, we make all light fields available online with complete depth and segmentation ground truth data where available, and thus establish the first benchmark data set for light field analysis to facilitate competitive further development of algorithms.

6 0.12969035 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching

7 0.11803633 155 cvpr-2013-Exploiting the Power of Stereo Confidences

8 0.11395466 455 cvpr-2013-Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions

9 0.10934843 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

10 0.090725832 377 cvpr-2013-Sample-Specific Late Fusion for Visual Category Recognition

11 0.088399909 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

12 0.085651092 326 cvpr-2013-Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation

13 0.085239545 111 cvpr-2013-Dense Reconstruction Using 3D Object Shape Priors

14 0.082545221 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D

15 0.081149086 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

16 0.079134919 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

17 0.078457743 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation

18 0.07545428 333 cvpr-2013-Plane-Based Content Preserving Warps for Video Stabilization

19 0.075449027 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery

20 0.070486277 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.136), (1, 0.158), (2, 0.029), (3, 0.039), (4, 0.011), (5, -0.056), (6, -0.04), (7, 0.054), (8, -0.016), (9, 0.039), (10, 0.022), (11, 0.016), (12, 0.077), (13, 0.04), (14, -0.117), (15, 0.062), (16, -0.135), (17, -0.057), (18, 0.041), (19, -0.025), (20, -0.107), (21, 0.074), (22, 0.125), (23, 0.066), (24, -0.013), (25, -0.011), (26, -0.02), (27, -0.012), (28, 0.027), (29, -0.042), (30, 0.047), (31, 0.026), (32, -0.002), (33, -0.018), (34, 0.004), (35, 0.006), (36, -0.001), (37, -0.02), (38, -0.021), (39, -0.001), (40, 0.087), (41, -0.0), (42, 0.014), (43, 0.033), (44, -0.045), (45, 0.049), (46, 0.001), (47, 0.005), (48, -0.016), (49, 0.011)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9317131 219 cvpr-2013-In Defense of 3D-Label Stereo

Author: Carl Olsson, Johannes Ulén, Yuri Boykov

2 0.85364228 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures

Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato

3 0.80595791 155 cvpr-2013-Exploiting the Power of Stereo Confidences

Author: David Pfeiffer, Stefan Gehrig, Nicolai Schneider

Abstract: Applications based on stereo vision are becoming increasingly common, ranging from gaming over robotics to driver assistance. While stereo algorithms have been investigated heavily both on the pixel and the application level, far less attention has been dedicated to the use of stereo confidence cues. Mostly, a threshold is applied to the confidence values for further processing, which is essentially a sparsified disparity map. This is straightforward but it does not take full advantage of the available information. In this paper, we make full use of the stereo confidence cues by propagating all confidence values along with the measured disparities in a Bayesian manner. Before using this information, a mapping from confidence values to disparity outlier probability rate is performed based on gathered disparity statistics from labeled video data. We present an extension of the so called Stixel World, a generic 3D intermediate representation that can serve as input for many of the applications mentioned above. This scheme is modified to directly exploit stereo confidence cues in the underlying sensor model during a maximum a posteriori estimation process. The effectiveness of this step is verified in an in-depth evaluation on a large real-world traffic data base of which parts are made publicly available. We show that using stereo confidence cues allows both reducing the number of false object detections by a factor of six while keeping the detection rate at a near constant level.

4 0.79987472 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision

Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann

5 0.7393505 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching

Author: Xing Mei, Xun Sun, Weiming Dong, Haitao Wang, Xiaopeng Zhang

Abstract: This paper presents a novel tree-based cost aggregation method for dense stereo matching. Instead of employing the minimum spanning tree (MST) and its variants, a new tree structure, "Segment-Tree", is proposed for non-local matching cost aggregation. Conceptually, the segment-tree is constructed in a three-step process: first, the pixels are grouped into a set of segments with the reference color or intensity image; second, a tree graph is created for each segment; and in the final step, these independent segment graphs are linked to form the segment-tree structure. In practice, this tree can be efficiently built in time nearly linear to the number of the image pixels. Compared to MST where the graph connectivity is determined with local edge weights, our method introduces some 'non-local' decision rules: the pixels in one perceptually consistent segment are more likely to share similar disparities, and therefore their connectivity within the segment should be first enforced in the tree construction process. The matching costs are then aggregated over the tree within two passes. Performance evaluation on 19 Middlebury data sets shows that the proposed method is comparable to previous state-of-the-art aggregation methods in disparity accuracy and processing speed. Furthermore, the tree structure can be refined with the estimated disparities, which leads to consistent scene segmentation and significantly better aggregation results.

6 0.73231262 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields

7 0.63329613 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

8 0.58699709 326 cvpr-2013-Patch Match Filter: Efficient Edge-Aware Filtering Meets Randomized Search for Fast Correspondence Field Estimation

9 0.54935127 114 cvpr-2013-Depth Acquisition from Density Modulated Binary Patterns

10 0.54270041 230 cvpr-2013-Joint 3D Scene Reconstruction and Class Segmentation

11 0.53597277 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation

12 0.49507612 41 cvpr-2013-An Iterated L1 Algorithm for Non-smooth Non-convex Optimization in Computer Vision

13 0.49496743 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo

14 0.48247191 397 cvpr-2013-Simultaneous Super-Resolution of Depth and Images Using a Single Camera

15 0.47944295 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D

16 0.46808335 232 cvpr-2013-Joint Geodesic Upsampling of Depth Images

17 0.44175974 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

18 0.43802032 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

19 0.43560836 128 cvpr-2013-Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space

20 0.41991439 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(9, 0.268), (10, 0.106), (16, 0.027), (26, 0.031), (28, 0.012), (33, 0.234), (67, 0.025), (69, 0.048), (80, 0.014), (87, 0.114), (92, 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.78993881 219 cvpr-2013-In Defense of 3D-Label Stereo

Author: Carl Olsson, Johannes Ulén, Yuri Boykov

2 0.73032141 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation

Author: Mingsheng Long, Guiguang Ding, Jianmin Wang, Jiaguang Sun, Yuchen Guo, Philip S. Yu

Abstract: Sparse coding learns a set of basis functions such that each input signal can be well approximated by a linear combination of just a few of the bases. It has attracted increasing interest due to its state-of-the-art performance in BoW based image representation. However, when labeled and unlabeled images are sampled from different distributions, they may be quantized into different visual words of the codebook and encoded with different representations, which may severely degrade classification performance. In this paper, we propose a Transfer Sparse Coding (TSC) approach to construct robust sparse representations for classifying cross-distribution images accurately. Specifically, we aim to minimize the distribution divergence between the labeled and unlabeled images, and incorporate this criterion into the objective function of sparse coding to make the new representations robust to the distribution difference. Experiments show that TSC can significantly outperform state-of-the-art methods on three types of computer vision datasets.

3 0.73008364 245 cvpr-2013-Layer Depth Denoising and Completion for Structured-Light RGB-D Cameras

Author: Ju Shen, Sen-Ching S. Cheung

Abstract: The recent popularity of structured-light depth sensors has enabled many new applications from gesture-based user interface to 3D reconstructions. The quality of the depth measurements of these systems, however, is far from perfect. Some depth values can have significant errors, while others can be missing altogether. The uncertainty in depth measurements among these sensors can significantly degrade the performance of any subsequent vision processing. In this paper, we propose a novel probabilistic model to capture various types of uncertainties in the depth measurement process among structured-light systems. The key to our model is the use of depth layers to account for the differences between foreground objects and background scene, the missing depth value phenomenon, and the correlation between color and depth channels. The depth layer labeling is solved as a maximum a-posteriori estimation problem, and a Markov Random Field attuned to the uncertainty in measurements is used to spatially smooth the labeling process. Using the depth-layer labels, we propose a depth correction and completion algorithm that outperforms other techniques in the literature.

4 0.72299659 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

Author: Horst Possegger, Sabine Sternig, Thomas Mauthner, Peter M. Roth, Horst Bischof

Abstract: Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume exploiting the full geometry and the objects' center of mass and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local mass density scores within a particle filter based approach, constrained by a Voronoi partitioning between nearby trackers. Our method benefits from the geometric knowledge given by the occupancy volume to robustly extract features and train classifiers on-demand, when volumetric information becomes unreliable. We evaluate our approach on several challenging real-world scenarios including the public APIDIS dataset. Experimental evaluations demonstrate significant improvements compared to state-of-the-art methods, while achieving real-time performance.

5 0.72187364 222 cvpr-2013-Incorporating User Interaction and Topological Constraints within Contour Completion via Discrete Calculus

Author: Jia Xu, Maxwell D. Collins, Vikas Singh

Abstract: We study the problem of interactive segmentation and contour completion for multiple objects. The form of constraints our model incorporates are those coming from user scribbles (interior or exterior constraints) as well as information regarding the topology of the 2-D space after partitioning (number of closed contours desired). We discuss how concepts from discrete calculus and a simple identity using the Euler characteristic of a planar graph can be utilized to derive a practical algorithm for this problem. We also present specialized branch and bound methods for the case of single contour completion under such constraints. On an extensive dataset of ∼1000 images, our experiments suggest that a small amount of side knowledge can give strong improvements over fully unsupervised contour completion methods. We show that by interpreting user indications topologically, user effort is substantially reduced.

6 0.71968597 298 cvpr-2013-Multi-scale Curve Detection on Surfaces

7 0.71850759 155 cvpr-2013-Exploiting the Power of Stereo Confidences

8 0.71724892 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery

9 0.71654677 39 cvpr-2013-Alternating Decision Forests

10 0.71618426 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences

11 0.71604294 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments

12 0.71590889 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display

13 0.71585494 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision

14 0.71579689 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

15 0.71494836 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields

16 0.71479279 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances

17 0.71443963 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking

18 0.71400434 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials

19 0.71398562 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path

20 0.71390057 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images