cvpr cvpr2013 cvpr2013-147 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim of improving the accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census data term, which is the best performing real-time capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract. With the aim of improving the accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. [sent-4, score-1.766]
2 Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. [sent-5, score-0.99]
3 We experiment with semi global matching stereo (SGM) and a census data term, which is the best performing real-time capable stereo method known to date. [sent-6, score-1.176]
4 We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. [sent-8, score-1.756]
5 Introduction. A vast number of algorithms to solve the stereo problem have been proposed with the aim of yielding improved error statistics on popular benchmarking datasets. [sent-11, score-0.518]
6 Recently, this issue has been approached through definition of a more challenging benchmark [9], and further improvements on performance of stereo solvers are anticipated. [sent-13, score-0.447]
7 We illustrate this for the stereo case: If, in a worst case scenario, one of the two cameras fails, dense matching results can be computed, but these are not reliable in any location. [sent-16, score-0.536]
8 quite effective, by plotting consistency gaps over disparity errors, see Figure 1. [sent-22, score-0.418]
9 Applications where accurate stereo confidence measures are essential in raising reliability of computer vision include sparse [19] or dense [16] 3D scene reconstructions. [sent-23, score-0.95]
10 This has initiated attempts to combine several confidence measures with the aim of achieving superior accuracy in detection of bad matching estimates. [sent-26, score-0.661]
11 Previous solutions [14, 17] were based on a very limited set of features capturing confidence and were tested only on data not presenting much challenge to stereo. [sent-27, score-0.365]
12 In this paper, we employ strong energy based confidence clues and use a larger and significantly more challenging stereo dataset introduced recently [9], where results compare much better to real-world scenarios than was the case with benchmarks proposed previously. [sent-28, score-0.88]
13 Section 3 details challenges in defining confidence for matching tasks, compiles some proposals for stereo confidence definition and introduces new confidence definitions used in this paper. [sent-30, score-1.718]
14 Section 4 explains the machine learning framework used for confidence accuracy improvements. [sent-31, score-0.365]
15 Related Work. Kong and Tao [14] proposed a stereo matcher, where distributions of labels for good, bad and foreground-fattening-affected disparities are estimated in a MAP-MRF framework based on horizontal texture and distances to closest foreground objects drawn from ground truth. [sent-36, score-0.654]
16 [17] derived binary confidence labels by learning from a larger set of features amenable to hardware processing using decision trees and ANNs. [sent-38, score-0.461]
17 Error of SGM stereo result to ground truth plotted against left-right difference of corresponding points in disparity maps of both views. [sent-40, score-0.89]
18 For optical flow, Gehrig and Scharwächter [8] used Gaussian mixtures to model a feature space composed of spatial and temporal flow variance, residual flow energy and structure tensor eigenvalues on small image patches. [sent-44, score-0.231]
19 Multi-cue confidence was defined as classification outcome according to the highest class posterior. [sent-46, score-0.365]
20 This is expressed in the idea that a confidence measure can successfully select best fitting results from multiple algorithms, ignoring the fact that flow is often undefined, e. [sent-56, score-0.482]
21 Regarding the above-mentioned confidence features for optical flow [2], image gradients in conjunction with flow variance are likely to detect weakly textured areas in input images with high variance in flow. [sent-60, score-0.722]
22 However, in stereo and motion alike, reasons for failure may not be restricted to low texturedness. [sent-62, score-0.487]
23 Hence, using a more diversified set of confidence measures as contributing features is very likely to result in improved accuracy for good or bad pixel detection due to consideration for an increased number of possible reasons for algorithm failure. [sent-63, score-0.676]
24 So, in the following section, we discuss various stereo confidence measures proposed in the literature, and attempt to motivate a selection of most promising measures. [sent-64, score-0.978]
25 Confidence Measures for Stereo. Causes for errors in disparity estimation within a global stereo optimization framework can be inappropriate model assumptions, highly nonconvex energies causing multiple strong local minima, or numerically unstable global minima. [sent-66, score-0.915]
26 Assuming error prediction worked out, we would know error magnitudes and could plug these into the stereo estimation model to improve stereo results directly. [sent-68, score-1.069]
27 However, we can only hope to gain knowledge about suitability of signals to provide good estimates of stereo disparities in most cases, e. [sent-69, score-0.534]
28 In the absence of a strong theoretical foundation to account for properties of global energies in commonplace stereo aggregation schemes, many spatially local stereo confidence measures have been proposed [5, 13]. [sent-73, score-1.443]
29 However, evaluation has been carried out for a local stereo matching algorithm and on a small dataset only. [sent-74, score-0.536]
30 Below we briefly discuss the most prominent proposals for stereo confidence. [sent-76, score-0.493]
31 To clarify the intention behind defining confidence measures for matching, we would like to point out again that confidence is not supposed to be a measure of potential disparity error magnitudes. [sent-77, score-1.411]
32 For low confidence matching situations, no improved or specialist algorithm may exist for obtaining a solution. [sent-80, score-0.454]
33 Good confidence measures detect areas that cannot be matched reliably. [sent-81, score-0.55]
34 In the following definitions, c refers to matching costs resulting from a semi global matching (SGM) [11] aggregation scheme. [sent-82, score-0.315]
35 Curvature of a parabola fit to matching costs c for subpixel estimation at a pixel p is frequently considered to be a confidence measure. [sent-83, score-0.545]
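The curvature measure mentioned above can be illustrated with a minimal sketch. It assumes the costs of one pixel are given as a plain list indexed by disparity; the function names `parabola_curvature` and `subpixel_offset` are hypothetical and not taken from the paper, and the sign or scaling of the measure used by the authors may differ.

```python
def parabola_curvature(costs, d):
    """Second difference of the costs around disparity index d.

    This equals the second derivative of the parabola through the three
    points (d-1, d, d+1). Larger values indicate a sharper cost minimum.
    Returns None at the border, where no parabola can be fit.
    """
    if d <= 0 or d >= len(costs) - 1:
        return None
    return costs[d - 1] - 2 * costs[d] + costs[d + 1]


def subpixel_offset(costs, d):
    """Offset of the parabola vertex relative to d (0.0 if undefined)."""
    curvature = parabola_curvature(costs, d)
    if not curvature:  # None (border) or zero (flat costs)
        return 0.0
    return (costs[d - 1] - costs[d + 1]) / (2.0 * curvature)


if __name__ == "__main__":
    costs = [5, 2, 1, 4]  # toy costs for one pixel, minimum at d = 2
    print(parabola_curvature(costs, 2), subpixel_offset(costs, 2))
```

A flat cost profile gives a curvature near zero, which is the situation the measure is meant to flag as unreliable.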
36 The peak ratio measure is widely used in descriptor matching to reject correspondences with close matching costs which are believed to be ambiguous. [sent-86, score-0.447]
37 In the following, d1 denotes the disparity with lowest associated cost c(p, d1) and d2 is a disparity where c(p, d2) is a local minimum with second lowest cost at pixel p. [sent-87, score-0.894]
38 The peak ratio for a disparity at pixel p is then defined as Γ0(p) = c(p, d1)/c(p, d2) . [sent-88, score-0.553]
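As a hedged illustration of the peak ratio Γ0(p) = c(p, d1)/c(p, d2), the sketch below finds d1 and d2 in the cost list of a single pixel. The helper names and the treatment of plateaus and borders are assumptions made for this example only.

```python
def two_best_minima(costs):
    """Return (d1, d2) for one pixel.

    d1 is the index of the global cost minimum. d2 is the index of the
    lowest-cost local minimum other than d1, or None if there is none.
    Border entries count as local minima if they are not higher than
    their single neighbour (an assumption of this sketch).
    """
    d1 = min(range(len(costs)), key=costs.__getitem__)
    local_minima = []
    for d in range(len(costs)):
        if d == d1:
            continue
        left = costs[d - 1] if d > 0 else float("inf")
        right = costs[d + 1] if d < len(costs) - 1 else float("inf")
        if costs[d] <= left and costs[d] <= right:
            local_minima.append(d)
    d2 = min(local_minima, key=costs.__getitem__) if local_minima else None
    return d1, d2


def peak_ratio(costs):
    """Gamma_0 = c(p, d1) / c(p, d2); None if no second minimum exists."""
    d1, d2 = two_best_minima(costs)
    if d2 is None or costs[d2] == 0:
        return None
    return costs[d1] / costs[d2]


if __name__ == "__main__":
    print(peak_ratio([5, 1, 2, 3, 2.5, 6]))
```

Values close to 1 mean the two best candidates have nearly equal cost, so the match is ambiguous; values close to 0 mean the winner is clearly separated.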
39 Entropy of disparity costs for controlling a diffusion process in cost aggregation [20] attracted some attention as a potential confidence measure. [sent-93, score-0.949]
40 1, consistency between left and right disparity is an established criterion for identification of mismatches and occlusions [11]. [sent-108, score-0.453]
41 The definition requires disparity maps Dl and Dr of left and right image: Γ3(p) = ? [sent-109, score-0.418]
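The formula for Γ3 is not recoverable from the text above, so the following is only a sketch of a common left-right difference under stated assumptions: a single rectified scanline, and the convention that a left pixel at column x with disparity d corresponds to column x − d in the right image. The function name is hypothetical, and the paper's sign convention may differ.

```python
def left_right_difference(disp_left, disp_right):
    """Absolute left-right disparity difference along one scanline.

    Assumes left pixel x with disparity d maps to right pixel x - d.
    Returns None where the corresponding right pixel is outside the image.
    """
    result = []
    for x, d in enumerate(disp_left):
        x_right = x - int(round(d))
        if 0 <= x_right < len(disp_right):
            result.append(abs(d - disp_right[x_right]))
        else:
            result.append(None)
    return result


if __name__ == "__main__":
    print(left_right_difference([0, 1, 1, 2], [1, 1, 2, 0]))
```

Large values typically point to mismatches or occlusions, which is why the measure is an established consistency check.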
42 This motivates the definition of horizontal gradient as a confidence measure: Γ4(p) =? [sent-119, score-0.402]
43 However, Γ5 may be less suitable if used in conjunction with stereo algorithms that frequently locate discontinuities well. [sent-125, score-0.508]
44 This may be the case in segmentation based stereo approaches. [sent-126, score-0.447]
45 A measure coined disparity ambiguity here is introduced to capture potential error magnitudes for the case of mismatches resulting from matching ambiguities (which may be detected by peak ratio Γ0 defined above). [sent-127, score-0.86]
46 Γ6(p) = |D^l_1(p) − D^l_2(p)|. Although not beneficial as a confidence measure itself, inclusion of disparity ambiguity into a learning framework is an attempt to separate small from large errors in image locations where the peak ratio may fail as explained above. [sent-128, score-1.081]
47 As an additional confidence measure, we use Zero mean Sum of Absolute Differences (ZSAD) matching costs between (left and right) image intensities Il and Ir for the winning disparity d1: Γ7(p) = ZSAD ? [sent-129, score-1.031]
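The ZSAD cost can be sketched as follows. The example assumes two patches of equal size given as flat lists of intensities; how the patches are extracted around p and p shifted by d1 is left out, and the function name `zsad` is an illustrative choice.

```python
def zsad(patch_left, patch_right):
    """Zero mean Sum of Absolute Differences between two equal-size patches.

    Each patch has its own mean subtracted before differences are summed,
    which makes the cost invariant to a constant brightness offset.
    """
    if len(patch_left) != len(patch_right) or not patch_left:
        raise ValueError("patches must be non-empty and of equal size")
    mean_left = sum(patch_left) / len(patch_left)
    mean_right = sum(patch_right) / len(patch_right)
    return sum(
        abs((a - mean_left) - (b - mean_right))
        for a, b in zip(patch_left, patch_right)
    )


if __name__ == "__main__":
    # Same structure, different brightness: the cost is zero.
    print(zsad([10, 20, 30], [110, 120, 130]))
```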
48 Another proposal for confidence is what we call semi global energy: We compute the sum of data and smoothness term in a small neighborhood for each pixel, choosing a patch size of 25 × 25 and aggregating along emerging rays in eight directions for these experiments. [sent-133, score-0.497]
49 The feature is defined in analogy to the SGM objective energy, but with the winning disparity d1 = Dp fixed: Γ8(p) = ? [sent-134, score-0.454]
50 b1 and b2 are distinct penalties for different magnitudes of disparity map gradient, and t is a decision function. [sent-139, score-0.569]
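Since the formula for Γ8 is missing from the text, the sketch below only illustrates the general shape of an SGM-style energy along one ray: data costs plus a penalty b1 for disparity steps of one and b2 for larger steps. The default values b1 = 20 and b2 = 100 are the penalties reported in the experiments; everything else, including the function names and the exact role of the decision function t, is an assumption.

```python
def smoothness_penalty(d_p, d_q, b1=20, b2=100):
    """Penalty between neighbouring disparities in the usual SGM form.

    No penalty for equal disparities, b1 for a step of exactly one,
    b2 for any larger step.
    """
    step = abs(d_p - d_q)
    if step == 0:
        return 0
    if step == 1:
        return b1
    return b2


def ray_energy(data_costs, disparities, b1=20, b2=100):
    """Sum of data and smoothness terms along one ray of pixels.

    data_costs[i] is the matching cost of pixel i at its fixed winning
    disparity disparities[i]; both lists follow the ray outward from p.
    """
    energy = data_costs[0]
    for i in range(1, len(disparities)):
        energy += data_costs[i]
        energy += smoothness_penalty(disparities[i - 1], disparities[i], b1, b2)
    return energy


if __name__ == "__main__":
    print(ray_energy([1, 1, 1, 1], [5, 5, 6, 9]))
```

A full feature in the spirit of the text would add up such ray energies over eight directions inside the patch.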
51 Feature Vector Setup. We define one feature vector f7 ∈ R^7, containing only information derived from input images and computed disparity maps. [sent-140, score-0.418]
52 Features for lower scales are separately extracted from stereo computed on down-scaled images and not by downscaling of feature maps. [sent-143, score-0.447]
53 Vector f7 can be computed for arbitrary stereo results. [sent-145, score-0.447]
54 This feature vector is therefore defined only for stereo schemes with pixel-wise cost computations for each matching candidate. [sent-147, score-0.565]
55 Ensemble Learning for Confidence Measures. In the following, we explain the machine learning approach chosen for combining confidence measures. [sent-149, score-0.365]
56 We choose a classification approach instead of regression, as confidence measures do not contain matching error magnitude information as explained previously. [sent-152, score-0.637]
57 Each decision tree in the random forest partitions feature space recursively by greedily choosing a feature and a binary test thereupon, which minimizes an entropy based objective function. [sent-155, score-0.237]
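The greedy, entropy-based split selection described above can be sketched for a single tree node. This is a simplified illustration, not the implementation used in the paper: it searches all features exhaustively, whereas random forests normally test only a random subset of features at each node, and the function names are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        h -= p * math.log2(p)
    return h


def best_split(samples, labels):
    """Find the (feature, threshold) test with lowest weighted child entropy.

    samples is a list of feature vectors, labels the matching class labels.
    Returns (score, feature_index, threshold) or None if no split exists.
    """
    best = None
    n = len(samples)
    for f in range(len(samples[0])):
        for thr in sorted({s[f] for s in samples}):
            left = [y for s, y in zip(samples, labels) if s[f] <= thr]
            right = [y for s, y in zip(samples, labels) if s[f] > thr]
            if not left or not right:
                continue  # a test must separate the samples
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best


if __name__ == "__main__":
    samples = [[0.1, 5], [0.2, 1], [0.8, 4], [0.9, 2]]
    labels = ["good", "good", "bad", "bad"]
    print(best_split(samples, labels))
```

Applying the same search recursively to the two child sets yields the recursive partition of feature space that the text describes.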
58 Experiments. Stereo estimates are computed using semi global matching stereo [11] (penalties b1 = 20, b2 = 100) with a binary census data term on 7 × 7 matching windows. [sent-164, score-0.813]
59 The choice of this algorithm is due to best overall performance on unconstrained image data in terms of stereo accuracy [12, 21] as well as computational costs low enough for on-line results in, e. [sent-165, score-0.583]
60 We restrict our experiments to this powerful stereo algorithm, as we are not interested in stereo errors introduced through weak models. [sent-168, score-0.944]
61 In an effort to reduce adaptation to a specific matching problem domain, these frames are selected such that a variety of different challenges are posed to the stereo algorithm, including textureless areas, very large baseline, repetitive structures, transparencies and specular reflections. [sent-171, score-0.664]
62 Samples of the above described feature vector are collected only in locations where data term values for stereo matching are available (that is, these are not set to be invalid) on all scales and for all disparity candidates. [sent-173, score-0.954]
63 The intention is to avoid biases in learning and classification due to nonuniform scaling of some of the used cost function based features in the presence of undefined matching cost values. [sent-175, score-0.238]
64 Area under curve measures of our result (red), in comparison to four confidence measures that usually perform best. [sent-177, score-0.67]
65 As confidence measures generally contain no information about error magnitudes, solving a regression problem for feature combination is not likely to yield the intended results. [sent-180, score-0.548]
66 The class boundary is defined by a threshold of 3 px between ground truth disparities and stereo estimates, in line with the default of the KITTI online evaluation. [sent-182, score-0.534]
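The class labels can be derived as in the following sketch. It assumes that a pixel counts as bad when its absolute disparity error exceeds 3 px (whether the boundary is strict is an assumption) and that pixels without ground truth are marked with `None`; the function name is hypothetical.

```python
def label_pixels(estimated, ground_truth, threshold_px=3.0):
    """Assign "good" or "bad" to each pixel by thresholding the disparity error.

    Pixels without ground truth (None) receive the label None and would be
    excluded from training and evaluation.
    """
    labels = []
    for est, gt in zip(estimated, ground_truth):
        if gt is None:
            labels.append(None)
        elif abs(est - gt) > threshold_px:
            labels.append("bad")
        else:
            labels.append("good")
    return labels


if __name__ == "__main__":
    print(label_pixels([10, 10, 10, 10], [10.5, 14.2, None, 13.0]))
```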
67 Due to very high quality of stereo results on KITTI in general, these two classes are highly unbalanced, which may deteriorate class model quality and result in unnecessary computational costs due to high data volumes. [sent-184, score-0.538]
68 Generalisation error is monitored within the random forest framework by computing out of bag errors for increasingly large stratified random subsets of the training set. [sent-187, score-0.244]
69 Combined confidence measures for f7 and f23 alike are defined as the posterior probability of the bad disparity class. [sent-191, score-1.027]
70 Confidence measures, including decision forest results, are compared using the sparsification strategy: Pixels in disparity maps are successively removed, in the order of descending confidence measure values, until the disparity map is empty. [sent-192, score-1.498]
71 If the area under the resulting curve (AUC) is smaller than for competing confidence measures, it indicates that this measure is more accurate. [sent-194, score-0.437]
72 AUC values are normalized such that confidence measures discarding pixels randomly yield a value of 0. [sent-195, score-0.503]
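The sparsification strategy can be sketched as below. The example assumes that higher measure values mean a pixel is more likely bad (as for the posterior of the bad class), that the curve plots the fraction of bad pixels among those remaining, and that the area is computed with the trapezoidal rule over the removed fraction. The normalization mentioned in the text is not reproduced, and the function names are hypothetical.

```python
def sparsification_curve(bad_scores, is_error):
    """Error rate of the remaining pixels while removing pixels one by one.

    Pixels are removed in order of descending bad_scores. is_error holds
    1 for pixels whose disparity is wrong and 0 otherwise. The final
    point (empty map) is set to 0.0 by convention.
    """
    order = sorted(range(len(bad_scores)), key=lambda i: bad_scores[i], reverse=True)
    n = len(order)
    remaining_errors = sum(is_error)
    curve = [remaining_errors / n]  # nothing removed yet
    for removed, i in enumerate(order[:-1], start=1):
        remaining_errors -= is_error[i]
        curve.append(remaining_errors / (n - removed))
    curve.append(0.0)
    return curve


def area_under_curve(curve):
    """Trapezoidal area, with the removed fraction running from 0 to 1."""
    step = 1.0 / (len(curve) - 1)
    return sum((a + b) / 2.0 * step for a, b in zip(curve, curve[1:]))


if __name__ == "__main__":
    errors = [1, 0, 1, 0]
    good_measure = [0.9, 0.1, 0.8, 0.2]  # ranks the wrong pixels first
    poor_measure = [0.1, 0.9, 0.2, 0.8]  # ranks the wrong pixels last
    print(area_under_curve(sparsification_curve(good_measure, errors)))
    print(area_under_curve(sparsification_curve(poor_measure, errors)))
```

In this toy case the measure that ranks erroneous pixels first produces the smaller area, matching the interpretation given in the text.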
73 Results. Area under the curve (AUC) values for the proposed RDF23 confidence measure indicate superior accuracy compared to the best performing of all single confidence measures on 193 out of 194 frames on the KITTI dataset, see Fig. [sent-198, score-1.005]
74 In the presence of frequent gross stereo errors which are generally detected well by all features including the semi global energy feature proposed, the RDF23 results still show a slight improvement, see Fig. [sent-203, score-0.668]
75 Even if a single contributing confidence measure fails (see Fig. [sent-205, score-0.512]
76 Outstanding accuracy gains from RDF23 results are not achieved if the confidence feature set is reduced to such Figure 3. [sent-207, score-0.365]
77 Area under curve measures of our result when the feature set is reduced to information from disparity maps and image intensities. [sent-208, score-0.585]
78 Again, we compare to best performing single confidence measures. [sent-209, score-0.395]
79 variables that can be obtained solely from disparity maps and image intensities, assuming the stereo algorithm be a black box (see Fig. [sent-213, score-0.865]
80 In RDF23 estimation, disparity variance, perturbation, peak ratio and left-right difference have the largest contribution according to Gini importance in decision forest estimation (see Tab. [sent-216, score-0.739]
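Gini importance is commonly computed by summing, for each feature, the decrease in Gini impurity over all splits that use it. The sketch below shows that decrease for a single split; it is a generic illustration with hypothetical function names, not the exact computation of the library used in the paper.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(cls) / n) ** 2 for cls in set(labels))


def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = len(parent)
    weighted_children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted_children


if __name__ == "__main__":
    parent = ["good", "good", "bad", "bad"]
    print(gini_decrease(parent, ["good", "good"], ["bad", "bad"]))
```

Accumulating these decreases per feature over all nodes and trees gives a ranking of the kind reported for disparity variance, perturbation, peak ratio and left-right difference.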
81 In the reduced feature set f7, Gini importance is highest for the disparity variance variable as well (see Tab. [sent-218, score-0.565]
82 Note, however, that stereo estimates are almost perfect for this frame. [sent-234, score-0.472]
83 despite the most important variable according to the Gini measure in both feature spaces being disparity variance. [sent-236, score-0.496]
84 The perturbation measure attracting higher variable importance on a smaller scale suggests that confidence may be more appropriate to be looked upon at superpixel level. [sent-237, score-0.579]
85 KITTI Frame 123, resulting in a significant amount of SGM stereo errors (approx. [sent-241, score-0.497]
86 30 percent), results in all confidence measures responding well. [sent-242, score-0.503]
87 Though one of the contributing measures, SGM energy, fails on Frame 151, our method results in superior accuracy compared to all single measures over the entire sparsification range. [sent-245, score-0.354]
88 For the only instance on KITTI Frame 30, error rates of the stereo algorithm are very low. [sent-248, score-0.492]
89 Undefined stereo values due to occluded regions cannot be handled separately in this study, as corresponding ground truth data is not yet made public in KITTI. [sent-251, score-0.472]
90 Yet, separate evaluations, as done in stereo benchmarking, would be of interest. [sent-253, score-0.447]
91 Conclusion. We have demonstrated that learning a classifier on multivariate confidence measures is an appropriate approach to increase accuracy in stereo error detection if a suitable set of confidence features is selected. [sent-287, score-1.36]
92 In particular, variance based features on image intensities and matching results as previously applied to the optical flow problem are insufficient for consistently outperforming contributing confidence measures in stereo analysis. [sent-288, score-1.317]
93 Visualization of true positives (green), false positives (red), true negatives (blue) and false negatives (yellow) according to the denominations given in the plot of Fig. [sent-293, score-0.308]
94 flaws in the ground truth data [10] used here, advantages of the proposed method are larger where stereo is more challenging and hence produces more error prone results. [sent-296, score-0.517]
95 Yet, to shed light on this, new challenges for stereo need to be defined (and come with ground truth), beyond what is present in KITTI data. [sent-297, score-0.513]
96 This would help to shift attention to specific problems which need to be addressed before stereo vision systems can confidently be used in applications relevant to safety, such as driver assistance systems. [sent-299, score-0.447]
97 Quantitative evaluation of matching methods and validity measures for stereo vision. [sent-329, score-0.674]
98 A quantitative evaluation of confidence measures for stereo vision. [sent-381, score-0.95]
99 Binary confidence evaluation for a stereo vision based depth field processor SoC. [sent-419, score-0.84]
100 A simple stereo algorithm to recover precise object boundaries and smooth surfaces. [sent-425, score-0.447]
wordName wordTfidf (topN-words)
[('stereo', 0.447), ('disparity', 0.418), ('confidence', 0.365), ('kitti', 0.276), ('measures', 0.138), ('sgm', 0.137), ('zsad', 0.126), ('gehrig', 0.125), ('sparsification', 0.112), ('semi', 0.107), ('contributing', 0.104), ('peak', 0.095), ('perturbation', 0.092), ('costs', 0.091), ('matching', 0.089), ('aodha', 0.083), ('scharw', 0.076), ('forest', 0.076), ('flow', 0.074), ('gini', 0.072), ('bad', 0.069), ('variance', 0.068), ('decision', 0.066), ('disparities', 0.062), ('entropy', 0.058), ('census', 0.056), ('negatives', 0.054), ('undefined', 0.054), ('positives', 0.053), ('magnitudes', 0.053), ('auc', 0.053), ('workshops', 0.051), ('fattening', 0.051), ('featurescalepermutationgini', 0.051), ('motten', 0.051), ('vigra', 0.051), ('errors', 0.05), ('areas', 0.047), ('false', 0.047), ('proposals', 0.046), ('aggregation', 0.046), ('error', 0.045), ('iwr', 0.045), ('cofe', 0.045), ('achter', 0.045), ('finder', 0.045), ('haeusler', 0.045), ('importance', 0.044), ('measure', 0.043), ('ensembles', 0.042), ('ambiguity', 0.042), ('merrell', 0.042), ('hirschm', 0.042), ('challenges', 0.041), ('failure', 0.04), ('ratio', 0.04), ('editors', 0.04), ('energy', 0.038), ('lbe', 0.037), ('stratified', 0.037), ('alike', 0.037), ('intention', 0.037), ('automotive', 0.037), ('tree', 0.037), ('frame', 0.037), ('gradient', 0.037), ('bag', 0.036), ('winning', 0.036), ('frames', 0.035), ('discontinuities', 0.035), ('variable', 0.035), ('mismatches', 0.035), ('ide', 0.032), ('rescaling', 0.032), ('posteriors', 0.032), ('intensities', 0.032), ('prediction', 0.032), ('penalties', 0.032), ('amenable', 0.03), ('benchmarks', 0.03), ('performing', 0.03), ('lecture', 0.029), ('curve', 0.029), ('cost', 0.029), ('heidelberg', 0.029), ('motivate', 0.028), ('laser', 0.028), ('dq', 0.028), ('fail', 0.028), ('depth', 0.028), ('notes', 0.027), ('textureless', 0.026), ('benchmarking', 0.026), ('par', 0.026), ('gross', 0.026), ('repetitive', 0.026), ('conjunction', 0.026), ('ground', 0.025), 
('estimates', 0.025), ('emerging', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim of improving the accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census data term, which is the best performing real-time capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
2 0.50374079 155 cvpr-2013-Exploiting the Power of Stereo Confidences
Author: David Pfeiffer, Stefan Gehrig, Nicolai Schneider
Abstract: Applications based on stereo vision are becoming increasingly common, ranging from gaming over robotics to driver assistance. While stereo algorithms have been investigated heavily both on the pixel and the application level, far less attention has been dedicated to the use of stereo confidence cues. Mostly, a threshold is applied to the confidence values for further processing, which is essentially a sparsified disparity map. This is straightforward but it does not take full advantage of the available information. In this paper, we make full use of the stereo confidence cues by propagating all confidence values along with the measured disparities in a Bayesian manner. Before using this information, a mapping from confidence values to disparity outlier probability rate is performed based on gathered disparity statistics from labeled video data. We present an extension of the so called Stixel World, a generic 3D intermediate representation that can serve as input for many of the applications mentioned above. This scheme is modified to directly exploit stereo confidence cues in the underlying sensor model during a maximum a posteriori estimation process. The effectiveness of this step is verified in an in-depth evaluation on a large real-world traffic data base of which parts are made publicly available. We show that using stereo confidence cues allows both reducing the number of false object detections by a factor of six while keeping the detection rate at a near constant level.
3 0.28585866 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching
Author: Xing Mei, Xun Sun, Weiming Dong, Haitao Wang, Xiaopeng Zhang
Abstract: This paper presents a novel tree-based cost aggregation method for dense stereo matching. Instead of employing the minimum spanning tree (MST) and its variants, a new tree structure, "Segment-Tree", is proposed for non-local matching cost aggregation. Conceptually, the segment-tree is constructed in a three-step process: first, the pixels are grouped into a set of segments with the reference color or intensity image; second, a tree graph is created for each segment; and in the final step, these independent segment graphs are linked to form the segment-tree structure. In practice, this tree can be efficiently built in time nearly linear to the number of the image pixels. Compared to MST where the graph connectivity is determined with local edge weights, our method introduces some 'non-local' decision rules: the pixels in one perceptually consistent segment are more likely to share similar disparities, and therefore their connectivity within the segment should be first enforced in the tree construction process. The matching costs are then aggregated over the tree within two passes. Performance evaluation on 19 Middlebury data sets shows that the proposed method is comparable to previous state-of-the-art aggregation methods in disparity accuracy and processing speed. Furthermore, the tree structure can be refined with the estimated disparities, which leads to consistent scene segmentation and significantly better aggregation results.
4 0.27977824 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato
Abstract: In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. One of the problems of passive stereo method is the difficulty of finding correct correspondence between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. Two types of depth cues, defocus and disparity, are naturally integrated by the magnification and phase shift of a single point spread function (PSF) per camera. In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. We also show the outstanding performance of our method which has both advantages of two depth cues through simulation and actual experiments.
5 0.27389959 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields
Author: Bastian Goldluecke, Sven Wanner
Abstract: Unlike traditional images which do not offer information for different directions of incident light, a light field is defined on ray space, and implicitly encodes scene geometry data in a rich structure which becomes visible on its epipolar plane images. In this work, we analyze regularization of light fields in variational frameworks and show that their variational structure is induced by disparity, which is in this context best understood as a vector field on epipolar plane image space. We derive differential constraints on this vector field to enable consistent disparity map regularization. Furthermore, we show how the disparity field is related to the regularization of more general vector-valued functions on the 4D ray space of the light field. This way, we derive an efficient variational framework with convex priors, which can serve as a fundament for a large class of inverse problems on ray space.
6 0.2157006 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation
7 0.18614325 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues
9 0.16966406 219 cvpr-2013-In Defense of 3D-Label Stereo
10 0.13705236 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs
11 0.12994993 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
12 0.12988631 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes
13 0.12243173 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
14 0.11359098 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
15 0.099348396 117 cvpr-2013-Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera
16 0.097972594 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method
17 0.093871623 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
18 0.093368702 115 cvpr-2013-Depth Super Resolution by Rigid Body Self-Similarity in 3D
19 0.09137436 380 cvpr-2013-Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
20 0.079142317 311 cvpr-2013-Occlusion Patterns for Object Class Detection
topicId topicWeight
[(0, 0.203), (1, 0.144), (2, 0.031), (3, 0.022), (4, 0.036), (5, -0.065), (6, 0.002), (7, 0.028), (8, -0.004), (9, 0.082), (10, 0.007), (11, 0.069), (12, 0.235), (13, 0.033), (14, -0.121), (15, 0.103), (16, -0.222), (17, -0.262), (18, 0.15), (19, -0.065), (20, -0.142), (21, 0.112), (22, 0.263), (23, 0.203), (24, -0.113), (25, -0.096), (26, -0.008), (27, -0.088), (28, 0.057), (29, 0.015), (30, 0.004), (31, 0.077), (32, -0.167), (33, -0.004), (34, 0.013), (35, 0.054), (36, -0.042), (37, 0.024), (38, 0.03), (39, -0.026), (40, 0.078), (41, 0.038), (42, 0.037), (43, 0.025), (44, -0.048), (45, 0.078), (46, 0.032), (47, 0.017), (48, -0.038), (49, -0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.96542436 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim of improving the accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census data term, which is the best performing real-time capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
2 0.95722139 155 cvpr-2013-Exploiting the Power of Stereo Confidences
Author: David Pfeiffer, Stefan Gehrig, Nicolai Schneider
Abstract: Applications based on stereo vision are becoming increasingly common, ranging from gaming over robotics to driver assistance. While stereo algorithms have been investigated heavily both on the pixel and the application level, far less attention has been dedicated to the use of stereo confidence cues. Mostly, a threshold is applied to the confidence values for further processing, which is essentially a sparsified disparity map. This is straightforward but it does not take full advantage of the available information. In this paper, we make full use of the stereo confidence cues by propagating all confidence values along with the measured disparities in a Bayesian manner. Before using this information, a mapping from confidence values to disparity outlier probability rate is performed based on gathered disparity statistics from labeled video data. We present an extension of the so called Stixel World, a generic 3D intermediate representation that can serve as input for many of the applications mentioned above. This scheme is modified to directly exploit stereo confidence cues in the underlying sensor model during a maximum a posteriori estimation process. The effectiveness of this step is verified in an in-depth evaluation on a large real-world traffic data base of which parts are made publicly available. We show that using stereo confidence cues allows both reducing the number of false object detections by a factor of six while keeping the detection rate at a near constant level.
3 0.82422429 384 cvpr-2013-Segment-Tree Based Cost Aggregation for Stereo Matching
Author: Xing Mei, Xun Sun, Weiming Dong, Haitao Wang, Xiaopeng Zhang
Abstract: This paper presents a novel tree-based cost aggregation method for dense stereo matching. Instead of employing the minimum spanning tree (MST) and its variants, a new tree structure, "Segment-Tree", is proposed for non-local matching cost aggregation. Conceptually, the segment-tree is constructed in a three-step process: first, the pixels are grouped into a set of segments with the reference color or intensity image; second, a tree graph is created for each segment; and in the final step, these independent segment graphs are linked to form the segment-tree structure. In practice, this tree can be efficiently built in time nearly linear to the number of the image pixels. Compared to MST where the graph connectivity is determined with local edge weights, our method introduces some 'non-local' decision rules: the pixels in one perceptually consistent segment are more likely to share similar disparities, and therefore their connectivity within the segment should be first enforced in the tree construction process. The matching costs are then aggregated over the tree within two passes. Performance evaluation on 19 Middlebury data sets shows that the proposed method is comparable to previous state-of-the-art aggregation methods in disparity accuracy and processing speed. Furthermore, the tree structure can be refined with the estimated disparities, which leads to consistent scene segmentation and significantly better aggregation results.
4 0.73120379 181 cvpr-2013-Fusing Depth from Defocus and Stereo with Coded Apertures
Author: Yuichi Takeda, Shinsaku Hiura, Kosuke Sato
Abstract: In this paper we propose a novel depth measurement method by fusing depth from defocus (DFD) and stereo. One of the problems of passive stereo method is the difficulty of finding correct correspondence between images when an object has a repetitive pattern or edges parallel to the epipolar line. On the other hand, the accuracy of DFD method is inherently limited by the effective diameter of the lens. Therefore, we propose the fusion of stereo method and DFD by giving different focus distances for left and right cameras of a stereo camera with coded apertures. Two types of depth cues, defocus and disparity, are naturally integrated by the magnification and phase shift of a single point spread function (PSF) per camera. In this paper we give the proof of the proportional relationship between the diameter of defocus and disparity which makes the calibration easy. We also show the outstanding performance of our method which has both advantages of two depth cues through simulation and actual experiments.
5 0.6910792 219 cvpr-2013-In Defense of 3D-Label Stereo
Author: Carl Olsson, Johannes Ulén, Yuri Boykov
Abstract: It is commonly believed that higher order smoothness should be modeled using higher order interactions. For example, 2nd order derivatives for deformable (active) contours are represented by triple cliques. Similarly, the 2nd order regularization methods in stereo predominantly use MRF models with scalar (1D) disparity labels and triple clique interactions. In this paper we advocate a largely overlooked alternative approach to stereo where 2nd order surface smoothness is represented by pairwise interactions with 3D-labels, e.g. tangent planes. This general paradigm has been criticized due to perceived computational complexity of optimization in higher-dimensional label space. Contrary to popular beliefs, we demonstrate that representing 2nd order surface smoothness with 3D labels leads to simpler optimization problems with (nearly) submodular pairwise interactions. Our theoretical and experimental results demonstrate advantages over state-of-the-art methods for 2nd order smoothness stereo.
7 0.62266195 431 cvpr-2013-The Variational Structure of Disparity and Regularization of 4D Light Fields
8 0.54747874 362 cvpr-2013-Robust Monocular Epipolar Flow Estimation
9 0.48336944 188 cvpr-2013-Globally Consistent Multi-label Assignment on the Ray Space of 4D Light Fields
10 0.47382429 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs
11 0.4126229 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method
12 0.40937996 345 cvpr-2013-Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues
13 0.3872802 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization
14 0.36392063 21 cvpr-2013-A New Perspective on Uncalibrated Photometric Stereo
15 0.32704541 112 cvpr-2013-Dense Segmentation-Aware Descriptors
16 0.3187134 128 cvpr-2013-Discrete MRF Inference of Marginal Densities for Non-uniformly Discretized Variable Space
17 0.318459 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration
18 0.30724868 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
19 0.30233639 283 cvpr-2013-Megastereo: Constructing High-Resolution Stereo Panoramas
20 0.29231215 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
topicId topicWeight
[(10, 0.137), (16, 0.047), (26, 0.063), (28, 0.015), (33, 0.227), (67, 0.052), (69, 0.043), (87, 0.143), (97, 0.184)]
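The page does not state how simValue is derived from the topic weights above. As a rough sketch only, cosine similarity is one common way to compare sparse topic vectors of this form; the function name `cosine_similarity` and the second vector `other_topics` are hypothetical and chosen purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as (topic_id, weight) pairs.

    Assumption: this is an illustrative measure, not necessarily the one
    used to produce the simValue column on this page.
    """
    wa, wb = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * wb[t] for t, w in wa.items() if t in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Topic weights exactly as listed for this paper.
paper_topics = [(10, 0.137), (16, 0.047), (26, 0.063), (28, 0.015), (33, 0.227),
                (67, 0.052), (69, 0.043), (87, 0.143), (97, 0.184)]

# Hypothetical topic vector for some other paper (made-up values).
other_topics = [(10, 0.2), (33, 0.1), (50, 0.3)]

similarity = cosine_similarity(paper_topics, other_topics)
```

Topics that appear in only one of the two vectors add to that vector's norm but not to the dot product, so papers that share few topics score close to zero.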
simIndex simValue paperId paperTitle
same-paper 1 0.85266584 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
Author: Ralf Haeusler, Rahul Nair, Daniel Kondermann
Abstract: With the aim to improve accuracy of stereo confidence measures, we apply the random decision forest framework to a large set of diverse stereo confidence measures. Learning and testing sets were drawn from the recently introduced KITTI dataset, which currently poses higher challenges to stereo solvers than other benchmarks with ground truth for stereo evaluation. We experiment with semi global matching stereo (SGM) and a census dataterm, which is the best performing realtime capable stereo method known to date. On KITTI images, SGM still produces a significant amount of error. We obtain consistently improved area under curve values of sparsification measures in comparison to best performing single stereo confidence measures where numbers of stereo errors are large. More specifically, our method performs best in all but one out of 194 frames of the KITTI dataset.
2 0.83212841 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
Author: Martin Hofmann, Daniel Wolf, Gerhard Rigoll
Abstract: We generalize the network flow formulation for multi-object tracking to multi-camera setups. In the past, reconstruction of multi-camera data was done as a separate extension. In this work, we present a combined maximum a posteriori (MAP) formulation, which jointly models multi-camera reconstruction as well as global temporal data association. A flow graph is constructed, which tracks objects in 3D world space. The multi-camera reconstruction can be efficiently incorporated as additional constraints on the flow graph without making the graph unnecessarily large. The final graph is efficiently solved using binary linear programming. On the PETS 2009 dataset we achieve results that significantly exceed the current state of the art.
Author: Jia Xu, Maxwell D. Collins, Vikas Singh
Abstract: We study the problem of interactive segmentation and contour completion for multiple objects. The form of constraints our model incorporates are those coming from user scribbles (interior or exterior constraints) as well as information regarding the topology of the 2-D space after partitioning (number of closed contours desired). We discuss how concepts from discrete calculus and a simple identity using the Euler characteristic of a planar graph can be utilized to derive a practical algorithm for this problem. We also present specialized branch and bound methods for the case of single contour completion under such constraints. On an extensive dataset of ∼ 1000 images, our experiments suggest that a small amount of side knowledge can give strong improvements over fully unsupervised contour completion methods. We show that by interpreting user indications topologically, user effort is substantially reduced.
4 0.82894349 298 cvpr-2013-Multi-scale Curve Detection on Surfaces
Author: Michael Kolomenkin, Ilan Shimshoni, Ayellet Tal
Abstract: This paper extends to surfaces the multi-scale approach of edge detection on images. The common practice for detecting curves on surfaces requires the user to first select the scale of the features, apply an appropriate smoothing, and detect the edges on the smoothed surface. This approach suffers from two drawbacks. First, it relies on a hidden assumption that all the features on the surface are of the same scale. Second, manual user intervention is required. In this paper, we propose a general framework for automatically detecting the optimal scale for each point on the surface. We smooth the surface at each point according to this optimal scale and run the curve detection algorithm on the resulting surface. Our multi-scale algorithm solves the two disadvantages of the single-scale approach mentioned above. We demonstrate how to realize our approach on two commonly-used special cases: ridges & valleys and relief edges. In each case, the optimal scale is found in accordance with the mathematical definition of the curve.
5 0.82886767 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
Author: Horst Possegger, Sabine Sternig, Thomas Mauthner, Peter M. Roth, Horst Bischof
Abstract: Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume exploiting the full geometry and the objects' center of mass and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local mass density scores within a particle filter based approach, constrained by a Voronoi partitioning between nearby trackers. Our method benefits from the geometric knowledge given by the occupancy volume to robustly extract features and train classifiers on-demand, when volumetric information becomes unreliable. We evaluate our approach on several challenging real-world scenarios including the public APIDIS dataset. Experimental evaluations demonstrate significant improvements compared to state-of-the-art methods, while achieving real-time performance.
6 0.82854557 408 cvpr-2013-Spatiotemporal Deformable Part Models for Action Detection
7 0.82683945 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display
8 0.82682168 39 cvpr-2013-Alternating Decision Forests
9 0.82335287 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
10 0.82155526 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
11 0.82075942 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments
12 0.81963366 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization
13 0.8186878 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
14 0.81761003 155 cvpr-2013-Exploiting the Power of Stereo Confidences
15 0.81735849 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems
16 0.81700546 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis
17 0.81690502 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
18 0.81409836 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
19 0.81306535 279 cvpr-2013-Manhattan Scene Understanding via XSlit Imaging
20 0.8129186 303 cvpr-2013-Multi-view Photometric Stereo with Spatially Varying Isotropic Materials
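The abstract of paper 147 (Ensemble Learning for Confidence Measures in Stereo Vision) evaluates confidence measures by the area under a sparsification curve. The abstract does not spell out the exact construction, so the following is a minimal sketch of one common formulation, under stated assumptions: pixels are removed in order of increasing confidence, the error rate of the remaining pixels is recorded after each removal, and the resulting curve is integrated with the trapezoid rule. The function name `sparsification_auc` and the toy inputs are hypothetical.

```python
def sparsification_auc(confidence, is_error):
    """Area under a sparsification curve (lower is better).

    confidence: sequence of floats, higher means more trustworthy.
    is_error:   sequence of bools, True where the disparity is wrong.

    Assumption: the curve plots the error rate of the kept pixels against
    the fraction of pixels removed; the paper's exact definition may differ.
    """
    n = len(confidence)
    if n == 0 or n != len(is_error):
        raise ValueError("inputs must be non-empty and of equal length")

    # Rank pixels from most to least confident (stable for ties).
    order = sorted(range(n), key=lambda i: -confidence[i])

    # Keeping the k most confident pixels corresponds to removing 1 - k/n.
    points = []
    wrong = 0
    for k, idx in enumerate(order, start=1):
        wrong += 1 if is_error[idx] else 0
        points.append((1.0 - k / n, wrong / k))
    points.reverse()  # ascending fraction removed

    # Trapezoid rule over the (fraction removed, error rate) points.
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc


# Toy data: one wrong pixel out of four.
errors = [False, False, False, True]
good_confidence = [0.9, 0.8, 0.7, 0.1]  # the wrong pixel is least trusted
bad_confidence = [0.1, 0.8, 0.7, 0.9]   # the wrong pixel is most trusted

auc_good = sparsification_auc(good_confidence, errors)
auc_bad = sparsification_auc(bad_confidence, errors)
```

A confidence measure that ranks erroneous pixels last removes them first, so its curve drops quickly and its area is smaller; this is the sense in which the abstract reports improved area under curve values.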