cvpr cvpr2013 cvpr2013-373 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Victor Fragoso, Matthew Turk
Abstract: We present SWIGS, a Swift and efficient Guided Sampling method for robust model estimation from image feature correspondences. Our method leverages the accuracy of our new confidence measure (MR-Rayleigh), which assigns a correctness-confidence to a putative correspondence in an online fashion. MR-Rayleigh is inspired by Meta-Recognition (MR), an algorithm that aims to predict when a classifier’s outcome is correct. We demonstrate that by using a Rayleigh distribution, the prediction accuracy of MR can be improved considerably. Our experiments show that MR-Rayleigh tends to predict better than the often-used Lowe’s ratio, Brown’s ratio, and the standard MR under a range of imaging conditions. Furthermore, our homography estimation experiment demonstrates that SWIGS performs similarly to or better than other guided sampling methods while requiring fewer iterations, leading to fast and accurate model estimates.
Reference: text
sentIndex sentText sentNum sentScore
1 Our method leverages the accuracy of our new confidence measure (MR-Rayleigh), which assigns a correctness-confidence to a putative correspondence in an online fashion. [sent-4, score-0.38]
2 Furthermore, our homography estimation experiment demonstrates that SWIGS performs similarly or better than other guided sampling methods while requiring fewer iterations, leading to fast and accurate model estimates. [sent-8, score-0.356]
3 Introduction Many computer vision tasks have to deal with incorrect image feature correspondences to estimate various types of models, such as homography, camera matrix, and others. [sent-10, score-0.247]
4 , geometrical and matching information) to compute a set of confidences that are used to select image feature correspondences for generating models. [sent-18, score-0.355]
5 In this work, we focus on exploiting the matching scores to compute these confidences. [sent-19, score-0.257]
6 Assigning a confidence on the fly can be important for applications where environmental conditions can drastically change the matching scores distributions for correct and incorrect image feature correspondences, or when real-time performance is desired. [sent-48, score-0.827]
7 Moreover, we show that MR-Rayleigh can be used to predict correct image feature correspondences more accurately than Lowe’s ratio [14], Brown’s ratio [3], and Meta-Recognition [18] under a variety of imaging conditions. [sent-49, score-0.571]
8 Predicting when an image feature correspondence is correct can be extremely beneficial in image-based localization [12, 13, 16], where the prediction is used to keep “good” matches, and in other applications. [sent-50, score-0.33]
9 MR-Rayleigh: A confidence metric that allows more accurate predictions of correct matches and enables efficient and quicker guided sampling, under the assumption that every correspondence has its own correct and incorrect matching-score distributions. [sent-53, score-1.425]
10 SWIGS: A fast and efficient guided sampling process for robust model estimation based on MR-Rayleigh confidences that only has a single parameter to tune and does not need an offline stage. [sent-55, score-0.386]
11 Previous work There exists a rich literature about computing weights or confidences to bias the selection of image feature correspondences to generate models in a robust model estimation process. [sent-57, score-0.246]
12 These approaches in general exploit prior information such as matching scores ([2, 11, 19]) or geometrical cues ([5, 17]) to compute these weights. [sent-58, score-0.257]
13 1 we show an overview of the main loop in a robust model estimation, where the confidences or weights are used to select feature correspondences and generate hypotheses. [sent-60, score-0.246]
14 In this section, we review the approaches that use matching scores as priors to compute a sampling strategy, as well as methods for predicting correct correspondences. [sent-62, score-0.476]
15 Prediction of correct matches Lowe’s ratio [14] has been one of the most efficient and widely used heuristics for predicting the correctness of a putative correspondence. [sent-66, score-0.771]
16 The ratio compares the first nearest neighbor matching score against the second nearest neighbor matching score. [sent-67, score-0.39]
17 This ratio exploits the fact that correct matching scores tend to be distant from the incorrect matching scores, consequently producing lower values (assuming a distance-based matching score). [sent-68, score-0.9]
18 [3] extend Lowe’s ratio by comparing the first nearest neighbor matching score against the average of the second nearest neighbor matching scores of multiple correspondences. [sent-71, score-0.538]
19 A more elaborate method for predicting correct matches was introduced by Cech et al. [sent-73, score-0.481]
20 We show that using the closest matching scores to the nearest-neighbor score can reveal useful information about the correctness of a putative correspondence, which boosts the prediction performance considerably. [sent-78, score-0.622]
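The ratio heuristic described in this section can be sketched in a few lines. This is a minimal illustration that assumes distance-based matching scores (lower is better); the function names and the default threshold are hypothetical and are not taken from the paper or from any library.

```python
def lowe_ratio(scores):
    """Ratio of the best to the second-best matching score.

    Assumes distance-based scores, so a low ratio means the nearest
    neighbor is clearly separated from the runner-up.
    """
    if len(scores) < 2:
        raise ValueError("need at least two matching scores")
    first, second = sorted(scores)[:2]
    if second == 0.0:
        # Both nearest neighbors are exact matches: maximally ambiguous.
        return 1.0
    return first / second


def is_predicted_correct(scores, threshold=0.8):
    # threshold is an illustrative assumption, not a value from the text.
    return lowe_ratio(scores) < threshold
```

A putative correspondence whose best score is far below its second-best score yields a small ratio and is predicted correct; two nearly equal scores yield a ratio near one and are rejected.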
21 Guided sampling using matching scores Tordoff and Murray [19] calculate the correctness confidence by considering the matching scores and the number of correspondences to which a feature was matched. [sent-83, score-0.643]
22 Then, the probability that a match is correct, given all the matching scores of its potential matches, is calculated and used for biasing the selection of matches that are more likely to be valid. [sent-85, score-0.555]
23 The BEEM’s global search mechanism [11] estimates the correct and incorrect correspondence distributions for a given pair of images by considering Lowe’s ratio as the random variable. [sent-86, score-0.596]
24 Subsequently, BEEM estimates the mixing parameter between the computed distributions, and calculates the correctness confidences using the distributions and the mixing parameter. [sent-88, score-0.274]
25 BEEM then assumes that the statistics of the matching scores are fixed for the given pair of images. [sent-89, score-0.257]
26 An important feature of this method, which is similar to our approach, is that it computes the confidences on the fly as it only requires the similarity scores of every feature. [sent-91, score-0.376]
27 Hence, BLOGS considers that the statistics of the matching scores are defined per correspondence. [sent-92, score-0.288]
28 In contrast with most of the previous approaches, except BLOGS, SWIGS assumes that every correspondence has its own correct and incorrect matching scores distributions. [sent-93, score-0.704]
29 To compute the confidence for every match, we exploit information from the tail of the incorrect matching score distribution. [sent-94, score-0.627]
30 Swift guided sampling In this section we describe the keypoint matching process used in SWIGS. [sent-97, score-0.244]
31 Given a query descriptor $q_i$ and a pool of reference descriptors $\{r_j\}_{j=1}^{n}$, a feature matcher decides the best putative correspondence following the nearest-neighbor rule. [sent-98, score-0.618]
32 In practice, the minimum matching score can belong to a correct or incorrect match due to several nuisances; e. [sent-105, score-0.557]
33 , a minimum matching score produced by an incorrect image correspondence can be obtained when the scene contains repetitive structures. [sent-107, score-0.491]
34 We can consider the sequence of matching scores {si,1, . [sent-108, score-0.257]
35 , si,n} for a single query descriptor qi as a sequence composed by scores generated independently from a correct matching-scores distribution Fc and an incorrect matching-scores distribution F c¯. [sent-111, score-0.683]
36 The correct score (if any) can be the second, third, or other ranked score in the sequence, and hence we must consider overlapping distributions. [sent-113, score-0.329]
37 The objective of MR is to predict the correctness of a classifier; in our context we are interested in knowing whether a putative match is likely to be correct or incorrect. [sent-117, score-0.458]
38 To achieve this objective, MR considers a ranked sequence of scores for a given query and selects the best ranked k scores s1:k (the k lowest scores). [sent-118, score-0.482]
39 Meta-Recognition’s goal is to classify s1 as correct or incorrect, and a threshold α corresponding to the crossover of Fc and F c¯ suffices for the task. [sent-127, score-0.249]
40 Under the assumption that Fc is predominantly to the left of F c¯, MR-Weibull uses W (the CCDF of the tail model) for testing whether s1 is an outlier, in which case it is classified as a correct match (see Fig. [sent-131, score-0.361]
41 Nevertheless, the tail-fitting process in MR-Weibull can be affected by correct matching scores that are present in s2:k causing a bad model of the tail W and affecting the prediction. [sent-133, score-0.569]
42 Moreover, R decays gradually as soon as the matching score approaches the region of the tail of F c¯. [sent-148, score-0.344]
43 Hence, MR-Rayleigh assigns a higher confidence to those matching scores that fall to the left of F c¯ and a lower confidence to those that fall near F c¯, in contrast with MR-Weibull, which assigns a confidence of one over the support of Fc (illustrated Fig. [sent-149, score-0.826]
44 Hence, MR-Weibull can assign a high confidence to scores corresponding to incorrect matches that fall near the distribution’s crossover, yielding false-alarms. [sent-151, score-0.712]
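The behavior described above (a confidence that is high when the lowest score sits well to the left of the incorrect-score tail and decays as it approaches that tail) can be sketched with a Rayleigh CCDF. This is a minimal sketch under stated assumptions: scores are distance-based, the scale is estimated with the standard Rayleigh maximum-likelihood estimator over the k−1 scores that follow the best one, and the function name and default k are hypothetical. It is not claimed to reproduce the authors' exact formulation.

```python
import math


def mr_rayleigh_confidence(scores, k=5):
    """Rayleigh-CCDF confidence for the lowest of a set of matching scores.

    Assumptions (not confirmed by the text): sigma^2 is the standard
    Rayleigh MLE, sum(x^2) / (2 n), computed over the k-1 scores ranked
    right after the best one; k=5 is an illustrative default.
    """
    ranked = sorted(scores)[:k]
    if len(ranked) < 2:
        raise ValueError("need at least two matching scores")
    best, tail = ranked[0], ranked[1:]
    sigma_sq = sum(s * s for s in tail) / (2.0 * len(tail))
    if sigma_sq == 0.0:
        # All closest scores are zero, so the best score is not distinctive.
        return 0.0
    # Rayleigh CCDF evaluated at the best score: exp(-s1^2 / (2 sigma^2)).
    return math.exp(-(best * best) / (2.0 * sigma_sq))
```

Because the scale is estimated per query from that query's own closest scores, no offline density estimation is needed in this sketch, which matches the on-the-fly character described in the text.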
45 Guided Sampling using MR-Rayleigh The main idea of guided sampling for model fitting from feature correspondences is to use the computed confidences $\{c_l\}_{l=1}^{N}$ of being a correct match, where l indicates the index of a putative correspondence. [sent-154, score-0.781]
46 (a) The lowest matching score is produced either by fc (the matching scores distribution of correct matches) or f c¯ (the matching scores distribution of incorrect matches). [sent-165, score-1.278]
47 (b) Meta-Recognition models the tail of f c¯ with a Weibull distribution w to then calculate a confidence using the CCDF and predict correctness (d). [sent-167, score-0.455]
48 MR-Rayleigh approximates the support of fc by computing a Rayleigh distribution r from data taken from the tail to calculate a confidence using the CCDF and predict correctness (d). [sent-168, score-0.557]
49 scores obtained from the query and reference images to calculate the required Lowe’s ratios. [sent-169, score-0.361]
50 (7) where cl is the confidence assigned to the l-th correspondence, ml is the best similarity score, and mlr and mlc are the two closest similarity scores obtained from a similarity matrix. [sent-178, score-0.468]
51 BLOGS assigns a higher confidence when the greatest similarity score is high and is distant from its closest scores, and the confidence is severely penalized when its closest scores are near the greatest similarity score. [sent-179, score-0.631]
52 We calculate the confidence of being a correct match using MR-Rayleigh (see Sec. [sent-180, score-0.406]
53 dence, and σˆ is calculated with its k−1 closest scores. SWIGS avoids any density estimation and/or an offline stage; instead, it calculates a confidence on the fly. [sent-183, score-0.332]
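One plausible way to turn per-correspondence confidences into a guided sampling step is to draw a minimal set without replacement, with probability proportional to confidence. The sketch below is an assumption about how such a step could look, not the authors' sampling scheme; the function name is hypothetical, and the set size of four reflects the usual minimal set for a homography.

```python
import random


def sample_minimal_set(confidences, size=4, rng=random):
    """Draw `size` distinct correspondence indices, biased by confidence."""
    if size > len(confidences):
        raise ValueError("not enough correspondences for a minimal set")
    remaining = list(range(len(confidences)))
    chosen = []
    for _ in range(size):
        weights = [max(confidences[i], 0.0) for i in remaining]
        if sum(weights) <= 0.0:
            # No confidence mass left: fall back to uniform sampling.
            pick = rng.choice(remaining)
        else:
            pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen
```

Each hypothesis in a robust estimation loop would be generated from one such draw, so correspondences with higher confidence are tested earlier and more often than under uniform sampling.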
54 Experiments To assess the performance of SWIGS, we present two experiments: correct matches prediction accuracy, and a guided sampling experiment for estimating homography from image feature correspondences. [sent-189, score-0.846]
55 We used OpenCV’s implementation of SIFT [14] and SURF [1] for describing the keypoints and we included (non-optimized) C++ code to calculate MR-Rayleigh, MR-Weibull, Brown’s ratio, and Lowe’s ratio into the brute-force feature matcher in OpenCV. [sent-199, score-0.265]
56 With these modifications, the matcher returns either a confidence (MR-Rayleigh or MR-Weibull) or a ratio (Lowe’s or Brown’s) for every putative correspondence. [sent-200, score-0.474]
57 We matched the reference keypoints (found in the reference image) against the query keypoints (detected per query image) for every sub-dataset. [sent-201, score-0.47]
58 We then identified the correct matches by evaluating the following statement ? [sent-202, score-0.432]
59 (9) where xq and xr are the query and reference keypoints, H is the homography transformation provided in the dataset that relates the reference and the query image, and ? [sent-205, score-0.436]
60 Those matches that did not comply were labeled as incorrect matches. [sent-207, score-0.392]
61 These identified correct correspondences were verified manually and used as our ground truth in our experiments. [sent-208, score-0.287]
62 Subsequently, we identified the correct correspondences in a similar manner as described earlier but using the affine transformations instead of the homographies in Eq. [sent-213, score-0.287]
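The ground-truth labeling described above can be sketched as a reprojection test. The exact statement and threshold are truncated in this text, so the sketch assumes that the reference keypoint is mapped through H and compared with the query keypoint under a pixel tolerance; the mapping direction, the name `eps`, and its default value are all assumptions.

```python
import math


def project(H, point):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0.0:
        raise ValueError("point maps to infinity under H")
    return (u / w, v / w)


def is_correct_match(H, x_ref, x_query, eps=5.0):
    # eps (in pixels) is an illustrative tolerance, not the paper's value.
    px, py = project(H, x_ref)
    return math.hypot(px - x_query[0], py - x_query[1]) < eps
```

Matches that fail the test would be labeled incorrect; for the affine sub-datasets the same check applies with an affine matrix whose last row is (0, 0, 1).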
63 Correct match prediction experiment In this experiment we are interested in measuring the performance of MR-Rayleigh on detecting correct matches; and we use the labeled correct correspondences as our ground truth. [sent-217, score-0.649]
64 We considered a True-Positive when the predictor accurately detected a correct match, and a False-Positive when the predictor inaccurately detected a positive match, i. [sent-218, score-0.277]
65 709 were good thresholds for SIFT and SURF matches respectively, while for Lowe’s ratio (LWR) we used the recommended threshold of τLWR = 0. [sent-230, score-0.379]
66 The top row corresponds to SIFT matches and the bottom row to SURF matches, and each column presents results for a different imaging condition; with the exception of the first column, which presents the results over all imaging conditions. [sent-234, score-0.351]
67 3 that MR-Rayleigh (MRR) outperformed MR-Weibull (MRW), Lowe’s ratio (LWR), and Brown’s ratio (BR) over all imaging conditions for SIFT and SURF matches. [sent-242, score-0.249]
68 Consequently, MR-Weibull can struggle in detecting correct matches when a low False-Positive rate is required. [sent-249, score-0.513]
69 Lowe’s ratio in general performs competitively for SIFT and SURF matches, whereas Brown’s ratio tends to perform competitively for SIFT matches but tends to fall short for SURF matches. [sent-251, score-0.568]
70 We also conducted an experiment on detecting correct matches per descriptor on the entire testing dataset using the thresholds found during our tuning stage. [sent-252, score-0.499]
71 From the results of this experiment we can conclude that Lowe’s ratio returned the lowest False-Positive rate (FPR) regardless of the descriptor. [sent-256, score-0.268]
72 MR-Weibull produced the highest True-Positive rate for SIFT matches but with the highest False-Positive rate, while MR-Rayleigh produced a high True-Positive rate and a low False-Positive rate. [sent-257, score-0.479]
73 For SURF matches MR-Rayleigh produced the highest True-Positive rate and a low False-Positive rate. [sent-258, score-0.364]
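The True-Positive and False-Positive rates reported above can be computed from thresholded predictions and ground-truth labels as in the following sketch. It assumes a confidence where higher means more likely correct (as with MR-Rayleigh or MR-Weibull); for a ratio such as Lowe’s, where lower is better, the comparison would be flipped. The function name is hypothetical.

```python
def tpr_fpr(confidences, labels, threshold):
    """Return (TPR, FPR) when predicting 'correct' for confidence >= threshold."""
    tp = fp = positives = negatives = 0
    for confidence, is_correct in zip(confidences, labels):
        predicted_correct = confidence >= threshold
        if is_correct:
            positives += 1
            tp += predicted_correct
        else:
            negatives += 1
            fp += predicted_correct
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return tpr, fpr
```

Sweeping the threshold over its range and collecting the resulting pairs traces out an ROC curve of the kind used to compare the predictors.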
74 Figure 3 (panels include LEUVEN and WALL; horizontal axis: False Positives rate). [sent-290, score-0.81]
75 ROC curves for evaluating correct matches predictions by using MR-Weibull (MRW), MR-Rayleigh (MRR), Lowe’s ratio (LWR), and Brown’s ratio (BR). [sent-291, score-0.63]
76 The top row presents the results for SIFT matches and bottom row for SURF matches. [sent-292, score-0.249]
77 Matlab was used to obtain the distributions required for the two Guided-Sampling [19] methods and to fit Weibull and Generalized Extreme Value distributions for correct and incorrect matches respectively (see Fig. [sent-298, score-0.673]
78 We implemented only the prior estimation stage of BEEM and BLOGS’ global search mechanism, as we aim at comparing the confidence mechanism used for data sampling in a robust estimation. [sent-300, score-0.27]
79 We used OpenCV’s findHomography function (without the RANSAC option) and the correct matches identified by each method to estimate the homography. [sent-301, score-0.432]
80 We executed the experiment 5000 times with a stopping criterion of 100% of correct matches found and a maximum of 1000 iterations, since we are interested in applications that have a limited budget of iterations; an iteration is a completion of the loop in Fig. [sent-302, score-0.468]
81 We report the median of the number of iterations a method took to find the best model within the required number of iterations and the median of the percentage of correct matches that the best model found considered as a correct match. [sent-304, score-0.781]
82 Fitted distributions for SIFT matches (left) and SURF matches (right) used in GEN. [sent-306, score-0.547]
83 The percentage of correct matches are presented in the first and third rows, while the iterations are in the second and fourth rows. [sent-312, score-0.523]
84 We can observe that SWIGS tends to require in general fewer iterations than the other methods (second and fourth rows) to find models that consider a comparable or higher percentage of correct matches within the allowed number of iterations (first and third row). [sent-315, score-0.622]
85 The GEN method struggles more to find models that consider a high percentage of correct matches in scenes with repetitive textures, e. [sent-317, score-0.475]
86 BLOGS and a random sampling (Uniform) method perform similarly in finding models that consider a high portion of the correct matches. [sent-325, score-0.279]
87 The experimental results presented in this section demonstrate that SWIGS can perform similarly or better in finding models that consider a good portion of correct matches in a dense matching scenario. [sent-326, score-0.541]
88 The experiments also show that SWIGS tends to require fewer iterations than the other guided sampling methods without sacrificing the number of correct matches found. [sent-327, score-0.627]
89 Moreover, this confirms that MR-Rayleigh confidences tend to identify good matches, and these confidences yield an efficient and accurate sampling strategy. [sent-328, score-0.38]
90 6 we show two different sets of correct image feature correspondences found with SWIGS and MLESAC. [sent-330, score-0.287]
91 Conclusions and future directions We have introduced MR-Rayleigh, a confidence measure based on Meta-Recognition (MR) [18] for predicting correct image feature correspondences more accurately. [sent-332, score-0.477]
92 Correct SIFT matches found between the reference image (top-right) and query image (top-left) of the Graf dataset and correct SURF matches between reference image (bottom-right) and query image (bottom-left) of the Boat dataset using SWIGS and MLESAC. [sent-334, score-0.715]
93 est matching scores produced by the matcher when comparing the query descriptor against the reference descriptors. [sent-335, score-0.565]
94 MR-Rayleigh assigns a higher confidence when the lowest matching score is closer to zero and gradually decays as it gets closer to the tail of the incorrect matching scores distribution. [sent-336, score-0.979]
95 Our experiments showed that MR-Rayleigh outperformed Lowe’s ratio, Brown’s ratio, and MR-Weibull in predicting correct matches across several image correspondences obtained in different imaging conditions. [sent-338, score-0.636]
96 This prediction is efficient to compute and can be useful in many applications such as image-based localization where only good matches are kept; in estimating the inlier-ratio, which can be used to estimate the maximum number of iterations in RANSAC, and others. [sent-339, score-0.361]
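The remark about using an estimated inlier ratio to bound RANSAC iterations can be made concrete with the standard bound N = log(1 − p) / log(1 − w^m), where w is the inlier ratio, m the minimal sample size, and p the desired probability of drawing at least one all-inlier sample. The sketch below assumes that w is taken as the fraction of matches predicted correct; the function name and defaults are hypothetical.

```python
import math


def max_ransac_iterations(inlier_ratio, sample_size=4, success_prob=0.99):
    """Iterations needed to draw one all-inlier sample with given probability."""
    if not 0.0 <= inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must lie in [0, 1]")
    if not 0.0 < success_prob < 1.0:
        raise ValueError("success_prob must lie in (0, 1)")
    all_inliers = inlier_ratio ** sample_size
    if all_inliers <= 0.0:
        return math.inf  # no inliers: no finite budget suffices
    if all_inliers >= 1.0:
        return 1  # every sample is all-inlier
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - all_inliers))
```

The bound grows quickly as the inlier ratio drops, which is why an accurate correctness prediction is useful when the iteration budget is limited.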
97 We also presented SWIGS, an efficient method to sample data in a guided manner for robust model fitting that exploits the confidence delivered by MR-Rayleigh. [sent-340, score-0.289]
98 , BEEM [11] and Guided-MLESAC [19]) that assume a correct or incorrect matching score distribution for a pair of images or for an entire dataset, SWIGS considers that every query feature has its own correct and incorrect matching-score distributions. [sent-343, score-1.291]
99 SWIGS then computes the confidence of every correspondence on the fly and uses these confidences for sampling matches to estimate a model such as a homography. [sent-344, score-0.803]
100 Our homography estimation experiment suggests that SWIGS achieves competitive or better results than BEEM’s and BLOGS’s [2] guided sampling mechanisms, and Tordoff and Murray’s guided MLESAC [19]. [sent-345, score-0.504]
wordName wordTfidf (topN-words)
[('swigs', 0.443), ('matches', 0.249), ('lowe', 0.237), ('beem', 0.19), ('blogs', 0.187), ('correct', 0.183), ('surf', 0.161), ('guided', 0.148), ('scores', 0.148), ('incorrect', 0.143), ('confidences', 0.142), ('confidence', 0.141), ('tail', 0.129), ('brown', 0.128), ('ccdf', 0.126), ('bikes', 0.112), ('matching', 0.109), ('putative', 0.108), ('fpr', 0.104), ('correspondences', 0.104), ('query', 0.103), ('fc', 0.102), ('ratio', 0.099), ('sampling', 0.096), ('matcher', 0.094), ('graf', 0.094), ('correspondence', 0.089), ('weibull', 0.087), ('mr', 0.086), ('mrweibull', 0.084), ('correctness', 0.083), ('rate', 0.081), ('reference', 0.077), ('homography', 0.076), ('tpr', 0.075), ('positives', 0.073), ('score', 0.073), ('sift', 0.072), ('boat', 0.071), ('wall', 0.067), ('br', 0.065), ('lwr', 0.063), ('swift', 0.063), ('rayleigh', 0.062), ('leuven', 0.06), ('prediction', 0.058), ('cl', 0.057), ('mlesac', 0.056), ('tordoff', 0.056), ('iterations', 0.054), ('fly', 0.054), ('false', 0.054), ('lowest', 0.052), ('imaging', 0.051), ('predicting', 0.049), ('distributions', 0.049), ('match', 0.049), ('roc', 0.048), ('predictor', 0.047), ('tends', 0.045), ('repetitive', 0.043), ('closest', 0.043), ('falsepositive', 0.042), ('fragoso', 0.042), ('goshen', 0.042), ('metarecognition', 0.042), ('mlc', 0.042), ('mrrayleigh', 0.042), ('mrw', 0.042), ('seitfro', 0.042), ('xl', 0.042), ('assigns', 0.042), ('epipolar', 0.04), ('keypoints', 0.039), ('qi', 0.038), ('ubc', 0.037), ('mlr', 0.037), ('mrr', 0.037), ('fourth', 0.037), ('opencv', 0.036), ('experiment', 0.036), ('predict', 0.035), ('crossover', 0.035), ('bark', 0.035), ('cech', 0.035), ('ransac', 0.034), ('distribution', 0.034), ('produced', 0.034), ('calculate', 0.033), ('mechanism', 0.033), ('decays', 0.033), ('spec', 0.033), ('every', 0.032), ('considers', 0.031), ('threshold', 0.031), ('fall', 0.031), ('tuning', 0.031), ('rows', 0.03), ('trees', 0.029), ('median', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method
2 0.14223564 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition
Author: Qiang Hao, Rui Cai, Zhiwei Li, Lei Zhang, Yanwei Pang, Feng Wu, Yong Rui
Abstract: 3D model-based object recognition has been a noticeable research trend in recent years. Common methods find 2D-to-3D correspondences and make recognition decisions by pose estimation, whose efficiency usually suffers from noisy correspondences caused by the increasing number of target objects. To overcome this scalability bottleneck, we propose an efficient 2D-to-3D correspondence filtering approach, which combines a light-weight neighborhood-based step with a finer-grained pairwise step to remove spurious correspondences based on 2D/3D geometric cues. On a dataset of 300 3D objects, our solution achieves ∼10 times speed improvement over the baseline, with a comparable recognition accuracy. A parallel implementation on a quad-core CPU can run at ∼3 fps for 1280×720 images.
3 0.12664673 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
Author: Petr Gronát, Guillaume Obozinski, Josef Sivic, Tomáš Pajdla
Abstract: The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as only few positive training examples are available for each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work.
4 0.12269121 254 cvpr-2013-Learning SURF Cascade for Fast and Accurate Object Detection
Author: Jianguo Li, Yimin Zhang
Abstract: This paper presents a novel learning framework for training boosting cascade based object detector from large scale dataset. The framework is derived from the well-known Viola-Jones (VJ) framework but distinguished by three key differences. First, the proposed framework adopts multi-dimensional SURF features instead of single dimensional Haar features to describe local patches. In this way, the number of used local patches can be reduced from hundreds of thousands to several hundreds. Second, it adopts logistic regression as weak classifier for each local patch instead of decision trees in the VJ framework. Third, we adopt AUC as a single criterion for the convergence test during cascade training rather than the two trade-off criteria (false-positive-rate and hit-rate) in the VJ framework. The benefit is that the false-positive-rate can be adaptive among different cascade stages, and thus yields much faster convergence speed of SURF cascade. Combining these points together, the proposed approach has three good properties. First, the boosting cascade can be trained very efficiently. Experiments show that the proposed approach can train object detectors from billions of negative samples within one hour even on personal computers. Second, the built detector is comparable to the state-of-the-art algorithm not only on the accuracy but also on the processing speed. Third, the built detector is small in model-size due to short cascade stages.
5 0.12182499 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition
Author: Song Cao, Noah Snavely
Abstract: Recognizing the location of a query image by matching it to a database is an important problem in computer vision, and one for which the representation of the database is a key issue. We explore new ways for exploiting the structure of a database by representing it as a graph, and show how the rich information embedded in a graph can improve a bag-of-words-based location recognition method. In particular, starting from a graph on a set of images based on visual connectivity, we propose a method for selecting a set of subgraphs and learning a local distance function for each using discriminative techniques. For a query image, each database image is ranked according to these local distance functions in order to place the image in the right part of the graph. In addition, we propose a probabilistic method for increasing the diversity of these ranked database images, again based on the structure of the image graph. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.
6 0.11436476 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
7 0.11075393 155 cvpr-2013-Exploiting the Power of Stereo Confidences
8 0.10724881 361 cvpr-2013-Robust Feature Matching with Alternate Hough and Inverted Hough Transforms
9 0.097972594 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
10 0.095476091 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
11 0.093697153 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching
12 0.084329829 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
13 0.083766043 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval
14 0.082619883 380 cvpr-2013-Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
15 0.081681274 325 cvpr-2013-Part Discovery from Partial Correspondence
16 0.077176295 99 cvpr-2013-Cross-View Image Geolocalization
17 0.073657423 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration
18 0.072925605 456 cvpr-2013-Visual Place Recognition with Repetitive Structures
19 0.072009481 378 cvpr-2013-Sampling Strategies for Real-Time Action Recognition
20 0.071085945 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs
topicId topicWeight
[(0, 0.164), (1, 0.019), (2, 0.007), (3, -0.002), (4, 0.06), (5, -0.0), (6, -0.018), (7, -0.034), (8, -0.019), (9, -0.015), (10, -0.022), (11, 0.055), (12, 0.109), (13, -0.011), (14, 0.001), (15, -0.157), (16, -0.002), (17, -0.038), (18, 0.092), (19, -0.047), (20, 0.042), (21, -0.028), (22, 0.016), (23, 0.078), (24, 0.021), (25, -0.11), (26, -0.03), (27, -0.004), (28, 0.053), (29, 0.073), (30, -0.01), (31, -0.016), (32, -0.086), (33, -0.037), (34, 0.057), (35, 0.028), (36, 0.042), (37, 0.023), (38, 0.001), (39, 0.047), (40, 0.022), (41, -0.054), (42, 0.001), (43, -0.003), (44, -0.05), (45, 0.021), (46, 0.109), (47, 0.003), (48, 0.041), (49, 0.051)]
simIndex simValue paperId paperTitle
same-paper 1 0.95945108 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method
2 0.8098861 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition
Author: Qiang Hao, Rui Cai, Zhiwei Li, Lei Zhang, Yanwei Pang, Feng Wu, Yong Rui
Abstract: 3D model-based object recognition has been a noticeable research trend in recent years. Common methods find 2D-to-3D correspondences and make recognition decisions by pose estimation, whose efficiency usually suffers from noisy correspondences caused by the increasing number of target objects. To overcome this scalability bottleneck, we propose an efficient 2D-to-3D correspondence filtering approach, which combines a light-weight neighborhood-based step with a finer-grained pairwise step to remove spurious correspondences based on 2D/3D geometric cues. On a dataset of 300 3D objects, our solution achieves ∼10 times speed improvement over the baseline, with a comparable recognition accuracy. A parallel implementation on a quad-core CPU can run at ∼3 fps for 1280×720 images.
3 0.7316618 361 cvpr-2013-Robust Feature Matching with Alternate Hough and Inverted Hough Transforms
Author: Hsin-Yi Chen, Yen-Yu Lin, Bing-Yu Chen
Abstract: We present an algorithm that carries out alternate Hough transform and inverted Hough transform to establish feature correspondences, and enhances the quality of matching in both precision and recall. Inspired by the fact that nearby features on the same object share coherent homographies in matching, we cast the task of feature matching as a density estimation problem in the Hough space spanned by the hypotheses of homographies. Specifically, we project all the correspondences into the Hough space, and determine the correctness of the correspondences by their respective densities. In this way, mutual verification of relevant correspondences is activated, and the precision of matching is boosted. On the other hand, we infer the concerted homographies propagated from the locally grouped features, and enrich the correspondence candidates for each feature. The recall is hence increased. The two processes are tightly coupled. Through iterative optimization, plausible enrichments are gradually revealed while more correct correspondences are detected. Promising experimental results on three benchmark datasets manifest the effectiveness of the proposed approach.
4 0.67142236 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval
Author: Danfeng Qin, Christian Wengert, Luc Van_Gool
Abstract: Many recent object retrieval systems rely on local features for describing an image. The similarity between a pair of images is measured by aggregating the similarity between their corresponding local features. In this paper we present a probabilistic framework for modeling the feature to feature similarity measure. We then derive a query adaptive distance which is appropriate for global similarity evaluation. Furthermore, we propose a function to score the individual contributions into an image to image similarity within the probabilistic framework. Experimental results show that our method improves the retrieval accuracy significantly and consistently. Moreover, our result compares favorably to the state-of-the-art.
5 0.66939557 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition
Author: Song Cao, Noah Snavely
Abstract: Recognizing the location of a query image by matching it to a database is an important problem in computer vision, and one for which the representation of the database is a key issue. We explore new ways for exploiting the structure of a database by representing it as a graph, and show how the rich information embedded in a graph can improve a bag-of-words-based location recognition method. In particular, starting from a graph on a set of images based on visual connectivity, we propose a method for selecting a set of subgraphs and learning a local distance function for each using discriminative techniques. For a query image, each database image is ranked according to these local distance functions in order to place the image in the right part of the graph. In addition, we propose a probabilistic method for increasing the diversity of these ranked database images, again based on the structure of the image graph. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.
6 0.64370918 456 cvpr-2013-Visual Place Recognition with Repetitive Structures
7 0.63667256 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition
8 0.62523091 99 cvpr-2013-Cross-View Image Geolocalization
9 0.60072488 234 cvpr-2013-Joint Spectral Correspondence for Disparate Image Matching
10 0.5958342 240 cvpr-2013-Keypoints from Symmetries by Wave Propagation
11 0.57857686 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration
12 0.56122041 162 cvpr-2013-FasT-Match: Fast Affine Template Matching
13 0.55587173 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
14 0.55276847 47 cvpr-2013-As-Projective-As-Possible Image Stitching with Moving DLT
15 0.54582077 352 cvpr-2013-Recovering Stereo Pairs from Anaglyphs
16 0.53806746 290 cvpr-2013-Motion Estimation for Self-Driving Cars with a Generalized Camera
17 0.53090858 317 cvpr-2013-Optimal Geometric Fitting under the Truncated L2-Norm
18 0.52238679 74 cvpr-2013-CLAM: Coupled Localization and Mapping with Efficient Outlier Handling
19 0.51175874 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
20 0.50983059 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization
topicId topicWeight
[(10, 0.086), (16, 0.048), (26, 0.04), (28, 0.012), (33, 0.27), (67, 0.053), (69, 0.062), (74, 0.214), (87, 0.13)]
simIndex simValue paperId paperTitle
same-paper 1 0.8715719 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method
Author: Victor Fragoso, Matthew Turk
Abstract: We present SWIGS, a Swift and efficient Guided Sampling method for robust model estimation from image feature correspondences. Our method leverages the accuracy of our new confidence measure (MR-Rayleigh), which assigns a correctness-confidence to a putative correspondence in an online fashion. MR-Rayleigh is inspired by Meta-Recognition (MR), an algorithm that aims to predict when a classifier’s outcome is correct. We demonstrate that by using a Rayleigh distribution, the prediction accuracy of MR can be improved considerably. Our experiments show that MR-Rayleigh tends to predict better than the often-used Lowe’s ratio, Brown’s ratio, and the standard MR under a range of imaging conditions. Furthermore, our homography estimation experiment demonstrates that SWIGS performs similarly or better than other guided sampling methods while requiring fewer iterations, leading to fast and accurate model estimates.
2 0.86759192 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
Author: Bo Zheng, Yibiao Zhao, Joey C. Yu, Katsushi Ikeuchi, Song-Chun Zhu
Abstract: In this paper, we present an approach for scene understanding by reasoning physical stability of objects from point cloud. We utilize a simple observation that, by human design, objects in static scenes should be stable with respect to gravity. This assumption is applicable to all scene categories and poses useful constraints for the plausible interpretations (parses) in scene understanding. Our method consists of two major steps: 1) geometric reasoning: recovering solid 3D volumetric primitives from defective point cloud; and 2) physical reasoning: grouping the unstable primitives to physically stable objects by optimizing the stability and the scene prior. We propose to use a novel disconnectivity graph (DG) to represent the energy landscape and use a Swendsen-Wang Cut (MCMC) method for optimization. In experiments, we demonstrate that the algorithm achieves substantially better performance for i) object segmentation, ii) 3D volumetric recovery of the scene, and iii) better parsing result for scene understanding in comparison to state-of-the-art methods in both public dataset and our own new dataset.
Author: Pradipto Das, Chenliang Xu, Richard F. Doell, Jason J. Corso
Abstract: The problem of describing images through natural language has gained importance in the computer vision community. Solutions to image description have either focused on a top-down approach of generating language through combinations of object detections and language models or bottom-up propagation of keyword tags from training images to test images through probabilistic or nearest neighbor techniques. In contrast, describing videos with natural language is a less studied problem. In this paper, we combine ideas from the bottom-up and top-down approaches to image description and propose a method for video description that captures the most relevant contents of a video in a natural language description. We propose a hybrid system consisting of a low level multimodal latent topic model for initial keyword annotation, a middle level of concept detectors and a high level module to produce final lingual descriptions. We compare the results of our system to human descriptions in both short and long forms on two datasets, and demonstrate that final system output has greater agreement with the human descriptions than any single level.
4 0.83993191 269 cvpr-2013-Light Field Distortion Feature for Transparent Object Recognition
Author: Kazuki Maeno, Hajime Nagahara, Atsushi Shimada, Rin-Ichiro Taniguchi
Abstract: Current object-recognition algorithms use local features, such as scale-invariant feature transform (SIFT) and speeded-up robust features (SURF), for visually learning to recognize objects. These approaches, though, cannot apply to transparent objects made of glass or plastic, as such objects take on the visual features of background objects, and the appearance of such objects dramatically varies with changes in scene background. Indeed, in transmitting light, transparent objects have the unique characteristic of distorting the background by refraction. In this paper, we use a single-shot light field image as an input and model the distortion of the light field caused by the refractive property of a transparent object. We propose a new feature, called the light field distortion (LFD) feature, for identifying a transparent object. The proposal incorporates this LFD feature into the bag-of-features approach for recognizing transparent objects. We evaluated its performance in laboratory and real settings.
Author: Jia Xu, Maxwell D. Collins, Vikas Singh
Abstract: We study the problem of interactive segmentation and contour completion for multiple objects. The form of constraints our model incorporates are those coming from user scribbles (interior or exterior constraints) as well as information regarding the topology of the 2-D space after partitioning (number of closed contours desired). We discuss how concepts from discrete calculus and a simple identity using the Euler characteristic of a planar graph can be utilized to derive a practical algorithm for this problem. We also present specialized branch and bound methods for the case of single contour completion under such constraints. On an extensive dataset of ∼1000 images, our experiments suggest that a small amount of side knowledge can give strong improvements over fully unsupervised contour completion methods. We show that by interpreting user indications topologically, user effort is substantially reduced.
6 0.82114965 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
7 0.82002586 155 cvpr-2013-Exploiting the Power of Stereo Confidences
8 0.81600666 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
9 0.81548768 107 cvpr-2013-Deformable Spatial Pyramid Matching for Fast Dense Correspondences
10 0.81524855 337 cvpr-2013-Principal Observation Ray Calibration for Tiled-Lens-Array Integral Imaging Display
11 0.81504083 443 cvpr-2013-Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances
12 0.8146596 147 cvpr-2013-Ensemble Learning for Confidence Measures in Stereo Vision
13 0.8141275 396 cvpr-2013-Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback
14 0.81209654 39 cvpr-2013-Alternating Decision Forests
15 0.81183255 72 cvpr-2013-Boundary Detection Benchmarking: Beyond F-Measures
16 0.81161481 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
17 0.81108534 298 cvpr-2013-Multi-scale Curve Detection on Surfaces
18 0.8109237 209 cvpr-2013-Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking
19 0.81075072 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments
20 0.80972862 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data