cvpr cvpr2013 cvpr2013-260 knowledge-graph by maker-knowledge-mining

260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition


Source: pdf

Author: Petr Gronát, Guillaume Obozinski, Josef Sivic, Tomáš Pajdla

Abstract: The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as only few positive training examples are available for each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work. Affiliations: (2) Center for Machine Perception, Faculty of Electrical Engineering; (3) WILLOW project, Laboratoire d'Informatique de l'École Normale Supérieure, ENS/INRIA/CNRS UMR 8548; (4) Université Paris-Est, LIGM (UMR CNRS 8049), Center for Visual Computing, Ecole des Ponts - ParisTech, 77455 Marne-la-Vallée, France.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Learning and calibrating per-location classifiers for visual place recognition. Petr Gronát (1,2,3), Guillaume Obozinski (4), Josef Sivic (1,3), Tomáš Pajdla (2). (1) INRIA, (2) Czech Technical University in Prague, (4) Ecole des Ponts ParisTech (figure caption fragment: geotagged image database (right)). [sent-1, score-0.812]

2 We cast the problem as a classification task and learn a classifier for each location in the database. [sent-2, score-0.239]

3 We develop a non-parametric procedure to calibrate the outputs of the large number of per-location classifiers without the need for additional positive training data. [sent-3, score-0.32]

4 Abstract The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. [sent-4, score-0.846]

5 First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. [sent-7, score-0.765]

6 Second, as only few positive training examples are available for each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. [sent-8, score-0.481]

7 The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. [sent-9, score-0.405]

8 Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work. [sent-10, score-0.581]

9 Introduction Visual place recognition [7, 13, 27] is a challenging task as the query and database images may depict the same 3D structure (e. [sent-13, score-0.806]

10 In addition, the geotagged database may be very large. [sent-16, score-0.334]

11 Similar to other work in large scale place recognition [7, 13, 27] and image retrieval [20, 21, 28], we build on the bag-of-visual-words representation [6, 28] and describe each image by a set of quantized local invariant features, such as SURF [1] or SIFT [17]. [sent-18, score-0.308]

12 The vectors are usually normalized to have unit L2 norm and the similarity between the query and a database vector is then measured by their dot product. [sent-20, score-0.525]
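The similarity described above can be sketched as follows. This is a minimal illustration, assuming dense Python lists stand in for the (normally sparse) tf-idf vectors; the helper names `l2_normalize` and `dot` and the toy values are hypothetical and not taken from the paper.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return list(vec) if norm == 0.0 else [v / norm for v in vec]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy tf-idf vectors (hypothetical values).
query = l2_normalize([1.0, 0.0, 2.0])
database = [l2_normalize(d) for d in ([2.0, 0.0, 4.0], [0.0, 3.0, 0.0])]

# Rank database images by decreasing similarity to the query.
similarities = [dot(query, d) for d in database]
ranking = sorted(range(len(database)), key=lambda j: -similarities[j])
```

Because both vectors have unit norm, the dot product equals the cosine of the angle between them, so parallel vectors score 1 and vectors with no shared visual words score 0.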

13 While in image retrieval databases are typically unstructured collections of images, place recognition databases are usually structured: images have geotags, are localized on a map and depict a consistent 3D world. [sent-25, score-0.448]

14 Knowing the structure of the database can lead to significant improvements in both speed and accuracy of place recognition. [sent-26, score-0.358]

15 In this work, we also take advantage of geotags as an available form of supervision and investigate whether the place recognition problem can be cast as a classification task. [sent-28, score-0.47]

16 While visual classifiers were investigated for landmark recognition [14], where many photographs are available for each of the landmarks, in this work we wish to train a classifier for each location on the map in a similar manner to per-exemplar classification in object recognition [18]. [sent-29, score-0.394]

17 At query time, the query photograph is localized by transferring the GPS tag of the best scoring location classifier. [sent-32, score-0.893]

18 While learning classifiers for each place may be appealing, calibrating outputs of the individual classifiers is a critical issue. [sent-33, score-0.51]

19 In object recognition [18], it is addressed in a separate calibration stage on a held-out set of training data. [sent-34, score-0.354]

20 This is not possible in the place recognition set-up as only a small number, typically one to five, of positive training images are available for each location (e. [sent-35, score-0.483]

21 To address this issue, we propose a calibration procedure inspired by the use of p-values in statistics and based on ranking the score of a query image amongst scores of other images in the database. [sent-38, score-0.899]

22 Per-location classifiers for place recognition We are given tf-idf vectors dj, one for each database image j. [sent-43, score-0.491]

23 Instead of approaching the problem directly as a large multiclass classification problem, we tackle the problem by learning a per-exemplar linear SVM classifier [18] for each database image j. [sent-45, score-0.239]

24 Similar to [13], we use the available geotags to construct the negative set Nj for each image j. [sent-46, score-0.326]

25 The negative set is constructed so as to concentrate difficult negative examples, i. [sent-47, score-0.376]

26 The positive set Pj is represented by the only positive example, which is dj itself. [sent-51, score-0.243]

27 Each SVM classifier produces a score sj which is a priori not comparable with the score of the other classifiers. [sent-52, score-0.447]

28 A calibration of these scores will therefore be key to convert them to comparable scores fj . [sent-53, score-0.678]

29 This calibration problem is more difficult than usual given that we only have a single positive example and will be addressed in section 3. [sent-54, score-0.381]

30 Each linear SVM classifier learns a score sj of the form $s_j(q) = w_j^\top q + b_j$ (2), where wj is a weight vector re-weighting contributions of individual visual words and bj is the bias specific for image j. [sent-56, score-0.579]
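The linear score of Eq. (2) can be sketched as below. The weight vectors, biases and query values are hypothetical toy numbers, and `svm_score` is an illustrative name rather than anything defined in the paper.

```python
def svm_score(w, b, q):
    """Raw score s_j(q) = w_j^T q + b_j of one per-location classifier."""
    return sum(wi * qi for wi, qi in zip(w, q)) + b

# Hypothetical classifiers: (weight vector w_j, bias b_j) per database image j.
classifiers = [([0.5, -1.0, 2.0], -0.25), ([1.0, 1.0, 0.0], 0.5)]
q = [1.0, 0.0, 0.5]
raw_scores = [svm_score(w, b, q) for w, b in classifiers]
```

These raw scores come from independently trained classifiers, so comparing them directly across locations is not meaningful until they are calibrated.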

31 The weight vector wj and the bias bj are learned such that the score difference between dj and the closest neighbor from its negative set Nj is maximized. [sent-58, score-0.558]

32 […] $\sum_{x \in N_j} h(-w_j^\top x - b_j)$, (3) where the first term is the regularizer, the second term is the loss on the positive training data weighted by scalar parameter C1, and the third term is the loss on the negative training data weighted by scalar parameter C2. [sent-62, score-0.371]
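A three-term objective of this kind can be sketched as follows. The beginning of Eq. (3) is not present in the extracted text, so two things here are assumptions: that h is the hinge loss h(x) = max(0, 1 - x), and that the regularizer is the squared L2 norm of wj, as is common for per-exemplar SVMs [18]. The function names and toy data are hypothetical.

```python
def hinge(x):
    """Hinge loss h(x) = max(0, 1 - x) (assumed form of h)."""
    return max(0.0, 1.0 - x)

def objective(w, b, positives, negatives, c1, c2):
    """Regularizer + C1 * loss on positives + C2 * loss on negatives."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    regularizer = sum(wi * wi for wi in w)  # assumed: squared L2 norm
    pos_loss = sum(hinge(score(x)) for x in positives)
    neg_loss = sum(hinge(-score(x)) for x in negatives)
    return regularizer + c1 * pos_loss + c2 * neg_loss

value = objective(
    w=[1.0, 0.0], b=0.0,
    positives=[[2.0, 0.0]],               # score 2, hinge loss 0
    negatives=[[0.5, 0.0], [-3.0, 0.0]],  # hinge losses 1.5 and 0
    c1=1.0, c2=0.5,
)
```

Only the negative lying on the wrong side of the margin contributes to the loss, which is why hard negatives dominate the training signal.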

33 wj and bj are learned separately for each database image j in turn. [sent-65, score-0.33]

34 In our case (details in section 4), we use about 1-5 positive examples, and 200 negative examples. [sent-66, score-0.287]

35 A typical geotagged database may contain several images depicting a particular location. [sent-69, score-0.386]

36 If such images are identified, they may provide a few additional positive examples for the particular place and improve the quality of that per-location classifier. [sent-72, score-0.388]

37 These images can be further verified using geometric verification [21] and included in the positive training data for location j. [sent-75, score-0.29]

38 Non-parametric calibration of the SVM scores from negative examples only Since the classification scores sj are learned independently for each location j, they cannot be directly used as the scores fj from eq. [sent-78, score-1.185]

39 As illustrated in figure 2, for a given query q, a classifier from an incorrect location (b) can have a higher score (2) than the classifier from the target location (a). [sent-80, score-0.923]

40 This issue is addressed by calibrating scores of the learnt classifiers. [sent-82, score-0.318]

41 The goal of the calibration is to convert the output of each classifier into a probability (or in general a “universal” score), which can be meaningfully compared across classifiers. [sent-83, score-0.38]

42 Several calibration approaches have been proposed in the literature (see [9] and references therein for a review). [sent-84, score-0.282]

43 Another important calibration method is the isotonic regression [31], which allows for a non-parametric estimate of the output probability. [sent-90, score-0.38]

44 However, given the availability of negative data, it is easy to estimate the significance of the score of a test example compared to the typical score of (plentifully available) negative examples. [sent-92, score-0.721]

45 Intuitively, we will use a large dataset of negative examples to calibrate the individual classifiers so that they reject the same number of negative examples at each level of the calibrated score. [sent-93, score-0.819]

46 We will expand this idea in detail and use concepts from hypothesis testing to propose a calibration method. [sent-94, score-0.332]

47 In the following, we view the problem of deciding whether a query image matches a given location based on the corresponding SVM score as a hypothesis testing problem. [sent-96, score-0.627]

48 In the NP framework, the significance level of a score is measured by the p-value or equivalently by the value of the cumulative density function (cdf) of the distribution of the negatives at a given score value. [sent-104, score-0.565]

49 The cdf is the function F0 defined by F0(s) = P(S0 ≤ s), where S0 is the random variable corresponding to the scores of negative data (see figure 3 for an illustration of the relation between the cdf and the density of the function). [sent-105, score-0.886]

50 The cdf (or the corresponding p-value; see footnote 1) is naturally estimated by the empirical cumulative density function $\hat{F}_0$, which is computed as $\hat{F}_0(s) = \frac{1}{N_c} \sum_{n=1}^{N_c} \mathbf{1}\{s_n \le s\}$, [sent-106, score-0.429]

51 where $(s_n)_{1 \le n \le N_c}$ are the SVM scores associated with $N_c$ negative examples used for calibration. [sent-107, score-0.395]
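The empirical cdf defined above can be sketched in a few lines. This is an illustrative implementation, assuming all calibration scores fit in memory; `empirical_cdf` and the toy scores are hypothetical names and values.

```python
from bisect import bisect_right

def empirical_cdf(negative_scores):
    """Return a function s -> fraction of negative scores that are <= s."""
    ordered = sorted(negative_scores)
    n = len(ordered)
    return lambda s: bisect_right(ordered, s) / n

# Hypothetical SVM scores of held-out negatives for one classifier.
F0_hat = empirical_cdf([-2.0, -1.5, -1.0, -0.5, 0.0])
```

Sorting once and using binary search keeps each evaluation logarithmic in the number of stored negatives.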

52 $\hat{F}_0(s)$ is the fraction of the negative examples used for calibration (ideally held out negative examples) that have a score below a given value s. [sent-108, score-0.897]

53 Computing $\hat{F}_0$ exactly would require storing all the SVM scores for all the calibration data for all classifiers, so in practice, we only keep a fraction of the larger scores. [sent-109, score-0.456]

54 We also interpolate the empirical cdf between consecutive datapoints so that instead of being a staircase function it is a continuous piecewise linear function, as illustrated in figure 2. [sent-110, score-0.29]

55 Given a query, we first compute its SVM score sq and then compute the calibrated probability $f(q) = \hat{F}_0(s_q)$. [sent-111, score-0.353]

56 We obtain a similar calibrated probability fj (q) for each of the SVMs associated with each of the target locations, which can now be ranked. [sent-112, score-0.271]
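The effect of ranking calibrated rather than raw scores can be sketched as below. The two locations, their raw scores and their negative scores are hypothetical toy values chosen so that the two rankings disagree; `calibrate` is an illustrative name and uses the non-interpolated empirical cdf for brevity.

```python
from bisect import bisect_right

def calibrate(raw_score, negative_scores):
    """Map a raw SVM score to the fraction of calibration negatives below it."""
    ordered = sorted(negative_scores)
    return bisect_right(ordered, raw_score) / len(ordered)

# Hypothetical per-location data: raw query score and calibration negatives.
locations = {
    "a": {"raw": 0.2, "negatives": [-1.0, -0.5, 0.0, 0.5, 1.0]},
    "b": {"raw": -0.4, "negatives": [-3.0, -2.5, -2.0, -1.5, -1.0]},
}
calibrated = {j: calibrate(v["raw"], v["negatives"]) for j, v in locations.items()}
best_raw = max(locations, key=lambda j: locations[j]["raw"])
best_calibrated = max(calibrated, key=calibrated.get)
```

Location "a" wins on the raw score, but its classifier routinely gives negatives comparable scores, so location "b" wins once each score is judged against its own negative distribution.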

57 For each place, keep Nc scores from negative examples (sn), 1 ≤ n ≤ Nc, used for calibration together with the associated cumulative … Footnote 1: The notion most commonly used in statistics is in fact the p-value. [sent-114, score-0.765]

58 The p-value associated with a score is the quantity α(s) defined by α(s) = 1 − F0(s); so the more significant the score is, the closer to 1 the cdf value is, and the closer to 0 the p-value is. [sent-115, score-0.302]
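The relation between the p-value and the cdf value can be checked on a toy example. This is a sketch only: `p_value` is a hypothetical helper that estimates α(s) empirically as the fraction of calibration negatives scoring strictly above s.

```python
def p_value(negative_scores, s):
    """alpha(s) = 1 - F0_hat(s): fraction of negatives scoring above s."""
    return sum(1 for sn in negative_scores if sn > s) / len(negative_scores)

negatives = [-2.0, -1.5, -1.0, -0.5, 0.0]  # hypothetical calibration scores
alpha = p_value(negatives, -0.75)          # 2 of 5 negatives exceed -0.75
cdf_value = 1.0 - alpha
```

A score higher than every stored negative gives a p-value of 0 and a cdf value of 1, which is the most significant outcome this estimate can report.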

59 To keep the presentation simple, we avoid the formulation in terms of p-values and we only talk of the probabilistic calibrated values obtained from the cdf F0. [sent-116, score-0.376]

60 Figure 2, panels (a) and (b): An illustration of the proposed normalization of SVM scores for two different database images. [sent-135, score-0.276]

61 For the given query, the raw SVM score of image (b) is lower than for image (a), but the calibrated score of image (b) is higher than for image (a). [sent-138, score-0.392]

62 Compute the interpolated empirical cdf value $\hat{F}_0(s_q) \approx \hat{F}_0(s_n) + \frac{s_q - s_n}{s_{n+1} - s_n} \left( \hat{F}_0(s_{n+1}) - \hat{F}_0(s_n) \right)$. [sent-142, score-0.29]
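The interpolation step can be sketched as follows, taking sn and sn+1 to be the stored scores that bracket sq. Clamping queries that fall outside the stored range to the first or last cdf value is an assumption of this sketch, and `interpolated_cdf` is a hypothetical name.

```python
from bisect import bisect_right

def interpolated_cdf(sorted_scores, cdf_values, s_q):
    """Piecewise-linear interpolation of the empirical cdf at score s_q."""
    if s_q <= sorted_scores[0]:
        return cdf_values[0]   # assumed behavior below the stored range
    if s_q >= sorted_scores[-1]:
        return cdf_values[-1]  # assumed behavior above the stored range
    n = bisect_right(sorted_scores, s_q) - 1  # s_n <= s_q < s_{n+1}
    s_n, s_next = sorted_scores[n], sorted_scores[n + 1]
    t = (s_q - s_n) / (s_next - s_n)
    return cdf_values[n] + t * (cdf_values[n + 1] - cdf_values[n])

scores = [-2.0, -1.0, 0.0, 1.0]       # hypothetical stored negative scores
cdf = [0.25, 0.5, 0.75, 1.0]          # their empirical cdf values
f_q = interpolated_cdf(scores, cdf, -0.5)
```

The query score -0.5 lies halfway between the stored scores -1.0 and 0.0, so its calibrated value lies halfway between 0.5 and 0.75.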

63 It should be noted that basing the calibration only on the negative data has the advantage that we privilege precision over recall, which is justified given the imbalance of the available training data (many more negatives than positives). [sent-144, score-0.593]

64 By contrast, since we are learning from a comparatively large number of negative examples, we can trust the fact that new negative examples will stay in the half-space containing the negative training set, so that their scores are very unlikely to be large. [sent-146, score-0.813]

65 Our method is therefore based on the fact that we can measure reliably how surprising a high score would be if it was the score of a negative example. [sent-147, score-0.46]

66 Figure 3 (horizontal axis: scores): The relation between (a) the probability density of the random variable S0 modeling the scores of the negative examples and (b) the corresponding cumulative density function F0(s) = P(S0 ≤ s). [sent-150, score-0.585]

67 [26] propose a method, which is related to ours, and calibrate SVM scores by computing the corresponding cdf value of a Weibull distribution fitted to the top negative scores. [sent-160, score-0.655]

68 The calibration with either method yields “universal” scores in the sense that they are comparable from one SVM to another, but the calibrated values obtained from logistic regression are not comparable to the values obtained from our approach. [sent-169, score-0.757]

69 To learn the classifier for database image j, the positive and negative training data is constructed as follows. [sent-178, score-0.568]

70 In other words, the negative training data consists of the hard negative images, i. [sent-180, score-0.418]

71 We set the value of the regularization parameters to C1 = 1 · nP for positive data and C2 = 10^-3 · nN for negative data, where nP and nN denote the number of examples in the positive and the negative set, respectively. [sent-187, score-0.646]

72 To use a reasonable amount of memory, for each classifier, we store only the 1000 largest negative scores (the number of negative scores stored could be reduced further using interpolation). [sent-190, score-0.685]
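Storing only the largest negative scores still allows the upper tail of the empirical cdf to be evaluated, provided the total number of negatives is remembered. The sketch below illustrates this under that assumption; the helper names and the synthetic scores are hypothetical.

```python
import heapq

def keep_top_scores(negative_scores, k):
    """Keep the k largest negative scores (ascending) and the total count Nc."""
    return sorted(heapq.nlargest(k, negative_scores)), len(negative_scores)

def tail_cdf(top_scores, n_total, s):
    """Empirical cdf at s, exact whenever s >= the smallest retained score."""
    above = sum(1 for t in top_scores if t > s)
    return 1.0 - above / n_total

scores = [float(i) for i in range(10000)]  # hypothetical negative scores
top, n_total = keep_top_scores(scores, k=1000)
tail_value = tail_cdf(top, n_total, 9499.0)
```

For a query score below the smallest retained value, `tail_cdf` returns 1 - k/Nc, which is only an upper bound on the true cdf; this matters little when only high-scoring candidates compete for the top rank.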

73 We compare the performance of the proposed approach (SVM p-val) with the following baseline methods: (a) Training per-location classifiers without any calibration step (SVM). [sent-205, score-0.42]

74 For all methods, we implemented a two-stage place recognition approach. [sent-212, score-0.247]

75 Given a query image, the aim of the first stage is to efficiently find a small subset (20) of candidates that are likely to depict the same place as the query image. [sent-213, score-0.981]

76 In the second stage, we search for restricted homographies between candidates and the query image using RANSAC [21]. [sent-214, score-0.346]
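The first stage of this pipeline, selecting a short list of candidates, can be sketched as below. The second stage (fitting restricted homographies with RANSAC) is deliberately omitted; `shortlist` and the toy scores are hypothetical.

```python
import heapq

def shortlist(scores, k=20):
    """Indices of the k highest-scoring database images, best first."""
    return heapq.nlargest(k, range(len(scores)), key=scores.__getitem__)

# Hypothetical calibrated scores for four database images.
candidate_scores = [0.1, 0.9, 0.5, 0.7]
candidates = shortlist(candidate_scores, k=2)
```

Only the returned candidates would then be passed to the more expensive geometric verification step.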

77 Since the ground truth GPS position for each query image is available, we measure the overall recognition performance (footnote 2: the calibration of SVM scores with logistic regression is based on a subset of 30 hard negatives from Nj and 1-15 available positive examples from Pj) [sent-216, score-1.205]

78 Table 1: The percentage of correctly localized test queries for which the top-ranked database image is within 20 meters from the ground truth query position. [sent-232, score-0.641]

79 Notice that SVM output without calibration gives 0% of correctly localized queries. [sent-235, score-0.35]

80 by the percentage of query test images for which the top-ranked database image was located within a distance of 20 meters from the ground truth query location. [sent-236, score-0.866]
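The evaluation measure can be sketched as follows. Using the haversine great-circle distance between GPS tags is an assumption of this sketch, since the extracted text does not say how distances are computed; the coordinates are hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def percent_localized(predicted, ground_truth, threshold_m=20.0):
    """Percentage of queries whose top-ranked geotag is within threshold_m."""
    hits = sum(haversine_m(*p, *g) <= threshold_m
               for p, g in zip(predicted, ground_truth))
    return 100.0 * hits / len(ground_truth)

truth = [(40.4406, -79.9959), (40.4440, -79.9530)]   # hypothetical GPS tags
top1 = [(40.44061, -79.9959), (40.4500, -79.9530)]   # about 1 m and 670 m off
accuracy = percent_localized(top1, truth)
```

With one of the two toy queries localized within 20 meters, the measure evaluates to 50 percent.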

81 Results are summarized in table 1 and clearly demonstrate the benefits of careful calibration of the per-location classifiers. [sent-237, score-0.282]

82 In addition, the proposed per-location classifier method outperforms the baseline bag-of-visual-word approach [21] including confuser suppression [13]. [sent-238, score-0.309]

83 Figure 5 illustrates the weights learnt for one database image applied to three different query images. [sent-240, score-0.583]

84 The linear SVM classifiers trained for each database image are currently non-sparse, which increases the computational and memory requirements at query time compared to the original bag-of-visual-words representation. [sent-242, score-0.59]

85 For a database of 25,000 images, applying all classifiers on a query image takes currently on average 1. [sent-243, score-0.59]

86 Conclusions We have shown that place recognition can be cast as a classification problem and have used geotags as a readily available supervision to train an ensemble of classifiers, one for each location in the database. [sent-247, score-0.565]

87 As only few positive examples are available for each location, we have proposed a non-parametric procedure to calibrate the output of each classifier without the need for additional positive training data. [sent-248, score-0.486]

88 The results show improved place recognition performance over baseline methods and demonstrate that careful calibration is critical to achieve competitive place recognition performance. [sent-249, score-0.811]

89 The developed calibration method is not specific to place recognition and can be useful for other per-exemplar classification tasks, where only a small number of positive examples are available [18]. [sent-250, score-0.7]

90 Total recall: Automatic query expansion with a generative feature model for object retrieval. [sent-297, score-0.346]

91 Figure 4, columns (a)-(d), with column labels including Bag-of-words and Confuser suppression: Examples of query images (gray) correctly (green) and incorrectly (red) localized by different methods. [sent-373, score-0.505]

92 (c) the top-ranked image retrieved by the baseline confuser suppression method [13]. [sent-376, score-0.275]

95 Figure 5 (plots of the calibrated classifier score fj for a target database image j, with query panels (a)-(c)): A visualization of learnt feature weights for two database images. [sent-404, score-0.237]

96 (Left) Cumulative density function (or calibrated score) learnt for the SVM scores of the corresponding classifier fj ; three query images displayed on the second row are represented by their SVM scores and cdf values F0 (s), denoted (a)-(c) on the graph. [sent-406, score-1.333]

97 Third row: A visualization of the contribution of each feature to the SVM score for the corresponding query image. [sent-407, score-0.482]

98 Red circles represent features with negative weights while green circles correspond to features with positive weights. [sent-408, score-0.361]

99 Left panel: Query (b) gets a high score because the building has orange and white stripes similar to the sun-blinds of the bakery, which are features that also have large positive weights in the query image (c) of the correct place. [sent-411, score-0.621]

100 Right panel: Query (b) is in fact also an image of the same location with a portion of the left skyscraper in the target image detected in the upper left corner and the side of the rightmost building in the target image detected in the top right corner. [sent-412, score-0.245]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('query', 0.346), ('calibration', 0.282), ('cdf', 0.256), ('place', 0.217), ('geotagged', 0.193), ('negative', 0.188), ('svm', 0.185), ('database', 0.141), ('geotags', 0.138), ('score', 0.136), ('scores', 0.135), ('confuser', 0.125), ('calibrated', 0.12), ('logistic', 0.11), ('bj', 0.109), ('classifiers', 0.103), ('positive', 0.099), ('classifier', 0.098), ('sq', 0.097), ('fj', 0.096), ('sn', 0.096), ('learnt', 0.096), ('location', 0.095), ('philbin', 0.091), ('cumulative', 0.088), ('calibrating', 0.087), ('sivic', 0.086), ('pj', 0.085), ('negatives', 0.081), ('wj', 0.08), ('calibrate', 0.076), ('chum', 0.076), ('positives', 0.076), ('significance', 0.073), ('depict', 0.072), ('examples', 0.072), ('nj', 0.07), ('scheirer', 0.069), ('localized', 0.068), ('retrieved', 0.064), ('isard', 0.064), ('panoramas', 0.064), ('google', 0.064), ('paristech', 0.062), ('wjtx', 0.062), ('np', 0.061), ('retrieval', 0.061), ('target', 0.055), ('ponts', 0.055), ('verification', 0.054), ('gps', 0.053), ('panel', 0.053), ('queries', 0.053), ('depicting', 0.052), ('streetview', 0.051), ('weibull', 0.051), ('suppression', 0.051), ('density', 0.051), ('hypothesis', 0.05), ('regression', 0.05), ('isotonic', 0.048), ('sj', 0.047), ('geographical', 0.046), ('cast', 0.046), ('iii', 0.046), ('dj', 0.045), ('surf', 0.045), ('training', 0.042), ('quantization', 0.042), ('des', 0.041), ('umr', 0.041), ('nc', 0.041), ('building', 0.04), ('ii', 0.04), ('panorama', 0.04), ('incorrectly', 0.04), ('store', 0.039), ('supervision', 0.039), ('dot', 0.038), ('iarpa', 0.038), ('photograph', 0.038), ('pittsburgh', 0.038), ('perspective', 0.038), ('landmark', 0.038), ('circles', 0.037), ('baseline', 0.035), ('france', 0.034), ('egou', 0.034), ('empirical', 0.034), ('douze', 0.034), ('meters', 0.033), ('svms', 0.032), ('held', 0.031), ('universal', 0.031), ('vocabularies', 0.031), ('recognition', 0.03), ('comparable', 0.03), ('quantity', 0.03), ('snavely', 0.03), ('views', 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999934 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

2 0.32159236 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

Author: Song Cao, Noah Snavely

Abstract: Recognizing the location of a query image by matching it to a database is an important problem in computer vision, and one for which the representation of the database is a key issue. We explore new ways for exploiting the structure of a database by representing it as a graph, and show how the rich information embedded in a graph can improve a bag-of-words-based location recognition method. In particular, starting from a graph on a set of images based on visual connectivity, we propose a method for selecting a set of subgraphs and learning a local distance function for each using discriminative techniques. For a query image, each database image is ranked according to these local distance functions in order to place the image in the right part of the graph. In addition, we propose a probabilistic method for increasing the diversity of these ranked database images, again based on the structure of the image graph. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.

3 0.28614867 456 cvpr-2013-Visual Place Recognition with Repetitive Structures

Author: Akihiko Torii, Josef Sivic, Tomáš Pajdla, Masatoshi Okutomi

Abstract: Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval. It is based on robust detection of repeated image structures and a simple modification of weights in the bag-of-visual-word model. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline and more recently proposed burstiness weighting.

4 0.21264677 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval

Author: Danfeng Qin, Christian Wengert, Luc Van Gool

Abstract: Many recent object retrieval systems rely on local features for describing an image. The similarity between a pair of images is measured by aggregating the similarity between their corresponding local features. In this paper we present a probabilistic framework for modeling the feature to feature similarity measure. We then derive a query adaptive distance which is appropriate for global similarity evaluation. Furthermore, we propose a function to score the individual contributions into an image to image similarity within the probabilistic framework. Experimental results show that our method improves the retrieval accuracy significantly and consistently. Moreover, our result compares favorably to the state-of-the-art.

5 0.15684786 99 cvpr-2013-Cross-View Image Geolocalization

Author: Tsung-Yi Lin, Serge Belongie, James Hays

Abstract: The recent availability of large amounts of geotagged imagery has inspired a number of data driven solutions to the image geolocalization problem. Existing approaches predict the location of a query image by matching it to a database of georeferenced photographs. While there are many geotagged images available on photo sharing and street view sites, most are clustered around landmarks and urban areas. The vast majority of the Earth’s land area has no ground level reference photos available, which limits the applicability of all existing image geolocalization methods. On the other hand, there is no shortage of visual and geographic data that densely covers the Earth (we examine overhead imagery and land cover survey data), but the relationship between this data and ground level query photographs is complex. In this paper, we introduce a cross-view feature translation approach to greatly extend the reach of image geolocalization methods. We can often localize a query even if it has no corresponding ground-level images in the database. A key idea is to learn the relationship between ground level appearance and overhead appearance and land cover attributes from sparsely available geotagged ground-level images. We perform experiments over a 1600 km2 region containing a variety of scenes and land cover types. For each query, our algorithm produces a probability density over the region of interest.

6 0.15255171 76 cvpr-2013-Can a Fully Unconstrained Imaging Model Be Applied Effectively to Central Cameras?

7 0.15132879 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

8 0.14376806 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?

9 0.12664673 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method

10 0.12504019 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

11 0.12032226 151 cvpr-2013-Event Retrieval in Large Video Collections with Circulant Temporal Encoding

12 0.11964028 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering

13 0.11956064 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

14 0.11771086 119 cvpr-2013-Detecting and Aligning Faces by Image Retrieval

15 0.11736652 138 cvpr-2013-Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition

16 0.11614924 134 cvpr-2013-Discriminative Sub-categorization

17 0.11183298 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

18 0.1113534 430 cvpr-2013-The SVM-Minus Similarity Score for Video Face Recognition

19 0.10497475 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

20 0.10134728 153 cvpr-2013-Expanded Parts Model for Human Attribute and Action Recognition in Still Images


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.237), (1, -0.06), (2, -0.026), (3, -0.009), (4, 0.118), (5, 0.021), (6, -0.105), (7, -0.066), (8, -0.015), (9, -0.032), (10, -0.101), (11, 0.035), (12, 0.138), (13, 0.009), (14, -0.106), (15, -0.203), (16, 0.109), (17, -0.004), (18, 0.009), (19, -0.159), (20, 0.188), (21, -0.051), (22, -0.061), (23, 0.083), (24, -0.061), (25, 0.032), (26, 0.043), (27, 0.019), (28, -0.004), (29, 0.085), (30, 0.1), (31, -0.055), (32, -0.046), (33, 0.063), (34, -0.033), (35, -0.015), (36, 0.077), (37, -0.129), (38, -0.129), (39, 0.003), (40, -0.013), (41, -0.139), (42, 0.004), (43, 0.057), (44, -0.119), (45, 0.056), (46, 0.076), (47, 0.024), (48, 0.005), (49, 0.149)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96216673 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

2 0.81795281 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

Author: Song Cao, Noah Snavely

Abstract: Recognizing the location of a query image by matching it to a database is an important problem in computer vision, and one for which the representation of the database is a key issue. We explore new ways for exploiting the structure of a database by representing it as a graph, and show how the rich information embedded in a graph can improve a bag-of-words-based location recognition method. In particular, starting from a graph on a set of images based on visual connectivity, we propose a method for selecting a set of subgraphs and learning a local distance function for each using discriminative techniques. For a query image, each database image is ranked according to these local distance functions in order to place the image in the right part of the graph. In addition, we propose a probabilistic method for increasing the diversity of these ranked database images, again based on the structure of the image graph. We demonstrate that our methods improve performance over standard bag-of-words methods on several existing location recognition datasets.

3 0.77227134 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval

Author: Danfeng Qin, Christian Wengert, Luc Van Gool

Abstract: Many recent object retrieval systems rely on local features for describing an image. The similarity between a pair of images is measured by aggregating the similarity between their corresponding local features. In this paper we present a probabilistic framework for modeling the feature to feature similarity measure. We then derive a query adaptive distance which is appropriate for global similarity evaluation. Furthermore, we propose a function to score the individual contributions into an image to image similarity within the probabilistic framework. Experimental results show that our method improves the retrieval accuracy significantly and consistently. Moreover, our result compares favorably to the state-of-the-art.

4 0.77164507 456 cvpr-2013-Visual Place Recognition with Repetitive Structures

Author: Akihiko Torii, Josef Sivic, Tomáš Pajdla, Masatoshi Okutomi

Abstract: Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval. It is based on robust detection of repeated image structures and a simple modification of weights in the bag-of-visual-word model. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline and more recently proposed burstiness weighting.

5 0.70019209 99 cvpr-2013-Cross-View Image Geolocalization

Author: Tsung-Yi Lin, Serge Belongie, James Hays

Abstract: The recent availability of large amounts of geotagged imagery has inspired a number of data driven solutions to the image geolocalization problem. Existing approaches predict the location of a query image by matching it to a database of georeferenced photographs. While there are many geotagged images available on photo sharing and street view sites, most are clustered around landmarks and urban areas. The vast majority of the Earth’s land area has no ground level reference photos available, which limits the applicability of all existing image geolocalization methods. On the other hand, there is no shortage of visual and geographic data that densely covers the Earth (we examine overhead imagery and land cover survey data), but the relationship between this data and ground level query photographs is complex. In this paper, we introduce a cross-view feature translation approach to greatly extend the reach of image geolocalization methods. We can often localize a query even if it has no corresponding ground-level images in the database. A key idea is to learn the relationship between ground level appearance and overhead appearance and land cover attributes from sparsely available geotagged ground-level images. We perform experiments over a 1600 km² region containing a variety of scenes and land cover types. For each query, our algorithm produces a probability density over the region of interest.

6 0.60913563 373 cvpr-2013-SWIGS: A Swift Guided Sampling Method

7 0.59511381 275 cvpr-2013-Lp-Norm IDF for Large Scale Image Search

8 0.55436003 15 cvpr-2013-A Lazy Man's Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

9 0.55305463 157 cvpr-2013-Exploring Implicit Image Statistics for Visual Representativeness Modeling

10 0.54135919 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering

11 0.53677595 274 cvpr-2013-Lost! Leveraging the Crowd for Probabilistic Visual Self-Localization

12 0.50923616 134 cvpr-2013-Discriminative Sub-categorization

13 0.49679971 309 cvpr-2013-Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

14 0.49650577 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

15 0.49321386 168 cvpr-2013-Fast Object Detection with Entropy-Driven Evaluation

16 0.48826751 173 cvpr-2013-Finding Things: Image Parsing with Regions and Per-Exemplar Detectors

17 0.4880695 320 cvpr-2013-Optimizing 1-Nearest Prototype Classifiers

18 0.48326743 126 cvpr-2013-Diffusion Processes for Retrieval Revisited

19 0.47718659 293 cvpr-2013-Multi-attribute Queries: To Merge or Not to Merge?

20 0.47664016 271 cvpr-2013-Locally Aligned Feature Transforms across Views


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.061), (16, 0.011), (26, 0.023), (33, 0.724), (67, 0.036), (69, 0.025), (87, 0.041)]
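The page does not state how the simValue column is derived from these (topicId, topicWeight) pairs. As a hedged illustration only, the sketch below assumes a cosine similarity between sparse topic vectors; the helper `cosine_sparse` and the second vector `other_paper` are hypothetical and not taken from the source.

```python
import math


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors given as (id, weight) pairs.

    Topics missing from a vector are treated as having weight 0.
    Returns 0.0 if either vector has zero norm.
    """
    da, db = dict(a), dict(b)
    dot = sum(w * db.get(topic, 0.0) for topic, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Topic vector of this paper, copied from the listing above.
this_paper = [(10, 0.061), (16, 0.011), (26, 0.023), (33, 0.724),
              (67, 0.036), (69, 0.025), (87, 0.041)]

# Hypothetical vector for some other paper (an assumption, for illustration).
other_paper = [(33, 0.70), (45, 0.10)]

# The topic carrying the largest weight for this paper.
dominant_topic = max(this_paper, key=lambda tw: tw[1])

similarity = cosine_sparse(this_paper, other_paper)
```

Under this assumption, a vector dominated by a single topic (here topic 33 with weight 0.724) would score as highly similar to any other vector dominated by the same topic, regardless of subject matter. That would be one possible reading of why the list below contains many topically unrelated papers with values close to 1, but the page itself does not confirm it.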

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99950463 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

Author: Petr Gronát, Guillaume Obozinski, Josef Sivic, Tomáš Pajdla

Abstract: The aim of this work is to localize a query photograph by finding other images depicting the same place in a large geotagged image database. This is a challenging task due to changes in viewpoint, imaging conditions and the large size of the image database. The contribution of this work is two-fold. First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database in a similar manner to per-exemplar SVMs in object recognition. Second, as only few positive training examples are available for each location, we propose a new approach to calibrate all the per-location SVM classifiers using only the negative examples. The calibration we propose relies on a significance measure essentially equivalent to the p-values classically used in statistical hypothesis testing. Experiments are performed on a database of 25,000 geotagged street view images of Pittsburgh and demonstrate improved place recognition accuracy of the proposed approach over the previous work. Affiliations: (2) Center for Machine Perception, Faculty of Electrical Engineering; (3) WILLOW project, Laboratoire d’Informatique de l’École Normale Supérieure, ENS/INRIA/CNRS UMR 8548; (4) Université Paris-Est, LIGM (UMR CNRS 8049), Center for Visual Computing, École des Ponts ParisTech, 77455 Marne-la-Vallée, France

2 0.99947125 180 cvpr-2013-Fully-Connected CRFs with Non-Parametric Pairwise Potential

Author: Neill D.F. Campbell, Kartic Subr, Jan Kautz

Abstract: Conditional Random Fields (CRFs) are used for diverse tasks, ranging from image denoising to object recognition. For images, they are commonly defined as a graph with nodes corresponding to individual pixels and pairwise links that connect nodes to their immediate neighbors. Recent work has shown that fully-connected CRFs, where each node is connected to every other node, can be solved efficiently under the restriction that the pairwise term is a Gaussian kernel over a Euclidean feature space. In this paper, we generalize the pairwise terms to a non-linear dissimilarity measure that is not required to be a distance metric. To this end, we propose a density estimation technique to derive conditional pairwise potentials in a nonparametric manner. We then use an efficient embedding technique to estimate an approximate Euclidean feature space for these potentials, in which the pairwise term can still be expressed as a Gaussian kernel. We demonstrate that the use of non-parametric models for the pairwise interactions, conditioned on the input data, greatly increases expressive power whilst maintaining efficient inference.

3 0.99946088 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification

Author: Amirreza Shaban, Hamid R. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad

Abstract: Bag of words models for feature extraction have demonstrated top-notch performance in image classification. These representations are usually accompanied by a coding method. Recently, methods that code a descriptor giving regard to its nearby bases have proved efficacious. These methods take into account the nonlinear structure of descriptors, since local similarities are a good approximation of global similarities. However, they confine their usage of the global similarities to nearby bases. In this paper, we propose a coding scheme that brings into focus the manifold structure of descriptors, and devise a method to compute the global similarities of descriptors to the bases. Given a local similarity measure between bases, a global measure is computed. Exploiting the local similarity of a descriptor and its nearby bases, a global measure of association of a descriptor to all the bases is computed. Unlike the locality-based and sparse coding methods, the proposed coding varies smoothly with respect to the underlying manifold. Experiments on benchmark image classification datasets substantiate the superiority of the proposed method over its locality and sparsity based rivals.

4 0.99946034 357 cvpr-2013-Revisiting Depth Layers from Occlusions

Author: Adarsh Kowdle, Andrew Gallagher, Tsuhan Chen

Abstract: In this work, we consider images of a scene with a moving object captured by a static camera. As the object (human or otherwise) moves about the scene, it reveals pairwise depth-ordering or occlusion cues. The goal of this work is to use these sparse occlusion cues along with monocular depth occlusion cues to densely segment the scene into depth layers. We cast the problem of depth-layer segmentation as a discrete labeling problem on a spatiotemporal Markov Random Field (MRF) that uses the motion occlusion cues along with monocular cues and a smooth motion prior for the moving object. We quantitatively show that depth ordering produced by the proposed combination of the depth cues from object motion and monocular occlusion cues are superior to using either feature independently, and using a naïve combination of the features.

5 0.99932784 137 cvpr-2013-Dynamic Scene Classification: Learning Motion Descriptors with Slow Features Analysis

Author: Christian Thériault, Nicolas Thome, Matthieu Cord

Abstract: In this paper, we address the challenging problem of categorizing video sequences composed of dynamic natural scenes. Contrary to previous methods that rely on handcrafted descriptors, we propose here to represent videos using unsupervised learning of motion features. Our method encompasses three main contributions: 1) Based on the Slow Feature Analysis principle, we introduce a learned local motion descriptor which represents the principal and more stable motion components of training videos. 2) We integrate our local motion feature into a global coding/pooling architecture in order to provide an effective signature for each video sequence. 3) We report state-of-the-art classification performance on two challenging natural scenes data sets. In particular, an outstanding improvement of 11% in classification score is reached on a data set introduced in 2012.

6 0.99928069 93 cvpr-2013-Constraints as Features

7 0.9990952 55 cvpr-2013-Background Modeling Based on Bidirectional Analysis

8 0.99888849 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning

9 0.99875718 252 cvpr-2013-Learning Locally-Adaptive Decision Functions for Person Verification

10 0.99853522 113 cvpr-2013-Dense Variational Reconstruction of Non-rigid Surfaces from Monocular Video

11 0.99828142 165 cvpr-2013-Fast Energy Minimization Using Learned State Filters

12 0.9979561 59 cvpr-2013-Better Exploiting Motion for Better Action Recognition

13 0.99604511 48 cvpr-2013-Attribute-Based Detection of Unfamiliar Classes with Humans in the Loop

14 0.99584115 301 cvpr-2013-Multi-target Tracking by Rank-1 Tensor Approximation

15 0.9909789 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

16 0.99037886 379 cvpr-2013-Scalable Sparse Subspace Clustering

17 0.98936439 266 cvpr-2013-Learning without Human Scores for Blind Image Quality Assessment

18 0.9888429 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval

19 0.98780286 306 cvpr-2013-Non-rigid Structure from Motion with Diffusion Maps Prior

20 0.98759919 148 cvpr-2013-Ensemble Video Object Cut in Highly Dynamic Scenes