cvpr cvpr2013 cvpr2013-250 knowledge-graph by maker-knowledge-mining

250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering


Source: pdf

Author: Raghuraman Gopalan

Abstract: Estimating geographic location from images is a challenging problem that is receiving recent attention. In contrast to many existing methods that primarily model discriminative information corresponding to different locations, we propose joint learning of information that images across locations share and vary upon. Starting with generative and discriminative subspaces pertaining to domains, which are obtained by a hierarchical grouping of images from adjacent locations, we present a top-down approach that first models cross-domain information transfer by utilizing the geometry of these subspaces, and then encodes the model results onto individual images to infer their location. We report competitive results for location recognition and clustering on two public datasets, im2GPS and San Francisco, and empirically validate the utility of various design choices involved in the approach.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Estimating geographic location from images is a challenging problem that is receiving recent attention. [sent-4, score-0.249]

2 In contrast to many existing methods that primarily model discriminative information corresponding to different locations, we propose joint learning of information that images across locations share and vary upon. [sent-5, score-0.303]

3 We report competitive results for location recognition and clustering on two public datasets, im2GPS and San Francisco, and empirically validate the utility of various design choices involved in the approach. [sent-7, score-0.509]

4 [15]), which could either be noisy or missing depending on the location of interest, and application areas such as surveillance. [sent-11, score-0.195]

5 Most existing methods for location recognition follow the paradigm of discriminative modeling for feature selection and classification. [sent-13, score-0.335]

6 For instance, [11] used several low-level features that could distinguish images across locations and used a nearest-neighbor classifier to estimate query locations from a large data set. [sent-14, score-0.314]

7 While images in (a) correspond to familiar locations that either have distinct visual features or have good exposure amongst the general public, the location of images in (b) is hard to infer. [sent-16, score-0.386]

8 One intuitive way to address such cases is to analyze how those images are relatively similar to and different from other known locations so that a meaningful location estimate can be obtained. [sent-17, score-0.311]

9 We pursue such a goal in this work using tools pertaining to subspace geometry. [sent-18, score-0.291]

10 In addition to robust feature descriptors, there have been several studies on efficient schemes for classification and retrieval of location queries. [sent-22, score-0.195]

11 Scalable vocabulary tree coding algorithms were presented by [22, 13], while [16] modeled landmark image collections using iconic scene graphs. [sent-24, score-0.217]

12 Image features that are confusing from a place recognition perspective were studied by [14], and [5] addressed obtaining discriminative features that are geographically informative, while occurring frequently at the same time. [sent-25, score-0.336]

13 There have also been efforts that provide landmark search engines for web-scale image collections [32] and for mobile vision applications [3]. [sent-26, score-0.251]

14 Besides modeling location-specific information, some studies have examined the utility of complementary information provided by other data modalities. [sent-27, score-0.377]

15 Discriminative approaches, however, do not entirely address an important problem in location recognition that is illustrated in Figure 1. [sent-29, score-0.228]

16 One feasible way to obtain an approximate location estimate of these images is to jointly analyze the properties they share with and vary from other well-known locations. [sent-31, score-0.229]

17 Such an analysis should also account for the fact that the visual and location information of images do not always correlate; for instance, one could have images that look very much alike but correspond to vastly different geographic areas. [sent-34, score-0.362]

18 We address this problem by pursuing a top-down approach where, given a set of training images representative of different locations, we first group the images into different domains based on location adjacency. [sent-35, score-0.865]

19 We then derive generative and discriminative subspaces of the same dimensions from these domains, and motivated by [9], we model cross-domain transfer of similar (resp. [sent-36, score-0.692]

20 distinct) information by pursuing a Grassmann manifold interpretation of the space spanned by these generative (resp. [sent-37, score-0.3]

21 We finally embed the effect of this transformation onto images from training and query, and perform location inference in both recognition and clustering settings. [sent-39, score-0.346]

22 Now given a query image xt, the goal of this work is to estimate its location yt = f2(f1(X)), where f1 models information transfer across domains (that are created by grouping xi based on their location yi) and f2 denotes the subsequent classification or clustering mechanism. [sent-49, score-1.519]

23 Assigning images from X into domains D in a three-level hierarchical manner, and organizing the domains into groups G for further analysis. [sent-75, score-1.212]

24 Each group contains four domains within which generative and discriminative subspaces are analyzed for cross-domain information transfer. [sent-76, score-1.343]

25 Such an analysis on groups at all three levels conveys top-down information on what visually similar and distinct information looks like among image collections that trend progressively from global to local. [sent-77, score-0.329]

26 Modeling Cross-domain Information Transfer Creating Domains: Assuming that X corresponds to images from all over the earth, we flatten the earth and create domains D in a three-level hierarchical fashion. [sent-80, score-0.655]

27 The first level domains D1 to D4 correspond to images from four quadrants (with each quadrant covering 90 degrees in latitude and 180 degrees in longitude) of the flattened earth, and let group G1 represent the collection of all four domains. [sent-81, score-0.846]

28 The second level domains D5 to D20 are obtained by splitting each first level domain into four quadrants, and we thus obtain four groups Gi = {Dj : j = 4(i-1)+1, ..., 4i}, i = 2 to 5. [sent-82, score-0.798]

29 Similarly we obtain the third level domains D21 to D84 by splitting each of the second level domains into four quadrants, with which we constitute 16 groups. [sent-83, score-1.22]

30 So we have a total of c = 84 domains D = {Di}_{i=1}^{84} that are split into 21 groups G = {Gi}_{i=1}^{21} containing four domains each, which represents image collections pertaining to location neighborhoods that trend progressively from global to local. [sent-85, score-1.617]
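
As a rough illustration of the quadrant-based grouping described in the sentences above, the sketch below assigns each image's (latitude, longitude) to one domain per level, yielding domains D1..D4, D5..D20 and D21..D84. The row-major indexing convention and the boundary handling are assumptions; the excerpt only specifies the three-level quadrant split.

```python
import numpy as np

def assign_domains(lats, lons):
    """Assign each (latitude, longitude) pair to one quadrant-based domain per level.

    Level 1: 2 x 2 grid over the flattened earth -> domains 1..4
    Level 2: each level-1 quadrant split in four -> domains 5..20
    Level 3: each level-2 quadrant split in four -> domains 21..84
    Returns an (n, 3) integer array with the domain index at each level.
    """
    lats = np.asarray(lats, dtype=float)    # expected in [-90, 90]
    lons = np.asarray(lons, dtype=float)    # expected in [-180, 180)
    out = np.zeros((len(lats), 3), dtype=int)
    first_index = {1: 1, 2: 5, 3: 21}       # index of the first domain at each level
    for level in (1, 2, 3):
        bins = 2 ** level                   # 2, 4, 8 cells per axis
        row = np.clip(((lats + 90.0) / 180.0 * bins).astype(int), 0, bins - 1)
        col = np.clip(((lons + 180.0) / 360.0 * bins).astype(int), 0, bins - 1)
        out[:, level - 1] = first_index[level] + row * bins + col
    return out
```

The 21 groups G would then be formed by collecting the four child domains of each parent cell (plus the single level-1 group of D1..D4), each group holding four domains.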

31 We now model how visually similar and different information transforms across domains within each group Gi. [sent-86, score-0.764]

32 We perform PCA on each domain Di to obtain a d × N orthonormal matrix whose column space denotes the generative subspace Si1. [sent-89, score-0.43]

33 We obtain the discriminative subspace pertaining to each Di by considering a one-vs. [sent-90, score-0.316]

34 other three domains in the group that Di belongs to) and performing a two-class PLS to obtain a d × N orthonormal matrix whose column space corresponds to the discriminative subspace Si2. [sent-93, score-1.019]

35 Let S = {Si1}_{i=1}^{84} ∪ {Sj2}_{j=1}^{84} refer to the collection of generative and discriminative subspaces obtained from G. [sent-95, score-0.629]
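
A minimal sketch of how the generative subspace Si1 and the discriminative subspace Si2 of one domain could be computed, assuming scikit-learn's PCA and PLSRegression. The paper only states that PCA and a two-class PLS are used, so the choice of x_weights_ as the PLS basis and the QR orthonormalization are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression

def generative_subspace(X_dom, N):
    """Generative subspace Si1: top-N PCA basis of one domain's features
    (X_dom has one d-dimensional feature vector per row).  Returns d x N."""
    return PCA(n_components=N).fit(X_dom).components_.T   # orthonormal columns

def discriminative_subspace(X_dom, X_rest, N):
    """Discriminative subspace Si2: two-class PLS separating one domain from
    the other three domains in its group, orthonormalized with QR."""
    X = np.vstack([X_dom, X_rest])
    y = np.concatenate([np.ones(len(X_dom)), -np.ones(len(X_rest))])
    pls = PLSRegression(n_components=N, scale=False).fit(X, y)
    Q, _ = np.linalg.qr(pls.x_weights_)                    # d x N orthonormal basis
    return Q
```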

36 The problem of modeling cross-domain information now translates into: (i) analyzing the space of these N-dimensional subspaces in R^d to study the transfer of visually generic (resp. [sent-96, score-0.566]

37 discriminative) subspaces within each group Gi, and (ii) embedding this information transfer onto each individual training data xi to obtain a new representation f1(xi) that is cognizant of the cross-domain variations. [sent-98, score-0.714]

38 Grassmann Manifold: Before starting our analysis, we note that the space of subspaces is non-Euclidean, and it can be characterized by the Grassmann manifold [6]. [sent-99, score-0.401]

39 The Grassmannian Gd,N is an analytic manifold: the space of all N-dimensional subspaces of R^d containing the origin. [sent-100, score-0.401]
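
Because the analysis lives on the Grassmannian, a subspace distance is needed in several later steps (the nearest-neighbor rule and the d¯2 distance used for clustering). A standard choice is the arc-length distance built from principal angles, sketched below; whether this matches the paper's d¯2 exactly is an assumption.

```python
import numpy as np

def principal_angles(S1, S2):
    """Principal angles between two subspaces given as d x N orthonormal bases."""
    sigma = np.linalg.svd(S1.T @ S2, compute_uv=False)
    return np.arccos(np.clip(sigma, -1.0, 1.0))

def grassmann_distance(S1, S2):
    """Arc-length distance on the Grassmannian: the 2-norm of the principal angles."""
    return np.linalg.norm(principal_angles(S1, S2))
```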

40 1 Analyzing Information Flow Between Subspaces We first learn how information transforms across different domains within a group. [sent-106, score-0.646]

41 For this we consider a pair of generative (or discriminative) subspaces in that group, although the following analysis can be extended beyond a pair of subspaces. [sent-107, score-0.522]

42 S1 and S2 could either correspond to a pair of generative subspaces Si1 and Sj1 within a group, or a pair of discriminative subspaces Si2 and Sj2 within a group. [sent-110, score-1.013]

43 In our analysis we have 6 pairs of generative and 6 pairs of discriminative subspaces within each group (since a group has four domains), thereby making 12 × 21 subspace pairs in all. [sent-111, score-1.001]

44 While one could consider a pair made of a generative subspace Si1 and a discriminative subspace Sj2, we did not pursue that since the information contained in such a pair is different. [sent-112, score-0.624]

45 Now to obtain the geodesic flow between S1 and S2, we compute the direction matrix A such that the geodesic along that direction, starting from S1, reaches S2 in unit time. [sent-122, score-0.264]

46 Using the information contained in A, we can ‘sample’ points along the geodesic to understand how information transforms between different domains. [sent-124, score-0.239]

47 denote the number of subspaces obtained from a geodesic, which includes S1, S2 and all intermediate subspaces sampled between them. [sent-129, score-0.704]
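
The following numpy sketch computes a direction matrix A via the standard Grassmann logarithm and samples intermediate subspaces along the geodesic from S1 to S2. It uses the usual exponential-map parameterization rather than [9]'s exact construction, and it assumes the principal angles between S1 and S2 are strictly below 90 degrees so that S1^T S2 is invertible.

```python
import numpy as np

def grassmann_log(S1, S2):
    """Direction matrix A (tangent vector at S1) whose geodesic reaches the
    subspace spanned by S2 at t = 1.  S1, S2 are d x N orthonormal bases."""
    proj = S2 @ np.linalg.inv(S1.T @ S2)            # d x N
    M = proj - S1 @ (S1.T @ proj)                   # (I - S1 S1^T) S2 (S1^T S2)^-1
    U, sig, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.arctan(sig)) @ Vt                # A = U arctan(Sigma) V^T

def sample_geodesic(S1, S2, num=8):
    """Return `num` orthonormal bases sampled along the geodesic from S1 to S2
    (both endpoints included), using the Grassmann exponential map."""
    A = grassmann_log(S1, S2)
    U, theta, Vt = np.linalg.svd(A, full_matrices=False)
    samples = []
    for t in np.linspace(0.0, 1.0, num):
        Yt = S1 @ (Vt.T * np.cos(t * theta)) @ Vt + (U * np.sin(t * theta)) @ Vt
        Q, _ = np.linalg.qr(Yt)                     # re-orthonormalize for numerical stability
        samples.append(Q)
    return samples
```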

48 This process, when repeated between all pairs of generative (resp. [sent-130, score-0.205]

49 2 Embedded Data Representation We then embed this information onto the training data by projecting each xi on all c1 subspaces to result in a matrix Mi. [sent-137, score-0.575]

50 By repeating the above process for the entire training set X, we obtain n points on GN,N1 having location information yi associated with them. [sent-145, score-0.286]
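
A sketch of the embedding f1: each feature vector is projected onto every sampled subspace, the projection coefficients are stacked into the matrix Mi, and an orthonormal basis of its top-N1 column span is returned as a point on the Grassmannian G(N, N1). Using an SVD to extract that basis is an assumption; the excerpt only says that the column space of Mi is used.

```python
import numpy as np

def embed_image(x, sampled_subspaces, N1):
    """Cross-domain embedding f1(x): project the d-dimensional feature vector x
    onto every sampled d x N subspace basis, stack the N-dimensional projection
    coefficients as columns of M, and return an orthonormal basis of the top-N1
    left singular directions of M, i.e. a point on the Grassmannian G(N, N1)."""
    M = np.stack([S.T @ x for S in sampled_subspaces], axis=1)  # shape (N, c1)
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :N1]                                            # shape (N, N1)
```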

51 Performing Location Inference We now train a classifier f2 by performing statistics over the point cloud f1(xi)’s on GN,N1 , to recognize location yt of the query xt. [sent-148, score-0.395]

52 However, since the number of locations m is generally much higher than the amount of data available at each location, we discriminate between the domains instead. [sent-155, score-0.656]

53 We use the c′ = 64 domains from the third level since they provide the finest location grouping of images xi among all three levels of the hierarchy (Sec 2. [sent-157, score-0.919]

54 The query location yt is then inferred by first computing the matrix Mt from xt using the procedure described in Sec 2. [sent-161, score-0.423]
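
A simplified version of the inference step: the paper first learns a discriminative space f2 over the c′ = 64 third-level domains (its Algorithm 3) and then selects the location of the nearest training embedding. The sketch below keeps only the final nearest-neighbor rule, using a principal-angle subspace distance, which is a deliberate simplification of that pipeline.

```python
import numpy as np

def subspace_distance(F1, F2):
    """Principal-angle (arc-length) distance between two orthonormal bases."""
    sigma = np.linalg.svd(F1.T @ F2, compute_uv=False)
    return np.linalg.norm(np.arccos(np.clip(sigma, -1.0, 1.0)))

def infer_location(F_query, F_train, train_locations):
    """Return the ground-truth location y_i of the training embedding f1(x_i)
    that is closest to the embedded query f1(x_t) on the Grassmannian."""
    dists = [subspace_distance(F_query, F) for F in F_train]
    return train_locations[int(np.argmin(dists))]
```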

55 Step 1: Grouping training data xi from four unique locations yi (m = 4; a specific mountain, wetland, city, desert) into three domains D (c = 3). [sent-166, score-0.845]

56 These domains may contain visually dissimilar images as we do only a coarse grouping. [sent-167, score-0.635]

57 Assume these three domains are combined into a single group G. [sent-168, score-0.631]

58 Step 2: Obtaining generative (red) and discriminative (green) subspaces from these domains, and sampling points (yellow) along the geodesic between them (solid and dashed lines, resp. [sent-169, score-0.744]

59 Step 3: Projecting each training data xi onto these subspaces to obtain an embedded representation f1(xi) - colored ovals (based on yi): black-city, orange-wetland, white-mountain, purple-desert. [sent-171, score-0.55]

60 Step 4: Learning a discriminative space f2 using Algo 3 (red ellipses) on f1(xi) grouped by their domains (c′ [sent-172, score-0.681]

61 = c here), to infer location yt of f1(xt) (brown oval) derived from query xt. [sent-173, score-0.391]

62 and finally selecting the location yi of the nearest neighbor from Ftrain. [sent-174, score-0.278]

63 1 Clustering Besides location ‘recognition’, there could be cases where the data is not labeled. [sent-178, score-0.195]

64 Experiments We evaluate the method on two datasets, im2GPS [11], and San Francisco [3], for location recognition and clustering and present an analysis of relative merits of some design choices involved in our approach. [sent-202, score-0.365]

65 We created domains D using the procedure outlined in Section 2.1, [sent-213, score-0.61]

66 then modeled cross-domain information transfer using the geometry of subspaces derived from the domains (Sec 2. [sent-214, score-1.031]

67 The classifier f2 was trained on c′ = 64 domains, with which the query location was inferred. [sent-220, score-0.889]

68 We then repeated the above process but with the classifier f2 trained on even finer domains, by first splitting each of the 64 domains vertically into two (c′ [sent-223, score-0.655]

69 Observations: It can be seen that our method performs better overall, even by using only two features (out of the original seven), which shows the utility of the joint generative and discriminative information captured by our model. [sent-228, score-0.459]

70 Another observation is that the recognition improves with finer grouping of domains, which is intuitive since such domains are more representative of finer locations. [sent-230, score-0.758]

71 In Figure 5(c) we report the location recognition performance on two other test sets, 2K random and geographically uniform, that are provided as a part of the im2GPS dataset. [sent-231, score-0.386]

72 Utility of hierarchical formation of domains, and creating groups from them: We now study two alternate strategies to create and analyze domains D as opposed to the scheme discussed in Sec 2. [sent-233, score-0.672]

73 In the first setting we do not pursue a hierarchical scheme and use just the domains from the third level along with their grouping. [sent-235, score-0.615]

74 So we have 64 domains {Di}_{i=21}^{84} that are consolidated into 16 groups {Gi} (from Sec 2. [sent-236, score-0.638]

75 We then create generative and discriminative subspaces and analyze the geodesics between them, as described earlier. [sent-238, score-0.697]

76 (b) Similar trends are observed in the location retrieval of query images on the default test set. [sent-246, score-0.346]

77 Results with 64 and 128 domains on these two test sets are given in the supplementary material. [sent-249, score-0.608]

78 It can be seen that the famous locations (top two rows) have retrievals that are both visually and geographically similar, while the retrievals for rows 4 and 5 are visually similar but geographically varying. [sent-253, score-0.66]

79 In the second setting we discard the grouping information from Case-A1 and consider generative and discriminative subspace pairs among all 64 domains. [sent-255, score-0.466]

80 Discriminative subspaces in this case are obtained by a two-class PLS in a one-vs-remaining (63 domains) setting. [sent-256, score-0.352]

81 We have 4032 subspace pairs in this case (64C2 = 2016 each for generative pairs and discriminative pairs). [sent-257, score-0.459]

82 This suggests that a top-down mechanism of obtaining domains is better, and for analyzing subspaces across domains it is important to have some supervision (in terms of groups Gi) in modeling visual properties across locations. [sent-262, score-1.71]

83 Case-A1 and A2 deal with the domain and group creation aspect, whereas Case-B1 and B2 deal with obtaining the embedded representation f1(xi) using Euclidean tools instead of geometry-driven ones. [sent-268, score-0.234]

84 Utility of considering column space of Mi to perform location recognition: We then address the utility of obtaining the embedded cross-domain representation f1(xi) by considering the column span of matrix Mi. [sent-270, score-0.518]

85 1 Clustering We then performed a clustering experiment to account for cases where the data xi may not have location information yi. [sent-283, score-0.406]

86 We first created 64 random groupings of the data into domains D. [sent-285, score-0.61]

87 We then modeled cross-domain information by projecting each data xi ∈ X onto the geodesic between these subspaces to obtain f1(xi), and performed k-means clustering (Sec 2. [sent-287, score-0.726]

88 We computed the geolocation error for each xi by picking out the four closest neighbors of f1(xi) from its cluster (using d¯2), computing the error between the ‘ground truth’ location yi and the ground-truth locations of its four neighbors, and taking the average of those four values. [sent-290, score-0.692]
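
The clustering evaluation can be mimicked as below: k-means over a Euclidean embedding of the Grassmann points, then the average great-circle error to each image's four nearest within-cluster neighbors. Flattening the projection matrices F F^T so that standard k-means applies, and measuring geolocation error with the haversine formula, are assumptions not stated in the excerpt (which uses d¯2 and does not specify the error units).

```python
import numpy as np
from sklearn.cluster import KMeans

def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = np.radians([p[0], p[1], q[0], q[1]])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * 6371.0 * np.arcsin(np.sqrt(a))

def clustering_geolocation_error(embeddings, locations, n_clusters=64, k=4):
    """Cluster the embedded representations f1(x_i) and score each image by the
    mean distance between its ground-truth location and those of its k nearest
    neighbours within the same cluster (the evaluation described above)."""
    flat = np.stack([(F @ F.T).ravel() for F in embeddings])   # extrinsic embedding
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(flat)
    errors = []
    for i in range(len(embeddings)):
        same = [j for j in range(len(embeddings)) if labels[j] == labels[i] and j != i]
        if not same:
            continue
        nearest = sorted(same, key=lambda j: np.linalg.norm(flat[i] - flat[j]))[:k]
        errors.append(np.mean([haversine_km(locations[i], locations[j]) for j in nearest]))
    return float(np.mean(errors))
```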

89 While the clustering accuracy is not very high, we are still able to infer approximate locations without any labeled data, for a problem where visually similar images can come from vastly different locations. [sent-292, score-0.284]

90 We then created domains D and the corresponding groups G by partitioning the rectangular grid covering the city. [sent-298, score-0.674]

91 We then learnt f1 and f2 from the procedure described before to infer the locations of the test set containing 803 query images. [sent-300, score-0.234]

92 When the GPS option is used, we infer query location by computing nearest neighbors (Algorithm 3) from the training data pertaining to the domain of the query (obtained from its ground truth) and to the four domains adjacent to it. [sent-302, score-1.291]
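
When the GPS prior is available, the candidate set can be restricted to the query's own domain plus its four adjacent domains before running the nearest-neighbor rule from the earlier sketch. The helper below is hypothetical (the paper does not give this code), and `domain_adjacency` is assumed to be precomputed from the grid of domains.

```python
def gps_prior_candidates(query_domain, domain_adjacency, train_domains):
    """Indices of training images whose domain is the query's own domain or one
    of its four adjacent domains; the nearest-neighbour search is then run only
    over these candidate training embeddings."""
    allowed = {query_domain, *domain_adjacency[query_domain]}
    return [i for i, d in enumerate(train_domains) if d in allowed]
```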

93 Conclusion We proposed a top-down approach to jointly model generative and discriminative information portrayed by the data and demonstrated its utility for the challenging problem of location recognition and clustering, where the visual and location properties of images may not always correlate. [sent-308, score-0.882]

94 The competitive results obtained on two public datasets, along with an empirical analysis on the utility of certain design choices, seem to suggest the importance of modeling tools that are cognizant of the underlying geometric space of the data they operate on. [sent-309, score-0.267]

95 Visual word based location recognition in 3d models using distance augmented weighting. [sent-364, score-0.228]

96 (b) For every image, we compute the difference of its ‘ground truth’ location with the ground truth location of its four nearest neighbors, and consider the average of these location errors. [sent-401, score-0.66]

97 The k-means clustering and the corresponding random grouping of data into domains D was repeated 10 times and the average location errors are plotted. [sent-402, score-0.89]

98 Results using 64 and 128 domains are given in the supplementary material. [sent-407, score-0.608]

99 Modeling and recognition of landmark image collections using iconic scene graphs. [sent-437, score-0.25]

100 Joint people, event, and location recognition in personal photo collections using cross-domain context. [sent-456, score-0.3]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('domains', 0.574), ('subspaces', 0.352), ('location', 0.195), ('generative', 0.17), ('geographically', 0.158), ('sec', 0.146), ('utility', 0.14), ('query', 0.12), ('geodesic', 0.115), ('subspace', 0.112), ('discriminative', 0.107), ('landmark', 0.106), ('mi', 0.103), ('pls', 0.102), ('xi', 0.099), ('pertaining', 0.097), ('francisco', 0.087), ('grassmann', 0.085), ('locations', 0.082), ('mobile', 0.073), ('collections', 0.072), ('clustering', 0.07), ('groups', 0.064), ('transfer', 0.063), ('visually', 0.061), ('san', 0.06), ('gps', 0.06), ('group', 0.057), ('grassmannian', 0.056), ('seven', 0.054), ('quadrants', 0.054), ('geographic', 0.054), ('cognizant', 0.053), ('pfi', 0.053), ('retrievals', 0.053), ('pca', 0.053), ('city', 0.052), ('exponential', 0.052), ('embedded', 0.051), ('grouping', 0.051), ('finer', 0.05), ('earth', 0.049), ('yi', 0.049), ('manifold', 0.049), ('gi', 0.049), ('analyzing', 0.048), ('distinct', 0.048), ('onto', 0.048), ('domain', 0.047), ('latitude', 0.047), ('pci', 0.047), ('di', 0.046), ('yt', 0.044), ('pages', 0.043), ('information', 0.042), ('pursue', 0.041), ('tools', 0.041), ('four', 0.041), ('epitomic', 0.041), ('longitude', 0.041), ('contained', 0.04), ('pursuing', 0.039), ('vastly', 0.039), ('iconic', 0.039), ('chen', 0.038), ('obtaining', 0.038), ('choices', 0.038), ('vedantham', 0.037), ('orthonormal', 0.037), ('created', 0.036), ('grzeszczuk', 0.036), ('srivastava', 0.036), ('concentration', 0.036), ('performing', 0.036), ('experimented', 0.035), ('pairs', 0.035), ('baatz', 0.035), ('gopalan', 0.035), ('nearest', 0.034), ('matrix', 0.034), ('supplementary', 0.034), ('famous', 0.034), ('geodesics', 0.034), ('analyze', 0.034), ('reproduced', 0.033), ('prioritized', 0.033), ('recognition', 0.033), ('public', 0.033), ('discriminant', 0.032), ('correspond', 0.032), ('infer', 0.032), ('splitting', 0.031), ('neighbors', 0.031), ('default', 0.031), ('across', 0.03), ('column', 0.03), ('xt', 0.03), ('chose', 0.03), ('amongst', 0.029), ('merits', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999964 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering

Author: Raghuraman Gopalan

Abstract: Estimating geographic location from images is a challenging problem that is receiving recent attention. In contrast to many existing methods that primarily model discriminative information corresponding to different locations, we propose joint learning of information that images across locations share and vary upon. Starting with generative and discriminative subspaces pertaining to domains, which are obtained by a hierarchical grouping of images from adjacent locations, we present a top-down approach that first models cross-domain information transfer by utilizing the geometry of these subspaces, and then encodes the model results onto individual images to infer their location. We report competitive results for location recognition and clustering on two public datasets, im2GPS and San Francisco, and empirically validate the utility of various design choices involved in the approach.

2 0.24762967 215 cvpr-2013-Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces

Author: Shaokang Chen, Conrad Sanderson, Mehrtash T. Harandi, Brian C. Lovell

Abstract: Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters to represent different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to have resemblance to the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).

3 0.21490853 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries

Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa

Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the methodperforms onparor better than competitive stateof-the-art methods.

4 0.20785192 135 cvpr-2013-Discriminative Subspace Clustering

Author: Vasileios Zografos, Liam Ellis, Rudolf Mester

Abstract: We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.

5 0.18709667 253 cvpr-2013-Learning Multiple Non-linear Sub-spaces Using K-RBMs

Author: Siddhartha Chandra, Shailesh Kumar, C.V. Jawahar

Abstract: Understanding the nature of data is the key to building good representations. In domains such as natural images, the data comes from very complex distributions which are hard to capture. Feature learning intends to discover or best approximate these underlying distributions and use their knowledge to weed out irrelevant information, preserving most of the relevant information. Feature learning can thus be seen as a form of dimensionality reduction. In this paper, we describe a feature learning scheme for natural images. We hypothesize that image patches do not all come from the same distribution, they lie in multiple nonlinear subspaces. We propose a framework that uses K Restricted Boltzmann Machines (K-RBMS) to learn multiple non-linear subspaces in the raw image space. Projections of the image patches into these subspaces gives us features, which we use to build image representations. Our algorithm solves the coupled problem of finding the right non-linear subspaces in the input space and associating image patches with those subspaces in an iterative EM like algorithm to minimize the overall reconstruction error. Extensive empirical results over several popular image classification datasets show that representations based on our framework outperform the traditional feature representations such as the SIFT based Bag-of-Words (BoW) and convolutional deep belief networks.

6 0.17979573 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation

7 0.17640871 237 cvpr-2013-Kernel Learning for Extrinsic Classification of Manifold Features

8 0.17221385 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources

9 0.1526197 233 cvpr-2013-Joint Sparsity-Based Representation and Analysis of Unconstrained Activities

10 0.13974793 405 cvpr-2013-Sparse Subspace Denoising for Image Manifolds

11 0.12571014 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

12 0.12305457 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

13 0.12016556 6 cvpr-2013-A Comparative Study of Modern Inference Techniques for Discrete Energy Minimization Problems

14 0.11964028 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

15 0.11292444 456 cvpr-2013-Visual Place Recognition with Repetitive Structures

16 0.10941348 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories

17 0.10491274 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval

18 0.097377196 379 cvpr-2013-Scalable Sparse Subspace Clustering

19 0.096496366 109 cvpr-2013-Dense Non-rigid Point-Matching Using Random Projections

20 0.093157053 99 cvpr-2013-Cross-View Image Geolocalization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.212), (1, -0.06), (2, -0.082), (3, 0.051), (4, 0.036), (5, -0.004), (6, -0.064), (7, -0.12), (8, -0.05), (9, -0.077), (10, 0.016), (11, -0.021), (12, -0.071), (13, -0.095), (14, -0.109), (15, -0.062), (16, -0.095), (17, -0.058), (18, -0.122), (19, -0.137), (20, 0.135), (21, -0.122), (22, 0.038), (23, -0.008), (24, -0.08), (25, -0.096), (26, 0.068), (27, -0.159), (28, 0.001), (29, 0.02), (30, 0.046), (31, 0.06), (32, 0.078), (33, -0.015), (34, -0.06), (35, -0.065), (36, 0.051), (37, -0.009), (38, -0.13), (39, 0.108), (40, -0.015), (41, -0.007), (42, -0.073), (43, 0.035), (44, -0.052), (45, 0.084), (46, 0.046), (47, -0.083), (48, -0.053), (49, -0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.94331092 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering

Author: Raghuraman Gopalan

Abstract: Estimating geographic location from images is a challenging problem that is receiving recent attention. In contrast to many existing methods that primarily model discriminative information corresponding to different locations, we propose joint learning of information that images across locations share and vary upon. Starting with generative and discriminative subspaces pertaining to domains, which are obtained by a hierarchical grouping of images from adjacent locations, we present a top-down approach that first models cross-domain information transfer by utilizing the geometry of these subspaces, and then encodes the model results onto individual images to infer their location. We report competitive results for location recognition and clustering on two public datasets, im2GPS and San Francisco, and empirically validate the utility of various design choices involved in the approach.

2 0.77934182 215 cvpr-2013-Improved Image Set Classification via Joint Sparse Approximated Nearest Subspaces

Author: Shaokang Chen, Conrad Sanderson, Mehrtash T. Harandi, Brian C. Lovell

Abstract: Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters to represent different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to have resemblance to the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).

3 0.71859139 135 cvpr-2013-Discriminative Subspace Clustering

Author: Vasileios Zografos, Liam Ellis, Rudolf Mester

Abstract: We present a novel method for clustering data drawn from a union of arbitrary dimensional subspaces, called Discriminative Subspace Clustering (DiSC). DiSC solves the subspace clustering problem by using a quadratic classifier trained from unlabeled data (clustering by classification). We generate labels by exploiting the locality of points from the same subspace and a basic affinity criterion. A number of classifiers are then diversely trained from different partitions of the data, and their results are combined together in an ensemble, in order to obtain the final clustering result. We have tested our method with 4 challenging datasets and compared against 8 state-of-the-art methods from literature. Our results show that DiSC is a very strong performer in both accuracy and robustness, and also of low computational complexity.

4 0.65604144 405 cvpr-2013-Sparse Subspace Denoising for Image Manifolds

Author: Bo Wang, Zhuowen Tu

Abstract: With the increasing availability of high dimensional data and demand in sophisticated data analysis algorithms, manifold learning becomes a critical technique to perform dimensionality reduction, unraveling the intrinsic data structure. The real-world data however often come with noises and outliers; seldom, all the data live in a single linear subspace. Inspired by the recent advances in sparse subspace learning and diffusion-based approaches, we propose a new manifold denoising algorithm in which data neighborhoods are adaptively inferred via sparse subspace reconstruction; we then derive a new formulation to perform denoising to the original data. Experiments carried out on both toy and real applications demonstrate the effectiveness of our method; it is insensitive to parameter tuning and we show significant improvement over the competing algorithms.

5 0.60998607 109 cvpr-2013-Dense Non-rigid Point-Matching Using Random Projections

Author: Raffay Hamid, Dennis Decoste, Chih-Jen Lin

Abstract: We present a robust and efficient technique for matching dense sets of points undergoing non-rigid spatial transformations. Our main intuition is that the subset of points that can be matched with high confidence should be used to guide the matching procedure for the rest. We propose a novel algorithm that incorporates these high-confidence matches as a spatial prior to learn a discriminative subspace that simultaneously encodes both the feature similarity as well as their spatial arrangement. Conventional subspace learning usually requires spectral decomposition of the pair-wise distance matrix across the point-sets, which can become inefficient even for moderately sized problems. To this end, we propose the use of random projections for approximate subspace learning, which can provide significant time improvements at the cost of minimal precision loss. This efficiency gain allows us to iteratively find and remove high-confidence matches from the point sets, resulting in high recall. To show the effectiveness of our approach, we present a systematic set of experiments and results for the problem of dense non-rigid image-feature matching.

6 0.60306734 253 cvpr-2013-Learning Multiple Non-linear Sub-spaces Using K-RBMs

7 0.5754379 379 cvpr-2013-Scalable Sparse Subspace Clustering

8 0.52769518 259 cvpr-2013-Learning a Manifold as an Atlas

9 0.51822895 191 cvpr-2013-Graph-Laplacian PCA: Closed-Form Solution and Robustness

10 0.51224571 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation

11 0.49312383 237 cvpr-2013-Kernel Learning for Extrinsic Classification of Manifold Features

12 0.48807189 260 cvpr-2013-Learning and Calibrating Per-Location Classifiers for Visual Place Recognition

13 0.48088631 42 cvpr-2013-Analytic Bilinear Appearance Subspace Construction for Modeling Image Irradiance under Natural Illumination and Non-Lambertian Reflectance

14 0.46343324 189 cvpr-2013-Graph-Based Discriminative Learning for Location Recognition

15 0.46233734 276 cvpr-2013-MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification

16 0.46177959 134 cvpr-2013-Discriminative Sub-categorization

17 0.46101242 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries

18 0.45290488 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning

19 0.44999465 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources

20 0.44668353 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(7, 0.011), (10, 0.108), (16, 0.03), (26, 0.069), (28, 0.012), (33, 0.31), (38, 0.039), (59, 0.021), (67, 0.056), (68, 0.063), (69, 0.09), (87, 0.101)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.96494269 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics

Author: Bo Zheng, Yibiao Zhao, Joey C. Yu, Katsushi Ikeuchi, Song-Chun Zhu

Abstract: In this paper, we present an approach for scene understanding by reasoning physical stability of objects from point cloud. We utilize a simple observation that, by human design, objects in static scenes should be stable with respect to gravity. This assumption is applicable to all scene categories and poses useful constraints for the plausible interpretations (parses) in scene understanding. Our method consists of two major steps: 1) geometric reasoning: recovering solid 3D volumetric primitives from defective point cloud; and 2) physical reasoning: grouping the unstable primitives to physically stable objects by optimizing the stability and the scene prior. We propose to use a novel disconnectivity graph (DG) to represent the energy landscape and use a Swendsen-Wang Cut (MCMC) method for optimization. In experiments, we demonstrate that the algorithm achieves substantially better performance for i) object segmentation, ii) 3D volumetric recovery of the scene, and iii) better parsing result for scene understanding in comparison to state-of-the-art methods in both public dataset and our own new dataset.

2 0.96476626 194 cvpr-2013-Groupwise Registration via Graph Shrinkage on the Image Manifold

Author: Shihui Ying, Guorong Wu, Qian Wang, Dinggang Shen

Abstract: Recently, groupwise registration has been investigated for simultaneous alignment of all images without selecting any individual image as the template, thus avoiding the potential bias in image registration. However, none of current groupwise registration method fully utilizes the image distribution to guide the registration. Thus, the registration performance usually suffers from large inter-subject variations across individual images. To solve this issue, we propose a novel groupwise registration algorithm for large population dataset, guided by the image distribution on the manifold. Specifically, we first use a graph to model the distribution of all image data sitting on the image manifold, with each node representing an image and each edge representing the geodesic pathway between two nodes (or images). Then, the procedure of warping all images to theirpopulation center turns to the dynamic shrinking ofthe graph nodes along their graph edges until all graph nodes become close to each other. Thus, the topology ofimage distribution on the image manifold is always preserved during the groupwise registration. More importantly, by modeling , the distribution of all images via a graph, we can potentially reduce registration error since every time each image is warped only according to its nearby images with similar structures in the graph. We have evaluated our proposed groupwise registration method on both synthetic and real datasets, with comparison to the two state-of-the-art groupwise registration methods. All experimental results show that our proposed method achieves the best performance in terms of registration accuracy and robustness.

3 0.96230662 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities

Author: Horst Possegger, Sabine Sternig, Thomas Mauthner, Peter M. Roth, Horst Bischof

Abstract: Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume exploiting the full geometry and the objects ’ center of mass and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local mass density scores within a particle filter based approach, constrained by a Voronoi partitioning between nearby trackers. Our method benefits from the geometric knowledge given by the occupancy volume to robustly extract features and train classifiers on-demand, when volumetric information becomes unreliable. We evaluate our approach on several challenging real-world scenarios including the public APIDIS dataset. Experimental evaluations demonstrate significant improvements compared to state-of-theart methods, while achieving real-time performance. – –

4 0.962084 78 cvpr-2013-Capturing Layers in Image Collections with Componential Models: From the Layered Epitome to the Componential Counting Grid

Author: Alessandro Perina, Nebojsa Jojic

Abstract: Recently, the Counting Grid (CG) model [5] was developed to represent each input image as a point in a large grid of feature counts. This latent point is a corner of a window of grid points which are all uniformly combined to match the (normalized) feature counts in the image. Being a bag of word model with spatial layout in the latent space, the CG model has superior handling of field of view changes in comparison to other bag of word models, but with the price of being essentially a mixture, mapping each scene to a single window in the grid. In this paper we introduce a family of componential models, dubbed the Componential Counting Grid, whose members represent each input image by multiple latent locations, rather than just one. In this way, we make a substantially more flexible admixture model which captures layers or parts of images and maps them to separate windows in a Counting Grid. We tested the models on scene and place classification where their com- ponential nature helped to extract objects, to capture parallax effects, thus better fitting the data and outperforming Counting Grids and Latent Dirichlet Allocation, especially on sequences taken with wearable cameras.

5 0.96056044 350 cvpr-2013-Reconstructing Loopy Curvilinear Structures Using Integer Programming

Author: Engin Türetken, Fethallah Benmansour, Bjoern Andres, Hanspeter Pfister, Pascal Fua

Abstract: We propose a novel approach to automated delineation of linear structures that form complex and potentially loopy networks. This is in contrast to earlier approaches that usually assume a tree topology for the networks. At the heart of our method is an Integer Programming formulation that allows us to find the global optimum of an objective function designed to allow cycles but penalize spurious junctions and early terminations. We demonstrate that it outperforms state-of-the-art techniques on a wide range of datasets.

6 0.95711839 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

7 0.95650518 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images

8 0.95611799 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects

9 0.95590091 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds

10 0.95584989 19 cvpr-2013-A Minimum Error Vanishing Point Detection Approach for Uncalibrated Monocular Images of Man-Made Environments

11 0.95547366 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

12 0.95457852 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models

13 0.95436549 292 cvpr-2013-Multi-agent Event Detection: Localization and Role Assignment

same-paper 14 0.9541555 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering

15 0.95261919 56 cvpr-2013-Bayesian Depth-from-Defocus with Shading Constraints

16 0.95261073 331 cvpr-2013-Physically Plausible 3D Scene Tracking: The Single Actor Hypothesis

17 0.95258951 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection

18 0.95241076 80 cvpr-2013-Category Modeling from Just a Single Labeling: Use Depth Information to Guide the Learning of 2D Models

19 0.95209444 227 cvpr-2013-Intrinsic Scene Properties from a Single RGB-D Image

20 0.95172113 155 cvpr-2013-Exploiting the Power of Stereo Confidences