cvpr cvpr2013 cvpr2013-419 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jie Ni, Qiang Qiu, Rama Chellappa
Abstract: Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. We present experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art.
Introduction (excerpt): [...] training and testing data are captured from the same underlying distribution. Yet this assumption is often violated in many real life applications. For instance, images collected from an internet search engine are compared with those captured from real life [28, 4]. Face recognition systems trained on frontal and high resolution images are applied to probe images with non-frontal poses and low resolution [6]. Human actions are recognized from an unseen target view using training data taken from source views [21, 20]. We show some examples of dataset shifts in Figure 1. In these scenarios, magnitudes of variations of innate characteristics, which distinguish one class from another, are oftentimes smaller than the variations caused by distribution shift between training and testing dataset. Directly applying the classifier from the training set to testing set [...]
Reference: text
sentIndex sentText sentNum sentScore
1 Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. [sent-4, score-0.976]
2 This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. [sent-5, score-0.683]
3 We propose to interpolate subspaces through dictionary learning to link the source and target domains. [sent-6, score-0.797]
4 These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. [sent-7, score-0.995]
5 Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. [sent-8, score-0.752]
6 Human actions are recognized from an unseen target view using training data taken from source views [21, 20]. [sent-16, score-0.486]
7 Directly applying the classifier from the training set to testing set [...] experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art. [sent-19, score-0.432]
8 This is often known as the domain adaptation problem which has recently drawn much attention in the computer vision community [28, 14, 13, 17]. [sent-26, score-0.451]
9 Domain Adaptation (DA) aims to utilize a source domain with plenty of labeled data to learn a classifier for a target domain which is collected from a different distribution. [sent-27, score-1.257]
10 Semi-supervised DA leverages the few labels in the target data or correspondence between the source and target data to reduce the divergence between two domains. [sent-29, score-0.757]
11 Given labeled data in the source domain and unlabeled data in the target domain, our DA procedure learns a set of intermediate domains (represented by dictionaries $\{D_k\}_{k=1}^{K-1}$) and the target domain (represented by dictionary $D_K$) to capture the intrinsic domain shift between two domains. $\{\Delta D_k\}_{k=0}^{K-1}$ characterize the gradual transition between these subspaces. [sent-35, score-2.078]
13 As it is very costly to collect labels for target data under various acquisition conditions ‘in the wild’, it is more desirable that the recognition system be able to adapt in an unsupervised fashion. [sent-37, score-0.832]
14 In this paper, we use subspace representations to model the source and target domains. [sent-39, score-0.571]
15 In this work, we use a dictionary to represent one domain, as dictionary learning based methods [1, 24] have recently become very popular for subspace modeling. [sent-42, score-0.497]
16 Specifically, the presence of domain shifts violates the assumption that test data lie in the linear span of training data. [sent-48, score-0.394]
17 As the dictionary atoms learned from one domain are not optimal to fit a different domain, and only a small subset of the atoms are allowed for representation, it will incur large reconstruction errors for the target data. [sent-49, score-1.137]
18 Further, signals of the same class in the target domain will not have similar sparse codes as those from the source domain. [sent-50, score-0.942]
19 Therefore, effectively leveraging unlabeled target data to adapt the dictionary from one domain to another while maintaining a certain invariant representation becomes crucial for successful DA. [sent-52, score-0.91]
20 We hypothesize the existence of a virtual path which smoothly connects the source and target domains. [sent-55, score-0.578]
21 Imagine the source domain consists of face images in the frontal view while the target domain contains those in the profile view. [sent-56, score-1.393]
22 Intuitively, face images which gradually transform from the frontal to profile view will form a smooth transition path. [sent-57, score-0.391]
23 Recovering intermediate representations along the transition path allows us to more likely capture the underlying domain shift, as well as to build meaningful feature representations which are preserved across different domains. [sent-58, score-0.894]
24 Specifically, we sample several intermediate domains along a virtual path between the source and target domains, and represent each intermediate domain using a dictionary. [sent-60, score-1.527]
25 We then utilize the good reconstruction property of dictionaries, and learn the set of intermediate domain dictionaries which incrementally reduce the reconstruction residue of the target data. [sent-61, score-1.254]
26 In the meantime, we constrain the magnitude of changes between dictionaries for adjacent intermediate domains to ensure the smoothness of the transition path (refer to Figure 2 for an illustration). [sent-62, score-0.729]
27 (2) We then apply invariant sparse codes across the source, intermediate and target domains to render intermediate representations, which convey a smooth transition in the data signal space. [sent-63, score-0.972]
28 It also provides a shared feature representation where the sample differences caused by distribution shifts are reduced, and we utilize this new feature representation for cross domain recognition. [sent-64, score-0.486]
29 (3) We provide a quantification of domain shift by measuring the similarity between the source and target domain dictionaries which are learned using our DA approach. [sent-65, score-1.567]
30 Presented with multiple domains, this quantitative measure can be exploited to select the optimal domain to adapt to. [sent-66, score-0.41]
31 (4) We demonstrate the wide applicability of our approach for face recognition across pose, illumination and blur variations, cross dataset object recognition, and report the improved performance of our approach over existing DA methods. [sent-67, score-0.432]
32 In Section 3, we present our general unsupervised DA approach supported by a quantitative measure of domain shift. [sent-69, score-0.435]
33 Semi-supervised DA methods rely on labeled target data to perform cross domain classification. [sent-75, score-0.723]
34 Metric learning approaches [28, 18] were also proposed to learn a cross domain transformation to link two domains. [sent-79, score-0.419]
35 [17] utilized low-rank reconstructions to learn a transformation so that the transformed source samples can be linearly reconstructed by the target samples. [sent-81, score-0.528]
36 Given no labels in the target domain to learn the similarity measure between data instances across domains, unsupervised DA is more difficult to tackle. [sent-82, score-0.763]
37 Therefore it usually enforces certain prior assumptions to relate source and target data. [sent-83, score-0.486]
38 The techniques in [25, 26] reduce the distance across two domains by learning a latent feature space where domain similarity is measured through maximum mean discrepancy. [sent-86, score-0.59]
39 Shi and Sha [29] define an information-theoretic measure which balances between maximizing domain similarity and minimizing expected classification error on the target domain. [sent-87, score-0.65]
40 Two recent approaches [14], [13] in the computer vision community are more relevant to our methodology, where the source and target domains are linked by sampling a finite or infinite number of intermediate subspaces on the Grassmannian manifold. [sent-88, score-0.932]
41 These intermediate subspaces appear to be able to capture the intrinsic domain shift. [sent-89, score-0.627]
42 Compared to their abstract manifold walking strategies, our approach emphasizes synthesizing intermediate subspaces in a manner which gradually reduces the reconstruction residue of the target data. [sent-90, score-0.802]
43 Domain invariant sparse codes are designed for cross domain recognition, alignment and synthesis. [sent-92, score-0.492]
44 Let $Y_s \in \mathbb{R}^{n \times N_s}$, $Y_t \in \mathbb{R}^{n \times N_t}$ be the data instances from the source and target domain respectively, where n is the dimension of the data instance, and $N_s$ and $N_t$ denote the number of samples in the source and target domains. [sent-97, score-0.972]
45 Let $D_0 \in \mathbb{R}^{n \times m}$ be the dictionary learned from $Y_s$ using standard dictionary learning methods, e. [sent-98, score-0.484]
46 As introduced in Section 1, our approach samples several intermediate domains from a smooth transition path between the source and target domains. [sent-100, score-1.095]
47 We associate each intermediate domain with a dictionary $D_k$, $k \in [1, K]$, where K is the number of intermediate domains, which will be determined in our DA approach. [sent-101, score-1.002]
48 Learning Intermediate Domain Dictionaries: Starting from the source domain dictionary $D_0$, we sequentially learn the intermediate domain dictionaries $\{D_k\}_{k=1}^{K}$ to gradually adapt to the target data. [sent-104, score-1.871]
49 The final dictionary DK which best represents the target data in terms of reconstruction error is taken as the target domain dictionary. [sent-106, score-1.18]
50 Given the k-th domain dictionary $D_k$, $k \in [0, K-1]$, we learn the next domain dictionary $D_{k+1}$ based on its coherence with $D_k$ and the remaining residue of the target data. [sent-107, score-1.614]
51 Specifically, we decompose the target data $Y_t$ with $D_k$ and get the reconstruction residue $J_k$: $\Gamma_k = \arg\min_{\Gamma}$ ? [sent-108, score-0.494]
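The decomposition step can be illustrated with a small NumPy sketch. The objective is truncated in this extract, so the sketch assumes the usual sparse-coding form: each target sample is coded with at most T atoms of $D_k$ (T is the sparsity level named in Algorithm 1), and the residue is taken as $J_k = Y_t - D_k \Gamma_k$. The greedy Orthogonal Matching Pursuit coder and the names `omp`, `sparse_code` and `reconstruction_residue` are illustrative choices, not taken from the paper.

```python
import numpy as np

def omp(D, y, T):
    """Orthogonal Matching Pursuit: code y with at most T atoms (columns) of D.

    Assumes 1 <= T <= number of atoms.
    """
    alpha = np.zeros(D.shape[1])
    support, coef = [], np.zeros(0)
    residual = y.astype(float)
    atom_norms = np.linalg.norm(D, axis=0)
    atom_norms[atom_norms == 0] = 1.0  # avoid division by zero for empty atoms
    for _ in range(T):
        # Score atoms by normalized correlation with the current residual.
        corr = np.abs(D.T @ residual) / atom_norms
        if support:
            corr[support] = -np.inf  # an atom is selected at most once
        support.append(int(np.argmax(corr)))
        # Refit all selected coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        alpha[support] = coef
    return alpha

def sparse_code(D, Y, T):
    """Sparse codes Gamma (m x N) of the columns of Y under dictionary D."""
    return np.column_stack([omp(D, Y[:, i], T) for i in range(Y.shape[1])])

def reconstruction_residue(D, Y, Gamma):
    """Assumed residue definition: J = Y - D @ Gamma."""
    return Y - D @ Gamma

# Toy data standing in for a dictionary D_k and target samples Y_t.
rng = np.random.default_rng(0)
D_k = rng.standard_normal((20, 8))
Y_t = rng.standard_normal((20, 30))
Gamma_k = sparse_code(D_k, Y_t, T=3)
J_k = reconstruction_residue(D_k, Y_t, Gamma_k)
print(np.linalg.norm(J_k), np.linalg.norm(Y_t))
```

Because each column is refit by least squares on its selected atoms, the residue norm can never exceed the norm of the data itself, which is what makes it usable as a measure of how well $D_k$ fits the target domain.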
52 The next intermediate domain dictionary $D_{k+1}$ is then obtained as: $D_{k+1} = D_k + \Delta D_k$ (5). Note that when λ = 0, the Method of Optimal Directions (MOD) [12] becomes a special case of equation (3), where no regularization is enforced. [sent-130, score-0.437]
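Equation (3) itself is not reproduced in this extract. A form consistent with the remark about MOD is a ridge-regularized least-squares problem in the increment, $\min_{\Delta D_k} \|J_k - \Delta D_k \Gamma_k\|_F^2 + \lambda \|\Delta D_k\|_F^2$, whose minimizer is $\Delta D_k = J_k \Gamma_k^T (\lambda I + \Gamma_k \Gamma_k^T)^{-1}$. The sketch below assumes that form (it is an assumption, not a quotation of the paper); the function name `dictionary_increment` is hypothetical.

```python
import numpy as np

def dictionary_increment(J, Gamma, lam):
    """Minimize ||J - dD @ Gamma||_F^2 + lam * ||dD||_F^2 over dD (closed form)."""
    m = Gamma.shape[0]
    gram = Gamma @ Gamma.T + lam * np.eye(m)
    # Setting the gradient to zero gives dD @ gram = J @ Gamma.T.
    # pinv keeps the lam = 0 case usable when some atoms are never selected.
    return J @ Gamma.T @ np.linalg.pinv(gram)

# Toy data: dense codes are used here only to keep the example small.
rng = np.random.default_rng(0)
D_k = rng.standard_normal((6, 4))
Gamma_k = rng.standard_normal((4, 25))
Y_t = rng.standard_normal((6, 25))
J_k = Y_t - D_k @ Gamma_k

# Equation (5): the next dictionary is the current one plus the increment.
D_next = D_k + dictionary_increment(J_k, Gamma_k, lam=0.5)
```

Under this assumed form, setting λ = 0 gives $D_k + \Delta D_k = Y_t \Gamma_k^T (\Gamma_k \Gamma_k^T)^{-1}$, which is the MOD update, and a larger λ shrinks the increment, which is how the smoothness of the transition path would be controlled.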
53 Starting from the source domain dictionary D0, we apply the above adaptation framework iteratively, and stop the procedure when the magnitude of ? [sent-131, score-0.895]
54 two domains is absorbed into the learned intermediate domain dictionaries. [sent-135, score-0.767]
55 This stopping criterion also automatically gives the number of intermediate domains to sample from the transition path. [sent-136, score-0.571]
56 t the current intermediate domain dictionary and the encoding coefficients. [sent-140, score-0.794]
57 Algorithm 1 Algorithm to interpolate intermediate subspaces between source and target domains. [sent-150, score-0.776]
58 1: Input: Dictionary $D_0$ trained from the source data, target data $Y_t$, sparsity level T, stopping threshold δ, parameter λ, k = 0. [sent-151, score-0.536]
59 2: Output: Dictionaries $\{D_k\}_{k=1}^{K-1}$ for the intermediate domains, dictionary $D_K$ for the target domain. [sent-152, score-0.708]
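The inputs and outputs of Algorithm 1 suggest the following loop, given here as a minimal sketch rather than the authors' implementation. The intermediate steps of the algorithm are not included in this extract, so several details are assumptions: sparse coding uses a simple OMP routine, the increment is the ridge-regularized least-squares solution, and the procedure stops when the Frobenius norm of the increment falls to δ or below. The `max_steps` guard and all function names are hypothetical.

```python
import numpy as np

def omp(D, y, T):
    """Greedy sparse coding of y with at most T atoms of D (assumes 1 <= T <= atoms)."""
    alpha = np.zeros(D.shape[1])
    support, coef = [], np.zeros(0)
    residual = y.astype(float)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    for _ in range(T):
        corr = np.abs(D.T @ residual) / norms
        if support:
            corr[support] = -np.inf  # never reselect an atom
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        alpha[support] = coef
    return alpha

def interpolate_dictionaries(D0, Yt, T, lam, delta, max_steps=50):
    """Return [D_0, ..., D_K] and the residue norm measured at each step."""
    dictionaries = [D0]
    residue_norms = []
    for _ in range(max_steps):  # max_steps is a safety guard, not part of the source
        Dk = dictionaries[-1]
        # Decompose the target data with the current dictionary.
        Gamma = np.column_stack([omp(Dk, Yt[:, i], T) for i in range(Yt.shape[1])])
        J = Yt - Dk @ Gamma
        residue_norms.append(float(np.linalg.norm(J)))
        # Assumed update: ridge-regularized fit of the residue by dD @ Gamma.
        gram = Gamma @ Gamma.T + lam * np.eye(Gamma.shape[0])
        dD = J @ Gamma.T @ np.linalg.pinv(gram)
        # Assumed stopping rule: increment magnitude at or below the threshold.
        if np.linalg.norm(dD) <= delta:
            break
        dictionaries.append(Dk + dD)
    return dictionaries, residue_norms

rng = np.random.default_rng(0)
D0 = rng.standard_normal((12, 6))   # stands in for the source dictionary
Yt = rng.standard_normal((12, 40))  # stands in for unlabeled target data
dicts, norms = interpolate_dictionaries(D0, Yt, T=2, lam=1.0, delta=1e-3, max_steps=10)
print(len(dicts) - 1, "dictionaries learned after D_0")
```

The last dictionary in the returned list plays the role of $D_K$, and the length of the list gives K without it being fixed in advance.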
60 Recognition Under Domain Shift: Up to now, we have learned a transition path which is encoded with the underlying domain shift. [sent-159, score-0.585]
61 This provides us with rich information to obtain new representations to associate source and target data. [sent-160, score-0.532]
62 Here, we simply apply invariant sparse codes across the source, intermediate, target domain dictionaries $\{D_k\}_{k=0}^{K}$. [sent-161, score-0.879]
63 , $(D_K\alpha)^T]^T$ where $\alpha \in \mathbb{R}^m$ is the sparse code of a source data signal decomposed with $D_0$, or a target data signal decomposed with $D_K$. [sent-165, score-0.679]
64 This new representation incorporates the smooth domain transition recovered in the intermediate dictionaries into the signal space. [sent-166, score-0.894]
65 It brings the source and target data into a shared feature space where the data distribution shift is mitigated. [sent-167, score-0.643]
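The construction of the new representation can be sketched in a few lines. The sketch assumes the feature vector stacks $D_k\alpha$ for every dictionary from $D_0$ to $D_K$, which is what the surviving tail of the formula suggests; the function name `stacked_representation` and the toy sizes are illustrative.

```python
import numpy as np

def stacked_representation(dictionaries, alpha):
    """Stack D_k @ alpha for k = 0..K into one vector of length (K + 1) * n."""
    return np.concatenate([D @ alpha for D in dictionaries])

rng = np.random.default_rng(0)
n, m = 5, 4
# Three dictionaries standing in for D_0, D_1 and D_2 (so K = 2).
path = [rng.standard_normal((n, m)) for _ in range(3)]
alpha = np.array([0.0, 1.5, 0.0, -0.7])  # a sparse code with two active atoms
feature = stacked_representation(path, alpha)
print(feature.shape)
```

The same sparse code is reused with every dictionary, so a source sample and a target sample with similar codes end up with similar stacked vectors.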
66 Given the new feature vectors, we apply PCA for dimension reduction, and then employ an SVM classifier for cross domain recognition. [sent-169, score-0.419]
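The dimension reduction step can be sketched with a plain SVD-based PCA. This is a generic illustration: the extract does not say which samples the projection is fitted on or how many components are kept, and the classifier stage is left out here. The names `pca_fit` and `pca_transform` are hypothetical.

```python
import numpy as np

def pca_fit(X, d):
    """X holds one feature vector per row; return the mean and top-d principal axes."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:d]

def pca_transform(X, mean, components):
    """Project centered feature vectors onto the retained principal axes."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
features = rng.standard_normal((50, 15))  # 50 stacked feature vectors of length 15
mean, components = pca_fit(features, d=4)
reduced = pca_transform(features, mean, components)
print(reduced.shape)
```

The reduced vectors would then be passed to whatever classifier is used for recognition.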
67 For instance, we may be faced with more than one source domain in some scenarios. [sent-173, score-0.391]
68 QDS will allow us to select the optimal source domain which has the least domain shift w. [sent-174, score-1.056]
69 We propose to obtain QDS by measuring the similarity between the source domain dictionary D0 and the target domain dictionary DK which is learned using Algorithm 1. [sent-177, score-1.706]
70 This similarity characterizes the amount of domain shift encoded along the transition path. [sent-178, score-0.643]
71 en D0 and DK, and less domain shift along the learned transition path. [sent-183, score-0.647]
72 Similarly, by reversing the role of source and target domain to learn the transition path, we can obtain Qt,s which is the amount of shift from target to source domain. [sent-184, score-1.593]
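The exact QDS formula is not reproduced in this extract, so the sketch below uses a stand-in: the mean cosine similarity between corresponding atoms of $D_0$ and $D_K$. That choice is an assumption made only to illustrate how a dictionary similarity could drive source selection; the names `dictionary_similarity` and `select_source` are hypothetical.

```python
import numpy as np

def dictionary_similarity(D0, DK):
    """Stand-in measure: mean cosine similarity between corresponding atoms.

    Assumes no atom (column) is all zeros.
    """
    num = np.sum(D0 * DK, axis=0)
    den = np.linalg.norm(D0, axis=0) * np.linalg.norm(DK, axis=0)
    return float(np.mean(num / den))

def select_source(candidates):
    """candidates maps a source name to its (D0, DK) pair; pick the highest similarity."""
    return max(candidates, key=lambda name: dictionary_similarity(*candidates[name]))

rng = np.random.default_rng(0)
base = rng.standard_normal((20, 8))
candidates = {
    # Adapted dictionary stays close to the source one: little shift.
    "near": (base, base + 0.01 * rng.standard_normal((20, 8))),
    # Adapted dictionary is unrelated to the source one: large shift.
    "far": (base, rng.standard_normal((20, 8))),
}
print(select_source(candidates))
```

Swapping the two arguments of each pair corresponds to reversing the roles of source and target; with this symmetric stand-in the value does not change, whereas in the paper the reversed quantity comes from a separately learned transition path.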
73 We selected the frontal face images as the source domain, with a total of 1428 images. [sent-193, score-0.408]
74 The target domain contains images at different poses, which are denoted as c05 and c29 (yawing about ±22. [sent-202, score-0.628]
75 [...] and c11 (yawing [...]). We chose the front-illuminated source images to be the labeled data in the source domain. [sent-205, score-0.463]
76 The task is to determine the identity of the images in the target domain with the same illumination condition. [sent-206, score-0.693]
77 1) Baseline K-SVD [1], where target data is directly decomposed with the dictionary learned from the source domain, and the resulting sparse codes are compared using a nearest neighbor classifier. [sent-209, score-0.87]
78 As our DA approach gradually updates the dictionary learned from frontal face images using non-frontal images, these transformed representations thus convey the transition process in this scenario. [sent-216, score-0.734]
79 The remaining images with the other 10 illumination conditions were convolved with a blur kernel to form the target domain. [sent-225, score-0.486]
80 Synthesized intermediate representations between frontal face images and face images at pose c11. [sent-227, score-0.601]
81 The first row shows the transformed images from a source image (in red box) to the target domain. [sent-228, score-0.528]
82 The second row shows the transformed images from a target image (in green box) to the source domain. [sent-229, score-0.528]
83 Since the domain shift in this experiment consists of both illumination and blur variations, traditional methods which are only illumination insensitive or robust to blur are not able to fully handle both variations. [sent-240, score-0.893]
84 We also show transformed intermediate representations along the transition path of our approach in Figure 4, which clearly captures the transition from clear to blur images and vice versa. [sent-242, score-0.764]
85 Synthesized intermediate representations from face recognition across blur and illumination variations (motion blur with length of 9). [sent-279, score-0.792]
86 The first row shows the transformed images from a source image (in red box) to the target domain. [sent-280, score-0.528]
87 The second row shows the transformed images from a target image (in green box) to the source domain. [sent-281, score-0.528]
88 We report performance on eight different pairs of source and target combinations. [sent-295, score-0.486]
89 We ran 20 different trials corresponding to different selections of labeled data from the source and target domains. [sent-299, score-0.519]
90 It is seen that baseline K-SVD has the lowest recognition rate except for one pair of source and target combination in the semi-supervised setting. [sent-301, score-0.518]
91 Average reconstruction error of the target domain decomposed with the source and intermediate domains. [sent-307, score-1.159]
92 The combinations of source and target domains are (a) frontal face images v. [sent-308, score-0.855]
93 Decrease of reconstruction residue along the transition path: Figure 6 shows the average reconstruction residue of target data decomposed with the source and intermediate domain dictionaries $\{D_k\}_{k=0}^{K}$ along the transition path, which were learned using Algorithm 1. [sent-320, score-1.683]
94 We provide results on three pairs of source and target combinations: frontal face images v. [sent-321, score-0.679]
95 These quantitative values of domain shift are in line with our experimental performance, i. [sent-331, score-0.484]
96 , higher QDS values indicate less domain shift, and a higher recognition rate between the corresponding two domains. [sent-333, score-0.389]
97 Conclusions: We presented a fully unsupervised DA method by incrementally learning intermediate domain dictionaries to capture the underlying domain shift. [sent-335, score-1.143]
98 This allows us to transform original data instances from different modalities into a shared feature representation, which serves as a robust signature for cross domain classification. [sent-336, score-0.449]
99 Exploiting weakly-labeled web images to improve object classification: a domain adaptation approach. [sent-384, score-0.451]
100 Information-theoretical learning of discriminative clusters for unsupervised domain adaptation. [sent-571, score-0.435]
wordName wordTfidf (topN-words)
[('dk', 0.428), ('domain', 0.357), ('target', 0.271), ('dictionary', 0.229), ('da', 0.221), ('source', 0.215), ('intermediate', 0.208), ('domains', 0.176), ('residue', 0.171), ('qds', 0.169), ('jk', 0.166), ('dictionaries', 0.143), ('transition', 0.137), ('blur', 0.129), ('shift', 0.127), ('kt', 0.117), ('face', 0.109), ('atoms', 0.101), ('adaptation', 0.094), ('frontal', 0.084), ('unsupervised', 0.078), ('gfk', 0.069), ('webcam', 0.068), ('path', 0.065), ('illumination', 0.065), ('subspaces', 0.062), ('cross', 0.062), ('dslr', 0.06), ('decomposed', 0.056), ('tut', 0.056), ('yt', 0.053), ('adapt', 0.053), ('sgf', 0.052), ('reconstruction', 0.052), ('caltech', 0.051), ('amazon', 0.051), ('stopping', 0.05), ('quantification', 0.049), ('pages', 0.048), ('kk', 0.047), ('representations', 0.046), ('pose', 0.045), ('codes', 0.044), ('qiu', 0.043), ('angel', 0.042), ('kjktjk', 0.042), ('yawning', 0.042), ('transformed', 0.042), ('subspace', 0.039), ('variations', 0.039), ('life', 0.039), ('gradually', 0.038), ('shifts', 0.037), ('synthesized', 0.037), ('across', 0.035), ('albedo', 0.034), ('jhuo', 0.033), ('kwok', 0.033), ('labeled', 0.033), ('recognition', 0.032), ('rama', 0.031), ('tr', 0.031), ('proposition', 0.031), ('shared', 0.03), ('mairal', 0.03), ('sparse', 0.029), ('biswas', 0.029), ('diag', 0.028), ('aggarwal', 0.028), ('virtual', 0.027), ('signal', 0.026), ('tsang', 0.026), ('ijcai', 0.026), ('signals', 0.026), ('learned', 0.026), ('nt', 0.026), ('correspondence', 0.025), ('saenko', 0.025), ('chellappa', 0.024), ('kulis', 0.024), ('sha', 0.024), ('collected', 0.024), ('proof', 0.024), ('bach', 0.023), ('convey', 0.023), ('substitute', 0.023), ('ys', 0.023), ('smooth', 0.023), ('elad', 0.023), ('pan', 0.023), ('deconvolution', 0.023), ('similarity', 0.022), ('insensitive', 0.021), ('conditions', 0.021), ('shi', 0.021), ('costly', 0.02), ('adjustment', 0.02), ('interpolate', 0.02), ('june', 0.02), ('surf', 0.02)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
Author: Jie Ni, Qiang Qiu, Rama Chellappa
Abstract: Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. We present experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art.
2 0.44378743 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
3 0.39600596 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints
Author: Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell
Abstract: Most successful object classification and detection methods rely on classifiers trained on large labeled datasets. However, for domains where labels are limited, simply borrowing labeled data from existing datasets can hurt performance, a phenomenon known as “dataset bias.” We propose a general framework for adapting classifiers from “borrowed” data to the target domain using a combination of available labeled and unlabeled examples. Specifically, we show that imposing smoothness constraints on the classifier scores over the unlabeled data can lead to improved adaptation results. Such constraints are often available in the form of instance correspondences, e.g. when the same object or individual is observed simultaneously from multiple views, or tracked between video frames. In these cases, the object labels are unknown but can be constrained to be the same or similar. We propose techniques that build on existing domain adaptation methods by explicitly modeling these relationships, and demonstrate empirically that they improve recognition accuracy in two scenarios, multicategory image classification and object detection in video.
4 0.3699089 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources
Author: Lin Chen, Lixin Duan, Dong Xu
Abstract: In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos. We formulate this task as a new multi-domain adaptation problem with heterogeneous sources, in which the samples from different source domains can be represented by different types of features with different dimensions (e.g., the SIFT features from web images and space-time (ST) features from web videos) while the target domain samples have all types of features. To effectively cope with the heterogeneous sources where some source domains are more relevant to the target domain, we propose a new method called Multi-domain Adaptation with Heterogeneous Sources (MDA-HS) to learn an optimal target classifier, in which we simultaneously seek the optimal weights for different source domains with different types of features as well as infer the labels of unlabeled target domain data based on multiple types of features. We solve our optimization problem by using the cutting-plane algorithm based on group-based multiple kernel learning. Comprehensive experiments on two datasets demonstrate the effectiveness of MDA-HS for event recognition in consumer videos.
5 0.23690537 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
Author: Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang
Abstract: For the task of visual categorization, the learning model is expected to be endowed with discriminative visual feature representation and flexibilities in processing many categories. Many existing approaches are designed based on a flat category structure, or rely on a set of pre-computed visual features, hence may not be appreciated for dealing with large numbers of categories. In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. Experiments on ImageNet object data subset and SUN397 scene dataset demonstrate that our approach achieves promising performance on data with large numbers of classes compared with some state-of-the-art methods, and is more efficient in processing large numbers of categories.
7 0.22691381 392 cvpr-2013-Separable Dictionary Learning
8 0.20804578 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
9 0.18943323 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
10 0.17979573 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering
11 0.17182995 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
12 0.16291051 315 cvpr-2013-Online Robust Dictionary Learning
13 0.15388286 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning
14 0.15034616 422 cvpr-2013-Tag Taxonomy Aware Dictionary Learning for Region Tagging
15 0.14210974 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
16 0.13394845 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
17 0.13266714 265 cvpr-2013-Learning to Estimate and Remove Non-uniform Image Blur
18 0.11740869 164 cvpr-2013-Fast Convolutional Sparse Coding
19 0.11426979 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning
20 0.11326716 459 cvpr-2013-Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots
topicId topicWeight
[(0, 0.22), (1, -0.107), (2, -0.251), (3, 0.241), (4, -0.117), (5, 0.002), (6, 0.067), (7, 0.002), (8, 0.087), (9, 0.065), (10, 0.002), (11, -0.048), (12, -0.023), (13, -0.025), (14, -0.151), (15, -0.122), (16, -0.021), (17, -0.125), (18, -0.114), (19, -0.064), (20, -0.156), (21, -0.219), (22, -0.101), (23, -0.061), (24, -0.02), (25, -0.037), (26, 0.126), (27, -0.191), (28, -0.092), (29, -0.019), (30, -0.014), (31, -0.061), (32, 0.05), (33, 0.048), (34, -0.022), (35, -0.09), (36, 0.037), (37, 0.091), (38, -0.018), (39, 0.11), (40, 0.015), (41, 0.1), (42, 0.027), (43, -0.016), (44, -0.066), (45, -0.049), (46, 0.087), (47, 0.022), (48, -0.006), (49, -0.094)]
simIndex simValue paperId paperTitle
same-paper 1 0.98007351 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
Author: Jie Ni, Qiang Qiu, Rama Chellappa
Abstract: Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. We present experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art.
2 0.8507635 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
3 0.81818515 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources
Author: Lin Chen, Lixin Duan, Dong Xu
Abstract: In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos. We formulate this task as a new multi-domain adaptation problem with heterogeneous sources, in which the samples from different source domains can be represented by different types of features with different dimensions (e.g., the SIFT features from web images and space-time (ST) features from web videos) while the target domain samples have all types of features. To effectively cope with the heterogeneous sources where some source domains are more relevant to the target domain, we propose a new method called Multi-domain Adaptation with Heterogeneous Sources (MDA-HS) to learn an optimal target classifier, in which we simultaneously seek the optimal weights for different source domains with different types of features as well as infer the labels of unlabeled target domain data based on multiple types of features. We solve our optimization problem by using the cutting-plane algorithm based on group-based multiple kernel learning. Comprehensive experiments on two datasets demonstrate the effectiveness of MDA-HS for event recognition in consumer videos.
4 0.7654146 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints
Author: Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell
Abstract: Most successful object classification and detection methods rely on classifiers trained on large labeled datasets. However, for domains where labels are limited, simply borrowing labeled data from existing datasets can hurt performance, a phenomenon known as “dataset bias.” We propose a general framework for adapting classifiers from “borrowed” data to the target domain using a combination of available labeled and unlabeled examples. Specifically, we show that imposing smoothness constraints on the classifier scores over the unlabeled data can lead to improved adaptation results. Such constraints are often available in the form of instance correspondences, e.g. when the same object or individual is observed simultaneously from multiple views, or tracked between video frames. In these cases, the object labels are unknown but can be constrained to be the same or similar. We propose techniques that build on existing domain adaptation methods by explicitly modeling these relationships, and demonstrate empirically that they improve recognition accuracy in two scenarios, multicategory image classification and object detection in video.
5 0.68932772 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning
Author: Ilja Kuzborskij, Francesco Orabona, Barbara Caputo
Abstract: Since the seminal work of Thrun [17], the learning to learn paradigm has been defined as the ability of an agent to improve its performance at each task with experience, with the number of tasks. Within the object categorization domain, the visual learning community has actively declined this paradigm in the transfer learning setting. Almost all proposed methods focus on category detection problems, addressing how to learn a new target class from few samples by leveraging over the known source. But if one thinks of learning over multiple tasks, there is a need for multiclass transfer learning algorithms able to exploit previous source knowledge when learning a new class, while at the same time optimizing their overall performance. This is an open challenge for existing transfer learning algorithms. The contribution of this paper is a discriminative method that addresses this issue, based on a Least-Squares Support Vector Machine formulation. Our approach is designed to balance between transferring to the new class and preserving what has already been learned on the source models. Extensive experiments on subsets of publicly available datasets prove the effectiveness of our approach.
6 0.53118259 392 cvpr-2013-Separable Dictionary Learning
7 0.52500445 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
8 0.51812583 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
10 0.49840289 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
11 0.48839122 315 cvpr-2013-Online Robust Dictionary Learning
12 0.48026916 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
13 0.46982533 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering
14 0.43946558 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
15 0.41835126 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation
16 0.39431685 142 cvpr-2013-Efficient Detector Adaptation for Object Detection in a Video
17 0.3909235 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
18 0.38602793 385 cvpr-2013-Selective Transfer Machine for Personalized Facial Action Unit Detection
19 0.37527606 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
20 0.35259485 459 cvpr-2013-Watching Unlabeled Video Helps Learn New Human Actions from Very Few Labeled Snapshots
topicId topicWeight
[(10, 0.17), (16, 0.033), (19, 0.014), (26, 0.028), (33, 0.26), (39, 0.012), (44, 0.123), (67, 0.084), (69, 0.07), (87, 0.087), (91, 0.031)]
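The topic line above is a sparse vector of (topicId, topicWeight) pairs. The page does not state how its simValue scores are derived, so the following is only a minimal sketch under an assumption: it parses such a line and compares two topic vectors with cosine similarity, a common choice for this kind of data. The names `parse_topics` and `cosine` are hypothetical helpers, not part of the source. Note that the same-paper entry in the list below scores 0.91876793 rather than 1, so the page's actual scoring evidently differs from a plain cosine of a vector with itself.

```python
import ast
import math

# Topic weights for paper 419, copied verbatim from the listing.
TOPICS_419 = (
    "[(10, 0.17), (16, 0.033), (19, 0.014), (26, 0.028), (33, 0.26), "
    "(39, 0.012), (44, 0.123), (67, 0.084), (69, 0.07), (87, 0.087), (91, 0.031)]"
)


def parse_topics(text):
    """Turn a '[(topicId, topicWeight), ...]' string into a {topicId: weight} dict."""
    return {int(topic): float(weight) for topic, weight in ast.literal_eval(text)}


def cosine(a, b):
    """Cosine similarity between two sparse topic vectors stored as dicts.

    Assumption: cosine similarity stands in for the page's unknown scoring rule.
    """
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


topics = parse_topics(TOPICS_419)
dominant = max(topics, key=topics.get)  # topic id carrying the largest weight
```

Storing the vector as a dict keeps the computation proportional to the number of non-zero topics, which suits lists like this one where only a handful of topics carry weight.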
simIndex simValue paperId paperTitle
1 0.93101335 365 cvpr-2013-Robust Real-Time Tracking of Multiple Objects by Volumetric Mass Densities
Author: Horst Possegger, Sabine Sternig, Thomas Mauthner, Peter M. Roth, Horst Bischof
Abstract: Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume exploiting the full geometry and the objects' center of mass and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local mass density scores within a particle filter based approach, constrained by a Voronoi partitioning between nearby trackers. Our method benefits from the geometric knowledge given by the occupancy volume to robustly extract features and train classifiers on-demand, when volumetric information becomes unreliable. We evaluate our approach on several challenging real-world scenarios including the public APIDIS dataset. Experimental evaluations demonstrate significant improvements compared to state-of-the-art methods, while achieving real-time performance.
2 0.9249602 71 cvpr-2013-Boundary Cues for 3D Object Shape Recovery
Author: Kevin Karsch, Zicheng Liao, Jason Rock, Jonathan T. Barron, Derek Hoiem
Abstract: Early work in computer vision considered a host of geometric cues for both shape reconstruction [11] and recognition [14]. However, since then, the vision community has focused heavily on shading cues for reconstruction [1], and moved towards data-driven approaches for recognition [6]. In this paper, we reconsider these perhaps overlooked “boundary” cues (such as self occlusions and folds in a surface), as well as many other established constraints for shape reconstruction. In a variety of user studies and quantitative tasks, we evaluate how well these cues inform shape reconstruction (relative to each other) in terms of both shape quality and shape recognition. Our findings suggest many new directions for future research in shape reconstruction, such as automatic boundary cue detection and relaxing assumptions in shape from shading (e.g. orthographic projection, Lambertian surfaces).
3 0.92102766 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem
Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors' ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.
same-paper 4 0.91876793 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
Author: Jie Ni, Qiang Qiu, Rama Chellappa
Abstract: Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. We present experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art. [...] training and testing data are captured from the same underlying distribution. Yet this assumption is often violated in many real life applications. For instance, images collected from an internet search engine are compared with those captured from real life [28, 4]. Face recognition systems trained on frontal and high resolution images are applied to probe images with non-frontal poses and low resolution [6]. Human actions are recognized from an unseen target view using training data taken from source views [21, 20]. We show some examples of dataset shifts in Figure 1. In these scenarios, magnitudes of variations of innate characteristics, which distinguish one class from another, are oftentimes smaller than the variations caused by distribution shift between training and testing dataset. Directly applying the classifier from the training set to testing set [...]
5 0.91415143 414 cvpr-2013-Structure Preserving Object Tracking
Author: Lu Zhang, Laurens van der Maaten
Abstract: Model-free trackers can track arbitrary objects based on a single (bounding-box) annotation of the object. Whilst the performance of model-free trackers has recently improved significantly, simultaneously tracking multiple objects with similar appearance remains very hard. In this paper, we propose a new multi-object model-free tracker (based on tracking-by-detection) that resolves this problem by incorporating spatial constraints between the objects. The spatial constraints are learned along with the object detectors using an online structured SVM algorithm. The experimental evaluation of our structure-preserving object tracker (SPOT) reveals significant performance improvements in multi-object tracking. We also show that SPOT can improve the performance of single-object trackers by simultaneously tracking different parts of the object.
6 0.91069824 408 cvpr-2013-Spatiotemporal Deformable Part Models for Action Detection
7 0.90989763 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
8 0.9096778 314 cvpr-2013-Online Object Tracking: A Benchmark
9 0.90842909 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking
10 0.90770578 325 cvpr-2013-Part Discovery from Partial Correspondence
11 0.90765071 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning
12 0.90625465 400 cvpr-2013-Single Image Calibration of Multi-axial Imaging Systems
13 0.90378755 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
14 0.90346038 98 cvpr-2013-Cross-View Action Recognition via a Continuous Virtual Path
15 0.90315419 198 cvpr-2013-Handling Noise in Single Image Deblurring Using Directional Filters
16 0.90140879 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
17 0.9012174 14 cvpr-2013-A Joint Model for 2D and 3D Pose Estimation from a Single Image
18 0.90105331 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
19 0.9005571 74 cvpr-2013-CLAM: Coupled Localization and Mapping with Efficient Outlier Handling
20 0.90022278 288 cvpr-2013-Modeling Mutual Visibility Relationship in Pedestrian Detection