cvpr cvpr2013 cvpr2013-185 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
Reference: text
sentIndex sentText sentNum sentScore
1 Nguyen Rama Chellappa University of Maryland, College Park, USA {sshekha, pvishalm, hien, rama}@umiacs. [sent-3, score-0.109]
2 edu Abstract Data-driven dictionaries have produced state-of-the-art results in various classification tasks. [sent-5, score-0.17]
3 However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. [sent-6, score-0.346]
4 In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. [sent-7, score-0.278]
5 Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. [sent-8, score-0.962]
6 The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. [sent-10, score-0.111]
7 The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. [sent-11, score-0.359]
8 Introduction The study of sparse representation of signals and images has attracted tremendous interest in the last few years. [sent-14, score-0.111]
9 Sparse representations of signals and images require learning an over-complete set of bases called a dictionary along with linear decomposition of signals and images as a combination of few atoms from the learned dictionary. [sent-15, score-0.861]
10 Olshausen and Field [16] in their seminal work introduced the idea of learning dictionary from data instead of using off-the-shelf bases. [sent-16, score-0.598]
11 Since then, data-driven dictionaries have been shown to work well for both image restoration [3] and classification tasks [26]. [sent-17, score-0.17]
12 The efficiency of dictionaries in this wide range of applications can be attributed to the robust discriminant representations. [sent-18, score-0.17]
13 However, the learned dictionary may not be optimal if the target data has different distribution than the data used for training. [sent-22, score-0.719]
14 Adapting dictionaries to new domains is a challenging task, but has hardly been explored in the vision literature. [sent-25, score-0.443]
15 [12] considered a special case where corresponding samples from each domain were available, and learned a dictionary for each domain. [sent-27, score-0.747]
16 [19] proposed a method for adapting dictionaries for smoothly varying domains using regression. [sent-29, score-0.443]
17 However, in practical applications, target domains are scarcely labeled, and domain shifts may result in abrupt feature changes (e. [sent-30, score-0.574]
18 Hence learning a separate dictionary for each domain will have a severe space constraint, rendering it unfeasible for many practical applications. [sent-34, score-0.744]
19 In view of the above challenges, we propose a robust method for learning a single dictionary to optimally represent both source and target data. [sent-35, score-0.839]
20 As the features may not be correlated well in the original space, we project data from both the domains onto a common low-dimensional space, while maintaining the manifold structure of data. [sent-36, score-0.343]
21 Simultaneously, we learn a compact dictionary which represents projected data from both the domains well. [sent-37, score-0.899]
22 Firstly, learning separate projection matrix for each domain makes it easy to handle any changes in feature dimension and type in different domains. [sent-40, score-0.268]
23 Further, learning the dictionary on a low-dimensional space makes the algorithm faster, and irrelevant information in the original features is discarded. [sent-42, score-0.598]
24 Moreover, joint learning of dictionary and projections ensures that the common internal structure of data in both the domains is extracted, which can be represented well by sparse linear combinations of dictionary atoms. [sent-43, score-1.589]
25 We will see that by constraining the projection matrices to be orthonormal matrices, convenient forms for optimal dictionary and projection matrices can be obtained. [sent-45, score-0.764]
26 The classification scheme for the learned dictionary is described in Section 5. [sent-52, score-0.601]
27 Related Work The problem of adapting classifiers to new visual domains has recently gained importance in the vision community and several methods have been proposed [21, 13, 6, 5, 11]. [sent-55, score-0.361]
28 [11] learnt a transformation of source data onto target space, such that the joint representation is low-rank. [sent-57, score-0.274]
29 On the other hand, our method jointly learns projections of both the domains, while utilizing the available labels to learn a discriminative dictionary. [sent-59, score-0.139]
30 [10] suggested learning a shared embedding for different domains, along with a sparsity constraint on the representation. [sent-61, score-0.17]
31 Similarly, methods for joint dimensionality reduction and sparse representation have also been proposed [29, 4, 14, 15]. [sent-66, score-0.125]
32 Problem Framework The classical dictionary learning approach minimizes the representation error of the given set of data samples subject to a sparsity constraint [1]. [sent-69, score-0.643]
33 , $x_N] \in \mathbb{R}^{K \times N}$ is the sparse representation of $Y$ over $D$, and $T_0$ is the sparsity level. [sent-81, score-0.105]
34 We wish to learn a shared K-atoms dictionary, $D \in \mathbb{R}^{n \times K}$, and mappings $P_1 \in \mathbb{R}^{n \times n_1}$, $P_2 \in \mathbb{R}^{n \times n_2}$ onto a common low-dimensional space, which will minimize the representation error in the projected space. [sent-96, score-0.161]
35 Regularization: It will be desirable if the projections, while bringing the data from two domains to a shared subspace, do not lose too much information available in the original domains. [sent-105, score-0.327]
36 Multiple domains The above formulation can be extended so that it can handle multiple domains. [sent-123, score-0.273]
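37 For the M-domain problem, we simply construct the matrices $\tilde{Y}$, $\tilde{P}$, $\tilde{X}$ as: $\tilde{P} = [P_1, \cdots, P_M]$, $\tilde{Y} =$ [block matrix truncated in source]. [sent-124, score-0.207]
38 With these definitions, (3) can be extended to multiple domains as follows: $\{D^*, \tilde{P}^*, \tilde{X}^*\} = \arg\min_{D, \tilde{P}, \tilde{X}} C_1(D, \tilde{P}, \tilde{X}) + \lambda C_2(\tilde{P})$ s. [sent-130, score-0.273]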
39 Discriminative Dictionary The dictionary learned in (3) can reconstruct the two domains well, but it cannot discriminate between the data from different classes. [sent-136, score-0.874]
40 Recent advances in learning discriminative dictionaries [20, 28] suggest that learning class-wise, mutually incoherent dictionaries works better for discrimination. [sent-137, score-0.468]
41 To incorporate this into our framework, we write the dictionary D as D = [D1, · · · , DC], where C is the total number of classes. [sent-138, score-0.556]
42 We modify the cost function similar to [28], which encourages reconstruction of samples of a given class by the dictionary of the corresponding class, and penalizes reconstruction by out-of-class dictionaries. [sent-139, score-0.67]
43 $_F^2$, (5) where $\mu$ and $\nu$ are the weights given to the discriminative terms, and the matrices $\tilde{X}_{in}$ and $\tilde{X}_{out}$ are given as: $\tilde{X}_{in}[i,j] =$ [sent-146, score-0.105]
44 Unlabeled data can be handled using semisupervised approaches to dictionary learning [18]. [sent-150, score-0.598]
45 Also, note that we do not need to modify the forms of projection matrices, since they capture the overall domain shift, and hence are independent of class variations. [sent-152, score-0.198]
46 Update step for $\tilde{B}$, $\tilde{X}$: For a fixed $\tilde{A}$, the problem becomes that of discriminative dictionary learning, with data as $Z = \tilde{A}^T \tilde{K}$ and dictionary $D = \tilde{A}^T \tilde{K} \tilde{B}$. [sent-189, score-0.602]
47 To jointly learn the dictionary, $D$, and the sparse code, $\tilde{X}$, we use the framework of the discriminative dictionary learning approach presented in [28]. [sent-190, score-0.732]
48 Classification Given a test sample, yte from domain k, we propose the following steps for classification, similar to [15]. [sent-214, score-0.341]
49 Compute the embedding of the sample in the common subspace, zte using the projection, Pk∗. [sent-217, score-0.169]
50 Compute the sparse coefficients, xte, of the embedded sample over dictionary D using the OMP algorithm [17]. [sent-222, score-0.616]
51 Now, the sample can be assigned to class i if the reconstruction error using the class dictionary, $D_i$, and the sparse code corresponding to the atoms of that dictionary, $x_{te}^i$, is minimum. [sent-231, score-0.311]
52 So, we project the dictionary, $D_i$, into the feature space, and assign the test sample to the class with the minimum error in the original feature space: Output class = $\arg\min_{i=1,\cdots,C}$ [sent-236, score-0.104]
53 First, we demonstrate some synthesis and recognition results on the CMU MultiPie dataset for face recognition across pose and illumination variations. [sent-244, score-0.189]
54 Next we show the performance of our method on domain adaptation databases and compare it with existing adaptation algorithms. [sent-246, score-0.378]
55 Frontal faces were taken as the source domain, while different off-frontal poses were taken as target domains. [sent-252, score-0.27]
56 Dictionaries were trained using illuminations {1, 4, 7, 12, 17} from the source and the target poses, in Session 1 per subject. [sent-253, score-0.278]
57 All the illumination images from Session 2, for the target pose, were taken as probe images. [sent-254, score-0.154]
58 1 Pose Alignment First we consider the problem of pose alignment using the proposed dictionary learning framework. [sent-258, score-0.694]
59 Images at the extreme pose of 60° were taken as the target pose. [sent-260, score-0.158]
60 A shared discriminative dictionary was learned using the approach described in this paper. [sent-261, score-0.699]
61 Given the probe image, it was projected on the latent subspace and reconstructed using the dictionary. [sent-262, score-0.117]
62 The reconstruction was back-projected onto the source pose domain, to give the aligned image. [sent-263, score-0.227]
63 Firstly, it can be seen that there is an optimal dictionary size, K = 5, where the best alignment is achieved. [sent-266, score-0.612]
64 For K = 7, the alignment is not good, as the learned dictionary is not able to successfully correlate the two domains when there are more atoms in the dictionary. [sent-268, score-1.046]
65 Moreover, the learned projection matrices (Figure 2(b)) show that our method can learn the internal structure of the two domains. [sent-274, score-0.179]
66 (b) First few components of the learned projection matrices for the two poses. [sent-279, score-0.149]
67 The dictionary learning algorithm, FDDL [28] is not optimal here as it is not able to efficiently represent the non-linear changes introduced by the pose variation. [sent-284, score-0.675]
68 In the second set-up, we evaluate the methods for adaptation using multiple domains. [sent-300, score-0.116]
69 For both the cases, we use 20 training samples per class for Amazon/Caltech, and 8 samples per class for DSLR/Webcam when used as source, and 3 training samples for all of them when used for target domain. [sent-302, score-0.222]
70 Rest of the data in the target domain is used for testing. [sent-303, score-0.264]
71 Dictionary size, K = 4 atoms per class and final dimension, n = 60 for the first set-up. [sent-315, score-0.168]
72 For the second set-up, K = 6 atoms per class and n = 90. [sent-316, score-0.168]
73 For FDDL, the parameters, μ and ν are the same as SDDL, and we learn K = 8 atoms per class for the first set-up and K = 10 atoms per class for the second. [sent-317, score-0.366]
74 The FDDL dictionary was trained using both the source and the target domain features, as it was found to give the best results. [sent-318, score-0.943]
75 2 Results using single source Table 2(a) shows a comparison of the results of different methods on 8 source-target pairs. [sent-322, score-0.123]
76 The proposed algorithm gives the best performance for 5 domain pairs, and is the second best for 2 pairs. [sent-323, score-0.146]
77 For Caltech-DSLR and Amazon-Webcam domain pairs, there is more than 15% improvement over the GFK algorithm [5]. [sent-324, score-0.146]
78 Furthermore, a com (a) Performance comparison on single source four domains benchmark (C: Caltech, A: Amazon, D: DSLR, W: Webcam). [table values lost in extraction] [sent-325, score-0.396]
79 3 Results using multiple sources As our proposed framework can also handle multiple domains, we also experimented with multiple source adaptation. [sent-357, score-0.123]
80 However, [7] reports higher numbers on webcam and amazon as targets, using boosted classifiers. [sent-360, score-0.277]
81 4 Ease of adaptation A rank of domain (ROD) metric was introduced in [5] to measure the adaptability of different domains. [sent-364, score-0.262]
82 It was shown that ROD correlates with the performance of adaptation algorithm. [sent-365, score-0.116]
83 This is the case because by learning projections along with the common dictionary, we can achieve a better alignment of the datasets. [sent-370, score-0.2]
84 Number of source images: Here, we choose Amazon/Webcam domain pair, as it is "difficult" to adapt. [sent-376, score-0.269]
85 We increased the number of source images and studied the performance of SDDL and compared it with FDDL. [sent-377, score-0.123]
86 It can be seen that while FDDL’s performance decreases sharply with more source images, SDDL method shows an increase in the performance. [sent-378, score-0.123]
87 Hence, by adapting the source to the target domain, our method can use the source information to increase the accuracy of target recognition, even when their distributions are very different. [sent-379, score-0.57]
88 Dictionary size: All the domain pairs show an initial sharp increase in the performance, and then become almost flat after the dictionary size of 3 or 4. [sent-381, score-0.743]
89 The flat region indicates that alignment of the source and the target data is limited by the number of available target samples. [sent-382, score-0.456]
90 But also, on a positive note, it can be seen that even a smaller dictionary can give the optimal performance. [sent-383, score-0.556]
91 Common subspace dimension: Similar to the previous case, we get an initial sharp increase followed by a flat recognition curve. [sent-385, score-0.118]
92 Conclusion We have proposed a novel framework for adapting dictionaries to testing domains under arbitrary domain shifts. [sent-388, score-0.677]
93 Future work will include studying the effect of using unlabeled data while training, and other relevant problems like large-scale and online adaptation of dictionaries. [sent-393, score-0.116]
94 K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. [sent-398, score-0.26]
95 Recognition performance under different: (a) number of source images, (b) dictionary size, and (c) common subspace dimension. [sent-400, score-0.757]
96 Image denoising via sparse and redundant representations over learned dictionaries. [sent-410, score-0.105]
97 Unsupervised adaptation across domain shift by generating intermediate data representations. [sent-434, score-0.262]
98 What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. [sent-480, score-0.148]
99 Classification and clustering via dictionary learning with structured incoherence and shared features. [sent-530, score-0.652]
100 On the dimensionality reduction for sparse representation based face recognition. [sent-596, score-0.163]
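The classification steps in sentences 48 to 52 (project the test sample with the learned projection, sparse-code it over the shared dictionary with OMP, then pick the class whose atoms give the smallest reconstruction error in the original feature space) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `omp` and `classify`, the `atom_labels` array, and the `sparsity` argument are hypothetical, the OMP routine is a simple hand-written greedy version, and the sketch assumes the projection `P_k` has orthonormal rows so that its transpose maps back to the feature space.

```python
import numpy as np


def omp(D, z, sparsity):
    """Greedy Orthogonal Matching Pursuit (simple illustrative version).

    Approximates z using at most `sparsity` columns (atoms) of D.
    """
    residual = z.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = -np.inf  # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], z, rcond=None)
        residual = z - D[:, support] @ coeffs
        if np.linalg.norm(residual) < 1e-10:
            break
    x = np.zeros(D.shape[1])
    x[support] = coeffs
    return x


def classify(y_te, P_k, D, atom_labels, sparsity):
    """Assign y_te (from domain k) to the class with minimum reconstruction error.

    P_k         : (n, n_k) projection for domain k (assumed orthonormal rows)
    D           : (n, K) shared dictionary, columns grouped by class
    atom_labels : (K,) class index of each dictionary atom (hypothetical helper)
    """
    z_te = P_k @ y_te                 # embed into the common subspace
    x_te = omp(D, z_te, sparsity)     # sparse code over the full dictionary
    errors = {}
    for c in np.unique(atom_labels):
        mask = atom_labels == c
        # Reconstruct with class-c atoms only, then back-project to feature space.
        recon = P_k.T @ (D[:, mask] @ x_te[mask])
        errors[int(c)] = float(np.linalg.norm(y_te - recon))
    predicted = min(errors, key=errors.get)
    return predicted, errors
```

A call such as `classify(y_te, P_k, D, atom_labels, sparsity=3)` returns the predicted class together with the per-class errors; in practice a library OMP solver could replace the hand-written loop.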
wordName wordTfidf (topN-words)
[('dictionary', 0.556), ('domains', 0.273), ('sddl', 0.219), ('yte', 0.195), ('dictionaries', 0.17), ('fddl', 0.167), ('domain', 0.146), ('webcam', 0.141), ('tk', 0.141), ('amazon', 0.136), ('dslr', 0.133), ('source', 0.123), ('target', 0.118), ('atoms', 0.116), ('adaptation', 0.116), ('zte', 0.103), ('aitkiai', 0.094), ('adapting', 0.088), ('conference', 0.087), ('patel', 0.086), ('pipit', 0.083), ('caltech', 0.076), ('dx', 0.071), ('june', 0.07), ('tie', 0.069), ('projections', 0.065), ('rod', 0.064), ('hien', 0.063), ('tkb', 0.063), ('matrices', 0.061), ('proposition', 0.061), ('sparse', 0.06), ('alignment', 0.056), ('shared', 0.054), ('class', 0.052), ('sgf', 0.051), ('gfk', 0.051), ('signals', 0.051), ('jhuo', 0.049), ('xte', 0.049), ('nary', 0.046), ('rama', 0.046), ('learned', 0.045), ('surf', 0.045), ('sparsity', 0.045), ('pk', 0.044), ('pages', 0.044), ('discriminative', 0.044), ('transactions', 0.044), ('olshausen', 0.043), ('qiu', 0.043), ('projection', 0.043), ('learning', 0.042), ('gopalan', 0.041), ('nguyen', 0.041), ('subspace', 0.041), ('flat', 0.041), ('projected', 0.04), ('pose', 0.04), ('trace', 0.04), ('synthesis', 0.039), ('session', 0.039), ('ieee', 0.039), ('pattern', 0.038), ('face', 0.038), ('sharma', 0.037), ('common', 0.037), ('changes', 0.037), ('saenko', 0.037), ('illuminations', 0.037), ('optimization', 0.036), ('recognition', 0.036), ('probe', 0.036), ('kulis', 0.036), ('dimensionality', 0.036), ('pi', 0.034), ('elad', 0.034), ('kernelized', 0.034), ('onto', 0.033), ('kernel', 0.032), ('di', 0.031), ('update', 0.031), ('reconstruction', 0.031), ('learn', 0.03), ('overcomplete', 0.03), ('insights', 0.03), ('cmu', 0.029), ('poses', 0.029), ('embedding', 0.029), ('reduction', 0.029), ('equality', 0.028), ('dslrs', 0.028), ('ilitn', 0.028), ('rice', 0.028), ('succinctly', 0.028), ('aar', 0.028), ('shalm', 0.028), ('vishal', 0.028), ('reep', 0.028), ('tdi', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000008 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
2 0.46738502 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
Author: Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang
Abstract: For the task of visual categorization, the learning model is expected to be endowed with discriminative visual feature representation and flexibilities in processing many categories. Many existing approaches are designed based on a flat category structure, or rely on a set of pre-computed visual features, hence may not be appreciated for dealing with large numbers of categories. In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. Experiments on ImageNet object data subset and SUN397 scene dataset demonstrate that our approach achieves promising performance on data with large numbers of classes compared with some state-of-the-art methods, and is more efficient in processing large numbers of categories.
3 0.45885617 392 cvpr-2013-Separable Dictionary Learning
Author: Simon Hawe, Matthias Seibert, Martin Kleinsteuber
Abstract: Many techniques in computer vision, machine learning, and statistics rely on the fact that a signal of interest admits a sparse representation over some dictionary. Dictionaries are either available analytically, or can be learned from a suitable training set. While analytic dictionaries permit to capture the global structure of a signal and allow a fast implementation, learned dictionaries often perform better in applications as they are more adapted to the considered class of signals. In imagery, unfortunately, the numerical burden for (i) learning a dictionary and for (ii) employing the dictionary for reconstruction tasks only allows to deal with relatively small image patches that only capture local image information. The approach presented in this paper aims at overcoming these drawbacks by allowing a separable structure on the dictionary throughout the learning process. On the one hand, this permits larger patch-sizes for the learning phase, on the other hand, the dictionary is applied efficiently in reconstruction tasks. The learning procedure is based on optimizing over a product of spheres which updates the dictionary as a whole, thus enforces basic dictionary properties, such as mutual coherence, explicitly during the learning procedure. In the special case where no separable structure is enforced, our method competes with state-of-the-art dictionary learning methods like K-SVD.
4 0.44378743 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
Author: Jie Ni, Qiang Qiu, Rama Chellappa
Abstract: Domain adaptation addresses the problem where data instances of a source domain have different distributions from that of a target domain, which occurs frequently in many real life scenarios. This work focuses on unsupervised domain adaptation, where labeled data are only available in the source domain. We propose to interpolate subspaces through dictionary learning to link the source and target domains. These subspaces are able to capture the intrinsic domain shift and form a shared feature representation for cross domain recognition. Further, we introduce a quantitative measure to characterize the shift between two domains, which enables us to select the optimal domain to adapt to the given multiple source domains. We present experiments on face recognition across pose, illumination and blur variations, cross dataset object recognition, and report improved performance over the state of the art.
Author: Li He, Hairong Qi, Russell Zaretzki
Abstract: This paper addresses the problem of learning overcomplete dictionaries for the coupled feature spaces, where the learned dictionaries also reflect the relationship between the two spaces. A Bayesian method using a beta process prior is applied to learn the over-complete dictionaries. Compared to previous coupled feature spaces dictionary learning algorithms, our algorithm not only provides dictionaries that are customized to each feature space, but also adds more consistent and accurate mapping between the two feature spaces. This is due to the unique property of the beta process model that the sparse representation can be decomposed to values and dictionary atom indicators. The proposed algorithm is able to learn sparse representations that correspond to the same dictionary atoms with the same sparsity but different values in coupled feature spaces, thus bringing consistent and accurate mapping between coupled feature spaces. Another advantage of the proposed method is that the number of dictionary atoms and their relative importance may be inferred non-parametrically. We compare the proposed approach to several state-of-the-art dictionary learning methods by applying this method to single image super-resolution. The experimental results show that dictionaries learned by our method produce the best super-resolution results compared to other state-of-the-art methods.
6 0.41409272 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
7 0.37985057 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
8 0.37305188 315 cvpr-2013-Online Robust Dictionary Learning
9 0.29695183 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
10 0.25444368 422 cvpr-2013-Tag Taxonomy Aware Dictionary Learning for Region Tagging
11 0.24783553 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources
12 0.24314462 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints
13 0.23850198 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning
14 0.21490853 250 cvpr-2013-Learning Cross-Domain Information Transfer for Location Recognition and Clustering
15 0.20576626 302 cvpr-2013-Multi-task Sparse Learning with Beta Process Prior for Action Recognition
16 0.19457446 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
17 0.1893951 204 cvpr-2013-Histograms of Sparse Codes for Object Detection
18 0.1724734 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
19 0.12092974 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition
20 0.10934193 233 cvpr-2013-Joint Sparsity-Based Representation and Analysis of Unconstrained Activities
topicId topicWeight
[(0, 0.238), (1, -0.208), (2, -0.372), (3, 0.387), (4, -0.15), (5, -0.148), (6, 0.125), (7, 0.109), (8, 0.016), (9, 0.111), (10, 0.007), (11, 0.041), (12, -0.006), (13, -0.014), (14, -0.07), (15, -0.071), (16, -0.013), (17, -0.052), (18, -0.101), (19, -0.045), (20, -0.095), (21, -0.108), (22, -0.064), (23, -0.03), (24, -0.028), (25, -0.052), (26, 0.048), (27, -0.117), (28, -0.092), (29, 0.018), (30, 0.023), (31, -0.032), (32, 0.021), (33, 0.022), (34, 0.009), (35, -0.059), (36, 0.045), (37, 0.032), (38, -0.049), (39, 0.086), (40, 0.034), (41, 0.057), (42, -0.005), (43, 0.021), (44, -0.038), (45, -0.017), (46, 0.048), (47, -0.003), (48, 0.007), (49, -0.057)]
simIndex simValue paperId paperTitle
same-paper 1 0.95833349 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
2 0.88093024 392 cvpr-2013-Separable Dictionary Learning
Author: Simon Hawe, Matthias Seibert, Martin Kleinsteuber
Abstract: Many techniques in computer vision, machine learning, and statistics rely on the fact that a signal of interest admits a sparse representation over some dictionary. Dictionaries are either available analytically, or can be learned from a suitable training set. While analytic dictionaries permit to capture the global structure of a signal and allow a fast implementation, learned dictionaries often perform better in applications as they are more adapted to the considered class of signals. In imagery, unfortunately, the numerical burden for (i) learning a dictionary and for (ii) employing the dictionary for reconstruction tasks only allows to deal with relatively small image patches that only capture local image information. The approach presented in this paper aims at overcoming these drawbacks by allowing a separable structure on the dictionary throughout the learning process. On the one hand, this permits larger patch-sizes for the learning phase, on the other hand, the dictionary is applied efficiently in reconstruction tasks. The learning procedure is based on optimizing over a product of spheres which updates the dictionary as a whole, thus enforces basic dictionary properties, such as mutual coherence, explicitly during the learning procedure. In the special case where no separable structure is enforced, our method competes with state-of-the-art dictionary learning methods like K-SVD.
Author: Li He, Hairong Qi, Russell Zaretzki
Abstract: This paper addresses the problem of learning overcomplete dictionaries for the coupled feature spaces, where the learned dictionaries also reflect the relationship between the two spaces. A Bayesian method using a beta process prior is applied to learn the over-complete dictionaries. Compared to previous coupled feature spaces dictionary learning algorithms, our algorithm not only provides dictionaries that are customized to each feature space, but also adds more consistent and accurate mapping between the two feature spaces. This is due to the unique property of the beta process model that the sparse representation can be decomposed to values and dictionary atom indicators. The proposed algorithm is able to learn sparse representations that correspond to the same dictionary atoms with the same sparsity but different values in coupled feature spaces, thus bringing consistent and accurate mapping between coupled feature spaces. Another advantage of the proposed method is that the number of dictionary atoms and their relative importance may be inferred non-parametrically. We compare the proposed approach to several state-of-the-art dictionary learning methods by applying this method to single image super-resolution. The experimental results show that dictionaries learned by our method produce the best super-resolution results compared to other state-of-the-art methods.
4 0.84085423 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
Author: Yu-Tseh Chi, Mohsen Ali, Ajit Rajwade, Jeffrey Ho
Abstract: This paper proposes a dictionary learning framework that combines the proposed block/group (BGSC) or reconstructed block/group (R-BGSC) sparse coding schemes with the novel Intra-block Coherence Suppression Dictionary Learning (ICS-DL) algorithm. An important and distinguishing feature of the proposed framework is that all dictionary blocks are trained simultaneously with respect to each data group while the intra-block coherence is explicitly minimized as an important objective. We provide both empirical evidence and heuristic support for this feature that can be considered as a direct consequence of incorporating both the group structure for the input data and the block structure for the dictionary in the learning process. The optimization problems for both the dictionary learning and sparse coding can be solved efficiently using block-gradient descent, and the details of the optimization algorithms are presented. We evaluate the proposed methods using well-known datasets, and favorable comparisons with state-of-the-art dictionary learning methods demonstrate the viability and validity of the proposed framework.
5 0.8312999 315 cvpr-2013-Online Robust Dictionary Learning
Author: Cewu Lu, Jiaping Shi, Jiaya Jia
Abstract: Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision. It, however, faces the major difficulty to incorporate robust functions, rather than the square data fitting term, to handle outliers in training data. In this paper, we propose a new online framework enabling the use of ℓ1 sparse data fitting term in robust dictionary learning, notably enhancing the usability and practicality of this important technique. Extensive experiments have been carried out to validate our new framework.
6 0.79764897 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
7 0.79366708 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
8 0.7815752 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
9 0.76922512 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
10 0.61020547 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning
11 0.57611263 422 cvpr-2013-Tag Taxonomy Aware Dictionary Learning for Region Tagging
12 0.5747996 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
13 0.52962983 302 cvpr-2013-Multi-task Sparse Learning with Beta Process Prior for Action Recognition
14 0.51144445 150 cvpr-2013-Event Recognition in Videos by Learning from Heterogeneous Web Sources
15 0.50534147 204 cvpr-2013-Histograms of Sparse Codes for Object Detection
16 0.50505632 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation
17 0.50165319 179 cvpr-2013-From N to N+1: Multiclass Transfer Incremental Learning
18 0.47594675 387 cvpr-2013-Semi-supervised Domain Adaptation with Instance Constraints
19 0.44983172 83 cvpr-2013-Classification of Tumor Histology via Morphometric Context
20 0.44742137 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
topicId topicWeight
[(10, 0.117), (16, 0.024), (19, 0.015), (26, 0.049), (28, 0.012), (33, 0.291), (39, 0.017), (67, 0.069), (69, 0.064), (87, 0.064), (91, 0.204), (96, 0.014)]
simIndex simValue paperId paperTitle
1 0.89729595 259 cvpr-2013-Learning a Manifold as an Atlas
Author: Nikolaos Pitelis, Chris Russell, Lourdes Agapito
Abstract: In this work, we return to the underlying mathematical definition of a manifold and directly characterise learning a manifold as finding an atlas, or a set of overlapping charts, that accurately describe local structure. We formulate the problem of learning the manifold as an optimisation that simultaneously refines the continuous parameters defining the charts, and the discrete assignment of points to charts. In contrast to existing methods, this direct formulation of a manifold does not require “unwrapping” the manifold into a lower dimensional space and allows us to learn closed manifolds of interest to vision, such as those corresponding to gait cycles or camera pose. We report state-of-the-art results for manifold based nearest neighbour classification on vision datasets, and show how the same techniques can be applied to the 3D reconstruction of human motion from a single image.
2 0.88568425 415 cvpr-2013-Structured Face Hallucination
Author: Chih-Yuan Yang, Sifei Liu, Ming-Hsuan Yang
Abstract: The goal of face hallucination is to generate high-resolution images with fidelity from low-resolution ones. In contrast to existing methods based on patch similarity or holistic constraints in the image space, we propose to exploit local image structures for face hallucination. Each face image is represented in terms of facial components, contours and smooth regions. The image structure is maintained via matching gradients in the reconstructed high-resolution output. For facial components, we align input images to generate accurate exemplars and transfer the high-frequency details for preserving structural consistency. For contours, we learn statistical priors to generate salient structures in the high-resolution images. A patch matching method is utilized on the smooth regions where the image gradients are preserved. Experimental results demonstrate that the proposed algorithm generates hallucinated face images with favorable quality and adaptability.
3 0.8795929 394 cvpr-2013-Shading-Based Shape Refinement of RGB-D Images
Author: Lap-Fai Yu, Sai-Kit Yeung, Yu-Wing Tai, Stephen Lin
Abstract: We present a shading-based shape refinement algorithm which uses a noisy, incomplete depth map from Kinect to help resolve ambiguities in shape-from-shading. In our framework, the partial depth information is used to overcome bas-relief ambiguity in normals estimation, as well as to assist in recovering relative albedos, which are needed to reliably estimate the lighting environment and to separate shading from albedo. This refinement of surface normals using a noisy depth map leads to high-quality 3D surfaces. The effectiveness of our algorithm is demonstrated through several challenging real-world examples.
same-paper 4 0.86931229 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
5 0.86112887 380 cvpr-2013-Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images
Author: Jamie Shotton, Ben Glocker, Christopher Zach, Shahram Izadi, Antonio Criminisi, Andrew Fitzgibbon
Abstract: We address the problem of inferring the pose of an RGB-D camera relative to a known 3D scene, given only a single acquired image. Our approach employs a regression forest that is capable of inferring an estimate of each pixel’s correspondence to 3D points in the scene’s world coordinate frame. The forest uses only simple depth and RGB pixel comparison features, and does not require the computation of feature descriptors. The forest is trained to be capable of predicting correspondences at any pixel, so no interest point detectors are required. The camera pose is inferred using a robust optimization scheme. This starts with an initial set of hypothesized camera poses, constructed by applying the forest at a small fraction of image pixels. Preemptive RANSAC then iterates sampling more pixels at which to evaluate the forest, counting inliers, and refining the hypothesized poses. We evaluate on several varied scenes captured with an RGB-D camera and observe that the proposed technique achieves highly accurate relocalization and substantially out-performs two state of the art baselines.
6 0.84952593 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
7 0.8472997 248 cvpr-2013-Learning Collections of Part Models for Object Recognition
8 0.84709936 30 cvpr-2013-Accurate Localization of 3D Objects from RGB-D Data Using Segmentation Hypotheses
9 0.84547609 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
10 0.84493911 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation
11 0.84481603 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
12 0.84451061 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection
13 0.84405649 325 cvpr-2013-Part Discovery from Partial Correspondence
14 0.84389138 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
15 0.8432914 343 cvpr-2013-Query Adaptive Similarity for Large Scale Object Retrieval
16 0.84256464 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
17 0.84233695 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
18 0.84226769 414 cvpr-2013-Structure Preserving Object Tracking
19 0.84221953 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models
20 0.8422091 14 cvpr-2013-A Joint Model for 2D and 3D Pose Estimation from a Single Image