cvpr cvpr2013 cvpr2013-296 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang
Abstract: For the task of visual categorization, the learning model is expected to be endowed with discriminative visual feature representation and flexibilities in processing many categories. Many existing approaches are designed based on a flat category structure, or rely on a set of pre-computed visual features, hence may not be appreciated for dealing with large numbers of categories. In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. Experiments on ImageNet object data subset and SUN397 scene dataset demonstrate that our approach achieves promising performance on data with large numbers of classes compared with some state-of-the-art methods, and is more efficient in processing large numbers of categories.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. [sent-15, score-0.87]
2 For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. [sent-16, score-2.042]
3 Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. [sent-17, score-2.295]
4 In the learning stage, D0 denotes the dictionary used by the first-level nodes (V1,1, V1,2 and V1,3). [sent-27, score-0.894]
5 The corresponding representation z1,1 is used for classification model learning among the child nodes (V2,1 and V2,2). [sent-29, score-0.344]
6 The dictionary applied to quantize the local descriptors is often obtained by some clustering method, such as k-means. [sent-32, score-0.695]
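The quantization step mentioned above can be sketched as follows. This is a minimal illustration in plain Python, assuming a codebook has already been obtained (for example by k-means); the names `quantize`, `codebook` and `descriptors` are hypothetical and do not come from the paper.

```python
def quantize(descriptors, codebook):
    """Return a normalized histogram of nearest-codeword assignments."""
    hist = [0.0] * len(codebook)
    for x in descriptors:
        # Squared Euclidean distance from the descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in codebook]
        hist[dists.index(min(dists))] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


# Toy data (assumed): two codewords and four 2-D local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [0.0, 0.2]]
histogram = quantize(descriptors, codebook)
```

Hard assignment of this kind is the baseline that sparse-coding dictionaries are compared against in the sentences that follow.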
7 In recent work [28, 18], sparse coding based dictionary learning has reported more promising results. [sent-33, score-0.85]
8 Furthermore, it is shown in [17, 2] that dictionaries learnt via supervised learning improve performance by encoding more discriminative information into the representations. [sent-36, score-0.53]
9 However, when the number of categories is large, these methods suffer from considerable time overhead during the supervised dictionary learning and predicting stages. [sent-37, score-0.876]
10 The categories are usually organized in the form of a tree-structured hierarchy [6] and are treated as the leaf nodes at the bottom of the tree. [sent-39, score-0.456]
11 Each internode corresponds to one hyper-category that is composed of a group of categories with semantic relevance or visual similarity, so that the structure reflects the hierarchical correlation among categories. [sent-40, score-0.394]
12 Firstly, due to diversified inter-correlation among different layers, the sibling nodes in higher levels are less related than the ones in lower levels, thus the discrimination between nodes in higher levels is easier. [sent-42, score-0.669]
13 Finally, lower level categories are supposed to possess the general properties from the higher level categories and additional class-specific details. [sent-48, score-0.327]
14 The learnt visual dictionary set and feature representations make better use of the category correlation encoded by the hierarchy. [sent-52, score-0.945]
15 As the features extracted from larger receptive fields encode more complex and specific patterns, the dictionaries learnt in lower layers are designed to encode the descriptors at a larger scale. [sent-53, score-0.913]
16 Given the structure, we learn one discriminative dictionary and a set of discriminative models for each hyper-category (internode). [sent-54, score-0.798]
17 Besides, our learnt dictionaries in lower levels consist of additional part inherited from ancestor nodes, so that categories in lower levels are described with multi-scale visual information. [sent-55, score-1.284]
18 For internal node V1,1, the corresponding dictionary D1,1 consists of two parts, D0 and D10,1, where D0 represents the inherited dictionary. [sent-58, score-1.606]
19 The specific dictionary and the class models of nodes V2,1 and V2,2 are learnt in a discriminative formulation simultaneously. [sent-60, score-1.094]
20 The dictionaries learnt at different layers encode different scale information. [sent-65, score-0.707]
21 Moreover, each dictionary consists of the general part inherited from upper layers and the specific part learnt from its child nodes. [sent-66, score-1.335]
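The composition of each dictionary from an inherited part and a specific part can be sketched as below. This is only an illustration under assumed names: `children` maps a node to its child nodes, `specific` maps an internode to its own atoms, and atoms are shown as string labels rather than real vectors.

```python
def build_dictionaries(children, specific, root):
    """Map each internode to [inherited atoms] + [its own specific atoms]."""
    full = {root: list(specific[root])}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child in specific:  # only internodes carry a dictionary
                full[child] = full[node] + list(specific[child])
                stack.append(child)
    return full


# Toy tree (assumed): V0 is the root, V1,1 is an internode, V1,2 is a leaf.
children = {"V0": ["V1,1", "V1,2"], "V1,1": ["V2,1", "V2,2"]}
specific = {"V0": ["a0", "a1"], "V1,1": ["b0"]}
dictionaries = build_dictionaries(children, specific, "V0")
```

In this toy tree the dictionary of V1,1 contains the two atoms of V0 followed by its own atom, which mirrors the two-part structure described for D1,1.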
22 Compared with unsupervised dictionary [28] or class-specific dictionary [17, 2], our learnt dictionaries capture the information of multi-level visual (hyper-)categories in a more effective way. [sent-67, score-1.955]
23 In the training stage, the number of dictionaries is equal to the number of internodes in the tree, which is far smaller than the number of categories. [sent-69, score-0.451]
24 Related Work. Current dictionary learning approaches can be categorized into two main types: unsupervised and supervised dictionary learning. [sent-76, score-1.462]
25 In the field of unsupervised dictionary learning, the dictionary is optimized only based on the reconstruction error of the signals [18]. [sent-77, score-1.394]
26 [28] propose to learn a unique unsupervised dictionary by sparse coding for classification and achieve impressive results. [sent-79, score-0.877]
27 [12] employ a tree-structured sparsity to encode the dependencies between dictionary atoms. [sent-84, score-0.706]
28 The dictionaries learnt in a supervised way have stronger discriminative power. [sent-87, score-0.611]
29 For example, label information is fed in Fisher discrimination criterion or logistic regression model for dictionary learning [17, 29]. [sent-88, score-0.782]
30 A shared dictionary or multiple class-specific dictionaries can be obtained by supervised learning. (Footnote 1: L denotes the number of levels.) [sent-89, score-1.154]
31 [2] adapt supervised dictionary learning to local features, which achieves better discriminative power than features extracted by [28]. [sent-91, score-0.815]
32 [34] propose to learn multiple specific dictionaries and a global dictionary shared by all categories. [sent-93, score-1.099]
33 When the number of categories is large, the advantage of hierarchical structure over flat structure emerges in terms of efficiency and accuracy [22, 33, 3]. [sent-95, score-0.331]
34 As one of the coding strategies, local coding aims to learn a set of anchor points (dictionary) to encode signals incorporating with locality constraint, which ensures that similar signals tend to have similar codes. [sent-113, score-0.328]
35 As the supervised learning approach has been shown beneficial to dictionary learning, in this paper we propose to introduce a supervised formulation for local coding. [sent-114, score-0.837]
36 Given a dictionary $D_b$, $\hat{x}_{i,p}$ can be reconstructed by: $\alpha_{i,p}(\hat{x}_{i,p}, D_b) = \arg\min \tfrac{1}{2}\,\dots$ [sent-119, score-0.655]
37 Due to the fact that max pooling is not differentiable, average pooling is more appropriate to task-driven dictionary learning [2]. [sent-148, score-0.877]
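The two pooling operators contrasted here can be sketched as follows. This is a minimal sketch in plain Python in which `codes` is an assumed list of per-descriptor coefficient vectors; neither function is taken from the paper's implementation.

```python
def average_pool(codes):
    """Mean of each coefficient over all local descriptors (smooth in the codes)."""
    n = len(codes)
    return [sum(column) / n for column in zip(*codes)]


def max_pool(codes):
    """Largest value of each coefficient over all local descriptors."""
    return [max(column) for column in zip(*codes)]


# Toy codes (assumed): two descriptors, three dictionary atoms.
codes = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
avg_feature = average_pool(codes)
max_feature = max_pool(codes)
```

Average pooling is a linear function of the codes, so gradients pass through it to every descriptor, whereas max pooling only passes a gradient to the descriptor that attains the maximum.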
38 To jointly learn the dictionary and classification model, the model can be formulated as follows: $\min_{W,D} \frac{\lambda}{2}\,\dots$ [sent-160, score-0.655]
39 Multi-Scale Dictionary Learning. Given the tree structure, our goal is to learn a set of discriminative dictionaries for discrimination among the sibling nodes. [sent-183, score-0.652]
40 Based on the above analysis, we propose to learn the dictionaries in different layers to encode the descriptors of different scales. [sent-188, score-0.611]
41 Let T denote the set of leaf nodes (categories) in the tree, T̄ denote the set of internodes, which represent the hyper-categories, and T+ = T ∪ T̄ denote the set containing all the nodes in the tree. [sent-189, score-0.479]
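The three node sets can be derived from a child map as sketched below. The tree shape and the variable names (`children`, `all_nodes`, `internodes`, `leaves`) are assumptions made for illustration only.

```python
# Toy hierarchy (assumed): keys are parents, values are their child nodes.
children = {
    "V0": ["V1,1", "V1,2", "V1,3"],
    "V1,1": ["V2,1", "V2,2"],
}

# T+ : every node that appears as a parent or as a child.
all_nodes = set(children) | {c for kids in children.values() for c in kids}
# T-bar : internodes, i.e. nodes that have at least one child.
internodes = {node for node in all_nodes if children.get(node)}
# T : leaf nodes, i.e. the categories.
leaves = all_nodes - internodes

# One dictionary is learnt per internode.
num_dictionaries = len(internodes)
```

In this toy tree two dictionaries serve four leaf categories, which illustrates why the number of dictionaries stays below the number of categories.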
42 We need to learn a single dictionary Dt shared by its child nodes. [sent-200, score-0.831]
43 For the child node v of node t, we define a response function f(∗). [sent-201, score-0.376]
44 The dictionaries in different layers are learnt to discover the valuable properties with different scales. [sent-212, score-0.682]
45 Multi-Level Dictionary Learning for Hierarchical Representation. Given the hierarchical structure, we learn a set of discriminative dictionaries to encode the descriptors of different scales. [sent-215, score-0.624]
46 The dictionary learnt on an internode consists of the specific properties for discriminating the children nodes, and these properties are supposed to be embodied in their children nodes. [sent-216, score-1.103]
47 Therefore, the dictionaries in the higher levels can be regarded as the sharing properties for the groups of correlated categories in lower levels, and they can be inherited by the child nodes through the tree path. [sent-217, score-1.202]
48 1, the corresponding D1,1 is expressed as D1,1 = [D0, D10,1], where D0 denotes the inherited dictionary from V0 [sent-219, score-0.889]
49 and D10,1 denotes the specific dictionary learnt in node V1,1. [sent-220, score-1.008]
50 Then, for the child node V2,1, the response function of sample i(Eq. [sent-223, score-0.276]
51 The dictionary and classifier learning based on hierarchical structure can be revised from Eq. [sent-225, score-0.852]
52 $\dots\|_F^2 + \mathrm{loss}(W, D^+, X, Y)$ (9), where $D^+$ represents the set of dictionaries in the tree and W denotes the classifier matrix embedded in the structure. [sent-230, score-0.457]
53 In this algorithm, the information propagates via multilevel dictionaries in a top-down fashion. [sent-233, score-0.383]
54 For the nodes in lower layers, the learnt dictionaries are desired to encode more specific information. [sent-234, score-0.792]
55 On the other hand, the inherited dictionaries from ancestors consist of more general information, based on which the response z should be used to minimize the classification loss. [sent-235, score-0.659]
56 For an internode t, the learning process of dictionary and class models is done iteratively, which consists of two steps: 1) Coding: by fixing the dictionary Dt, we compute the coefficients and generate the features zt of the samples. [sent-242, score-1.544]
57 2) Dictionary and class models updating: based on the features computed by previous dictionary, we optimize the class models and dictionary simultaneously. [sent-243, score-0.701]
58 Particularly, the specific part of the dictionary needs to be updated rather than the inherited part. [sent-244, score-0.899]
59 In fact, since the inherited dictionary has already been optimized in the higher layers, it should be inherited without any update by the classification models of the descendant nodes in lower levels. [sent-245, score-1.261]
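The rule that only the specific part is updated can be sketched as a plain gradient step. The function name, the learning rate `lr` and the toy gradient are hypothetical; the actual gradient expression is not recoverable from this text, so it is passed in as a given value.

```python
def update_specific_part(inherited, specific, grad_specific, lr):
    """Take one gradient step on the specific atoms; inherited atoms stay frozen."""
    new_specific = [
        [a - lr * g for a, g in zip(atom, grad_atom)]
        for atom, grad_atom in zip(specific, grad_specific)
    ]
    # The full dictionary is the inherited block followed by the updated block.
    return inherited + new_specific


# Toy values (assumed): one inherited atom and one specific atom in 2-D.
inherited = [[1.0, 0.0]]
specific = [[0.0, 1.0]]
grad_specific = [[0.5, -0.5]]
full_dictionary = update_specific_part(inherited, specific, grad_specific, lr=0.5)
```

Only the second atom moves; the inherited atom is returned unchanged, as the sentences above require.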
60 The dictionary updating is a loss minimization problem through the learnt features z. [sent-247, score-0.878]
61 As the loss function is differentiable with respect to the dictionary and class model parameters, the gradient of the specific dictionary Dt of internode t and the class model of its child node v can be computed as follows: … where: βiΛ,p = … [sent-248, score-1.835]
62 After learning the dictionary and discriminative class models, we can directly use the dictionary to calculate the visual feature z, and use w for classification. [sent-258, score-1.487]
63 As the max pooling is helpful to enhance the performance, we then test the dictionary with max spatial pooling [2]. [sent-259, score-0.821]
64 For dictionary Dt, D̄t denotes the dictionaries inherited from ancestor nodes, and the specific part Dt0 is initialized by the unsupervised dictionary initDl. [sent-263, score-2.234]
65 For each iteration, only the specific part Dt0 is updated, and the Wt is the weight of the representation which is generated via the whole dictionary Dt. [sent-264, score-0.703]
66 Due to the fact that inherited dictionary is not updated, the corresponding representations can be saved and directly used for classification in lower levels. [sent-265, score-0.948]
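The reuse of saved representations can be sketched with a simple cache. Two assumptions are made loudly here: the response of each dictionary block is computed independently of the other blocks, and `encode_block` is a stand-in coder (inner products) rather than the local coding used in the paper. All names are hypothetical.

```python
encode_calls = []  # records the block sizes that were actually encoded


def encode_block(x, atoms):
    """Stand-in coder: inner product of x with each atom (not the paper's coder)."""
    encode_calls.append(len(atoms))
    return [sum(a * b for a, b in zip(x, atom)) for atom in atoms]


def represent(x, path, specific, cache):
    """Concatenate block responses along a root-to-node path, reusing cached blocks."""
    z = []
    for node in path:
        if node not in cache:
            cache[node] = encode_block(x, specific[node])
        z.extend(cache[node])
    return z


# Toy setup (assumed): a root block with two atoms and a child block with one.
specific = {"V0": [[1.0, 0.0], [0.0, 1.0]], "V1,1": [[1.0, 1.0]]}
x = [2.0, 3.0]
cache = {}
z_top = represent(x, ["V0"], specific, cache)
z_low = represent(x, ["V0", "V1,1"], specific, cache)
```

The second call encodes only the new block of V1,1, since the response to the inherited block is already in the cache.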
67 Experiments. In this section, we evaluate our dictionary learning approach on two databases: SUN397 [27] and ImageNet subset [7]. [sent-267, score-0.711]
68 Since some nodes in the hierarchy connect to more than one parent node, we change the original structure by choosing one parent node for them. [sent-275, score-0.396]
69 Considering the time complexity, we choose the median dictionary sizes (256, 256, 512) for the specific part of the dictionaries in different layers in our method. [sent-281, score-1.192]
70 Thus, the dictionary sizes in different layers of the hierarchy are 256, 512 and 1024 in our experiments. [sent-282, score-0.92]
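The relation between the specific sizes and the total sizes is a running sum, since each layer appends its specific atoms to everything it inherits. A short check, with variable names chosen only for illustration:

```python
from itertools import accumulate

# Sizes of the specific parts per layer, as stated in the text.
specific_sizes = [256, 256, 512]

# Each layer's full dictionary holds all inherited atoms plus its own.
total_sizes = list(accumulate(specific_sizes))
```

The running sum gives 256, 512 and 1024, matching the sizes reported for the three layers.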
71 The learnt specific dictionary sizes in different layers are all set as 512 in our experiments. [sent-294, score-1.019]
72 For our method, it consists of three basic components: hierarchical structure, multinomial logistic classification, dictionary learning. [sent-295, score-0.811]
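The multinomial logistic component can be sketched as a softmax over the child nodes of one internode. This is a generic formulation written under assumptions: the feature `z`, the per-child weight vectors `weights` and the function name are hypothetical, and the paper's exact response function is not recoverable from this text.

```python
import math


def child_probabilities(z, weights):
    """Softmax over linear scores w_v . z, one weight vector per child node."""
    scores = [sum(w * f for w, f in zip(w_v, z)) for w_v in weights]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy values (assumed): a 2-D pooled feature and two child nodes.
z = [1.0, 2.0]
weights = [[1.0, 0.0], [0.0, 1.0]]
probs = child_probabilities(z, weights)
```

The child whose weight vector aligns better with the feature receives the larger probability, and the probabilities over the siblings sum to one.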
73 Based on the task-independent dictionary learnt by Sparse Coding, this method trains the linear SVM with one-vs-all strategy. [sent-298, score-0.846]
74 Based on the task-independent dictionary learnt by Sparse Coding, we use the similar empirical loss in [33] and train the SVM with the hierarchical structure by SVM-struct package [13]. [sent-301, score-1.019]
75 The method trains the SVM with one-vs-all strategy, and learns a dictionary for each category. [sent-304, score-0.71]
76 With respect to the training time, we accumulate the time of three steps: dictionary learning, mid-level feature computation and model learning. [sent-309, score-0.655]
77 For the unsupervised dictionary learning methods (Bi-ScSPM and H-ScSPM), the time cost in dictionary learning is trivial compared with the other two steps. [sent-310, score-1.47]
78 The number of dictionaries we need to learn is equal to the number of internodes and L − 1 features are computed for each image. [sent-314, score-0.482]
79 For the supervised dictionary learning, the time cost depends on the number of dictionaries. [sent-337, score-0.703]
80 However, the descriptors need to be computed through all the dictionaries in Bi-TDDL method, so the time complexity increases drastically. [sent-341, score-0.38]
81 Even so, our method also outperforms H-ScSPM with the help of hierarchical dictionary learning. [sent-350, score-0.761]
82 We can clearly see the promising results of MLDDL on hierarchical error, especially on the top-1 error rate, since our proposed method fully optimizes both the multilevel dictionaries and the discriminative models towards the … [sent-354, score-0.569]
83 Result Analysis and Discussion. To investigate the relation among the dictionaries in different layers, we use another strategy (named as ML-DDL0) for dictionary learning in the hierarchical structure. [sent-366, score-1.18]
84 In ML-DDL0, the dictionaries in lower levels do not inherit the dictionary from ancestor nodes, in other words, the dictionaries in different layers only have the specific parts which are learnt by discrimination models. [sent-367, score-2.037]
85 On the other hand, the nodes in lower levels are so visually similar that they are much harder to be distinguished compared with nodes in higher levels. [sent-371, score-0.45]
86 However, the comparison between and H-ScSPM shows that the problem could be relieved by the benefit from dictionary learning. [sent-372, score-0.655]
87 Furthermore, the effect of the dictionary inheritance has also been revealed by the performance difference between ML-DDL and ML-DDL0. We can see that the accuracy has been improved in the lower layers, especially at the leaf nodes. [sent-373, score-0.84]
88 This implies that the properties captured from the ancestor nodes are of great importance for child nodes. [sent-374, score-0.4]
89 Different from the sharing model based on pre-computed features (sibling child nodes inherit the common information from ancestor nodes) [22], these properties can be selected and weighted via class models in child nodes, thus have more flexibilities. [sent-375, score-0.632]
90 Due to dictionary inheritance in the hierarchy, the image representations in lower levels integrate all the useful information of multiple scales, which is beneficial to model capacity. [sent-376, score-0.916]
91 Given a test image, its local descriptors are encoded based on the specific dictionary in each layer, and the reconstruction coefficients regarded as the response of the dictionary are pooled to represent the image. [sent-381, score-1.477]
92 The dictionaries learnt in each layer consist of atoms which are biased to ML-DDL0 and ML-DDL0. [sent-382, score-0.644]
93 The points with different colors denote the locations having large values of the response using the learnt dictionaries in different levels. [sent-386, score-0.563]
94 It shows that the dictionaries in different layers consist of specific atoms, thus different parts of the visual information are highlighted at different layers. [sent-387, score-0.618]
95 Compared with binary dictionary learning with flat structure (Bi-TDDL), the recognition accuracy of a lot of categories has been improved by ML-DDL. [sent-388, score-0.901]
96 Besides, the misclassification in higher levels spreads through the path and finally incurs the misclassification of the child nodes. [sent-412, score-0.329]
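The way an early mistake spreads down the path can be sketched with a greedy top-down traversal. The tree, the `score` table and the function name are assumptions for illustration; in practice the scores would come from the class models at each internode.

```python
def predict_top_down(root, children, score):
    """Follow the best-scoring child at every internode until a leaf is reached."""
    path = [root]
    while children.get(path[-1]):
        parent = path[-1]
        best = max(children[parent], key=lambda child: score[(parent, child)])
        path.append(best)
    return path


# Toy hierarchy and scores (assumed).
children = {"V0": ["V1,1", "V1,2"], "V1,1": ["V2,1", "V2,2"], "V1,2": ["V2,3"]}
score = {
    ("V0", "V1,1"): 0.3,
    ("V0", "V1,2"): 0.7,
    ("V1,1", "V2,1"): 0.9,
    ("V1,1", "V2,2"): 0.1,
    ("V1,2", "V2,3"): 1.0,
}
path = predict_top_down("V0", children, score)
```

Once V1,2 is chosen at the first level, the leaves under V1,1 can no longer be reached, even though V2,1 has a high score under its own parent.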
97 Therefore, the visual coherence of the hierarchy is directly related to the result, and finding a better tree structure is very important for visual classification. [sent-413, score-0.314]
98 Conclusion. In this paper, we present a hierarchical dictionary learning approach. [sent-415, score-0.817]
99 The dictionaries in different layers are learnt to capture the discriminative information of different scales. [sent-416, score-0.712]
100 Besides, the inheritance of dictionaries in the hierarchical structure enables the categories in lower levels to exploit the features of multiple scales. [sent-417, score-0.801]
wordName wordTfidf (topN-words)
[('dictionary', 0.655), ('dictionaries', 0.34), ('inherited', 0.196), ('learnt', 0.167), ('internode', 0.155), ('layers', 0.149), ('nodes', 0.145), ('child', 0.12), ('categories', 0.117), ('hierarchy', 0.116), ('internodes', 0.111), ('ancestor', 0.109), ('hierarchical', 0.106), ('node', 0.1), ('levels', 0.096), ('coding', 0.087), ('pooling', 0.083), ('tree', 0.079), ('sibling', 0.078), ('leaf', 0.078), ('inheritance', 0.066), ('layer', 0.063), ('dt', 0.058), ('discriminative', 0.056), ('loss', 0.056), ('response', 0.056), ('learning', 0.056), ('imagenet', 0.054), ('category', 0.053), ('db', 0.052), ('encode', 0.051), ('supervised', 0.048), ('unsupervised', 0.048), ('specific', 0.048), ('inherit', 0.047), ('discrimination', 0.045), ('xi', 0.045), ('mlddl', 0.044), ('xwpu', 0.044), ('categorization', 0.044), ('misclassification', 0.044), ('kb', 0.043), ('multilevel', 0.043), ('sharing', 0.042), ('visual', 0.042), ('besides', 0.041), ('lower', 0.041), ('cas', 0.04), ('descriptors', 0.04), ('zit', 0.039), ('consist', 0.039), ('flat', 0.038), ('denotes', 0.038), ('signals', 0.036), ('atoms', 0.035), ('structure', 0.035), ('semantic', 0.033), ('china', 0.033), ('jenatton', 0.033), ('scspm', 0.031), ('ranganath', 0.031), ('mairal', 0.031), ('learn', 0.031), ('wv', 0.03), ('classifying', 0.03), ('beneficial', 0.03), ('beijing', 0.029), ('shen', 0.028), ('sparse', 0.028), ('representations', 0.028), ('classification', 0.028), ('boureau', 0.027), ('grosse', 0.027), ('yu', 0.027), ('supposed', 0.026), ('receptive', 0.026), ('properties', 0.026), ('logistic', 0.026), ('shared', 0.025), ('path', 0.025), ('icml', 0.024), ('bach', 0.024), ('ix', 0.024), ('fv', 0.024), ('xc', 0.024), ('belonging', 0.024), ('multinomial', 0.024), ('trains', 0.024), ('promising', 0.024), ('zi', 0.024), ('multiclass', 0.024), ('numbers', 0.023), ('brought', 0.023), ('pooled', 0.023), ('class', 0.023), ('torralba', 0.023), ('stage', 0.023), ('among', 0.023), ('similarity', 0.023), ('distinguished', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000006 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
2 0.56104934 392 cvpr-2013-Separable Dictionary Learning
Author: Simon Hawe, Matthias Seibert, Martin Kleinsteuber
Abstract: Many techniques in computer vision, machine learning, and statistics rely on the fact that a signal of interest admits a sparse representation over some dictionary. Dictionaries are either available analytically, or can be learned from a suitable training set. While analytic dictionaries permit to capture the global structure of a signal and allow a fast implementation, learned dictionaries often perform better in applications as they are more adapted to the considered class of signals. In imagery, unfortunately, the numerical burden for (i) learning a dictionary and for (ii) employing the dictionary for reconstruction tasks only allows to deal with relatively small image patches that only capture local image information. The approach presented in this paper aims at overcoming these drawbacks by allowing a separable structure on the dictionary throughout the learning process. On the one hand, this permits larger patch-sizes for the learning phase, on the other hand, the dictionary is applied efficiently in reconstruction tasks. The learning procedure is based on optimizing over a product of spheres which updates the dictionary as a whole, thus enforces basic dictionary properties such as mutual coherence explicitly during the learning procedure. In the special case where no separable structure is enforced, our method competes with state-of-the-art dictionary learning methods like K-SVD.
Author: Li He, Hairong Qi, Russell Zaretzki
Abstract: This paper addresses the problem of learning overcomplete dictionaries for the coupled feature spaces, where the learned dictionaries also reflect the relationship between the two spaces. A Bayesian method using a beta process prior is applied to learn the over-complete dictionaries. Compared to previous couple feature spaces dictionary learning algorithms, our algorithm not only provides dictionaries that customized to each feature space, but also adds more consistent and accurate mapping between the two feature spaces. This is due to the unique property of the beta process model that the sparse representation can be decomposed to values and dictionary atom indicators. The proposed algorithm is able to learn sparse representations that correspond to the same dictionary atoms with the same sparsity but different values in coupled feature spaces, thus bringing consistent and accurate mapping between coupled feature spaces. Another advantage of the proposed method is that the number of dictionary atoms and their relative importance may be inferred non-parametrically. We compare the proposed approach to several state-of-the-art dictionary learning methods by applying this method to single image super-resolution. The experimental results show that dictionaries learned by our method produces the best super-resolution results compared to other state-of-the-art methods.
4 0.46738502 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
Author: Sumit Shekhar, Vishal M. Patel, Hien V. Nguyen, Rama Chellappa
Abstract: Data-driven dictionaries have produced state-of-the-art results in various classification tasks. However, when the target data has a different distribution than the source data, the learned sparse representation may not be optimal. In this paper, we investigate if it is possible to optimally represent both source and target by a common dictionary. Specifically, we describe a technique which jointly learns projections of data in the two domains, and a latent dictionary which can succinctly represent both the domains in the projected low-dimensional space. An efficient optimization technique is presented, which can be easily kernelized and extended to multiple domains. The algorithm is modified to learn a common discriminative dictionary, which can be further used for classification. The proposed approach does not require any explicit correspondence between the source and target domains, and shows good results even when there are only a few labels available in the target domain. Various recognition experiments show that the method performs on par or better than competitive state-of-the-art methods.
5 0.46041566 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
Author: Yangmuzi Zhang, Zhuolin Jiang, Larry S. Davis
Abstract: An approach to learn a structured low-rank representation for image classification is presented. We use a supervised learning method to construct a discriminative and reconstructive dictionary. By introducing an ideal regularization term, we perform low-rank matrix recovery for contaminated training data from all categories simultaneously without losing structural information. A discriminative low-rank representation for images with respect to the constructed dictionary is obtained. With semantic structure information and strong identification capability, this representation is good for classification tasks even using a simple linear multi-classifier. Experimental results demonstrate the effectiveness of our approach.
6 0.43130642 315 cvpr-2013-Online Robust Dictionary Learning
7 0.42742762 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
8 0.34622842 422 cvpr-2013-Tag Taxonomy Aware Dictionary Learning for Region Tagging
9 0.3363649 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
10 0.31167445 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning
11 0.23690537 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
12 0.23058996 302 cvpr-2013-Multi-task Sparse Learning with Beta Process Prior for Action Recognition
13 0.20852858 204 cvpr-2013-Histograms of Sparse Codes for Object Detection
14 0.20029026 340 cvpr-2013-Probabilistic Label Trees for Efficient Large Scale Image Classification
15 0.16036566 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
16 0.15521538 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition
17 0.15520079 399 cvpr-2013-Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer
18 0.14438312 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification
19 0.12646672 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video
20 0.12187497 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning
topicId topicWeight
[(0, 0.238), (1, -0.267), (2, -0.353), (3, 0.415), (4, -0.126), (5, -0.144), (6, 0.164), (7, 0.236), (8, -0.066), (9, 0.131), (10, -0.003), (11, 0.101), (12, 0.018), (13, 0.043), (14, 0.05), (15, 0.022), (16, 0.002), (17, 0.055), (18, 0.019), (19, -0.034), (20, 0.017), (21, 0.023), (22, 0.014), (23, 0.016), (24, -0.021), (25, 0.025), (26, 0.029), (27, 0.014), (28, -0.039), (29, 0.11), (30, -0.04), (31, -0.007), (32, 0.0), (33, 0.015), (34, 0.078), (35, -0.001), (36, 0.023), (37, -0.034), (38, 0.036), (39, -0.026), (40, 0.01), (41, 0.017), (42, -0.031), (43, 0.064), (44, 0.021), (45, 0.004), (46, -0.043), (47, 0.0), (48, 0.023), (49, -0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.96368074 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
2 0.95267373 392 cvpr-2013-Separable Dictionary Learning
Author: Simon Hawe, Matthias Seibert, Martin Kleinsteuber
Abstract: Many techniques in computer vision, machine learning, and statistics rely on the fact that a signal of interest admits a sparse representation over some dictionary. Dictionaries are either available analytically, or can be learned from a suitable training set. While analytic dictionaries permit to capture the global structure of a signal and allow a fast implementation, learned dictionaries often perform better in applications as they are more adapted to the considered class of signals. In imagery, unfortunately, the numerical burden for (i) learning a dictionary and for (ii) employing the dictionary for reconstruction tasks only allows to deal with relatively small image patches that only capture local image information. The approach presented in this paper aims at overcoming these drawbacks by allowing a separable structure on the dictionary throughout the learning process. On the one hand, this permits larger patch-sizes for the learning phase, on the other hand, the dictionary is applied efficiently in reconstruction tasks. The learning procedure is based on optimizing over a product of spheres which updates the dictionary as a whole, thus enforces basic dictionary properties such as mutual coherence explicitly during the learning procedure. In the special case where no separable structure is enforced, our method competes with state-of-the-art dictionary learning methods like K-SVD.
Author: Li He, Hairong Qi, Russell Zaretzki
Abstract: This paper addresses the problem of learning overcomplete dictionaries for the coupled feature spaces, where the learned dictionaries also reflect the relationship between the two spaces. A Bayesian method using a beta process prior is applied to learn the over-complete dictionaries. Compared to previous coupled feature spaces dictionary learning algorithms, our algorithm not only provides dictionaries that are customized to each feature space, but also adds more consistent and accurate mapping between the two feature spaces. This is due to the unique property of the beta process model that the sparse representation can be decomposed to values and dictionary atom indicators. The proposed algorithm is able to learn sparse representations that correspond to the same dictionary atoms with the same sparsity but different values in coupled feature spaces, thus bringing consistent and accurate mapping between coupled feature spaces. Another advantage of the proposed method is that the number of dictionary atoms and their relative importance may be inferred non-parametrically. We compare the proposed approach to several state-of-the-art dictionary learning methods by applying this method to single image super-resolution. The experimental results show that dictionaries learned by our method produce the best super-resolution results compared to other state-of-the-art methods.
4 0.92986143 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning
Author: Yu-Tseh Chi, Mohsen Ali, Ajit Rajwade, Jeffrey Ho
Abstract: This paper proposes a dictionary learning framework that combines the proposed block/group (BGSC) or reconstructed block/group (R-BGSC) sparse coding schemes with the novel Intra-block Coherence Suppression Dictionary Learning (ICS-DL) algorithm. An important and distinguishing feature of the proposed framework is that all dictionary blocks are trained simultaneously with respect to each data group while the intra-block coherence being explicitly minimized as an important objective. We provide both empirical evidence and heuristic support for this feature that can be considered as a direct consequence of incorporating both the group structure for the input data and the block structure for the dictionary in the learning process. The optimization problems for both the dictionary learning and sparse coding can be solved efficiently using block-gradient descent, and the details of the optimization algorithms are presented. We evaluate the proposed methods using well-known datasets, and favorable comparisons with state-of-the-art dictionary learning methods demonstrate the viability and validity of the proposed framework.
5 0.91305906 315 cvpr-2013-Online Robust Dictionary Learning
Author: Cewu Lu, Jiaping Shi, Jiaya Jia
Abstract: Online dictionary learning is particularly useful for processing large-scale and dynamic data in computer vision. It, however, faces the major difficulty to incorporate robust functions, rather than the square data fitting term, to handle outliers in training data. In this paper, we propose a new online framework enabling the use of ℓ1 sparse data fitting term in robust dictionary learning, notably enhancing the usability and practicality of this important technique. Extensive experiments have been carried out to validate our new framework.
6 0.88631159 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification
7 0.83761722 125 cvpr-2013-Dictionary Learning from Ambiguously Labeled Data
8 0.77643299 185 cvpr-2013-Generalized Domain-Adaptive Dictionaries
9 0.76852995 5 cvpr-2013-A Bayesian Approach to Multimodal Visual Dictionary Learning
10 0.71528327 422 cvpr-2013-Tag Taxonomy Aware Dictionary Learning for Region Tagging
11 0.60384709 204 cvpr-2013-Histograms of Sparse Codes for Object Detection
12 0.6007992 302 cvpr-2013-Multi-task Sparse Learning with Beta Process Prior for Action Recognition
13 0.58362722 220 cvpr-2013-In Defense of Sparsity Based Face Recognition
14 0.5696553 83 cvpr-2013-Classification of Tumor Histology via Morphometric Context
15 0.53782159 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation
16 0.44193947 304 cvpr-2013-Multipath Sparse Coding Using Hierarchical Matching Pursuit
17 0.42429647 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition
18 0.42078701 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification
19 0.39422226 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation
20 0.3884064 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition
topicId topicWeight
[(10, 0.103), (16, 0.021), (26, 0.039), (28, 0.012), (33, 0.345), (43, 0.122), (57, 0.012), (67, 0.072), (69, 0.085), (87, 0.078)]
simIndex simValue paperId paperTitle
1 0.97889936 83 cvpr-2013-Classification of Tumor Histology via Morphometric Context
Author: Hang Chang, Alexander Borowsky, Paul Spellman, Bahram Parvin
Abstract: Image-based classification of tissue histology, in terms of different components (e.g., normal signature, categories of aberrant signatures), provides a series of indices for tumor composition. Subsequently, aggregation of these indices in each whole slide image (WSI) from a large cohort can provide predictive models of clinical outcome. However, the performance of the existing techniques is hindered as a result of large technical and biological variations that are always present in a large cohort. In this paper, we propose two algorithms for classification of tissue histology based on robust representations of morphometric context, which are built upon nuclear level morphometric features at various locations and scales within the spatial pyramid matching (SPM) framework. These methods have been evaluated on two distinct datasets of different tumor types collected from The Cancer Genome Atlas (TCGA), and the experimental results indicate that our methods are (i) extensible to different tumor types; (ii) robust in the presence of wide technical and biological variations; (iii) invariant to different nuclear segmentation strategies; and (iv) scalable with varying training sample size. In addition, our experiments suggest that enforcing sparsity, during the construction of morphometric context, further improves the performance of the system.
2 0.96882993 368 cvpr-2013-Rolling Shutter Camera Calibration
Author: Luc Oth, Paul Furgale, Laurent Kneip, Roland Siegwart
Abstract: Rolling Shutter (RS) cameras are used across a wide range of consumer electronic devices, from smart-phones to high-end cameras. It is well known that if an RS camera is used with a moving camera or scene, significant image distortions are introduced. The quality or even success of structure from motion on rolling shutter images requires the usual intrinsic parameters such as focal length and distortion coefficients as well as accurate modelling of the shutter timing. The current state-of-the-art technique for calibrating the shutter timings requires specialised hardware. We present a new method that only requires video of a known calibration pattern. Experimental results on over 60 real datasets show that our method is more accurate than the current state of the art.
same-paper 3 0.9517169 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization
Author: Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang
Abstract: For the task of visual categorization, the learning model is expected to be endowed with discriminative visual feature representation and flexibilities in processing many categories. Many existing approaches are designed based on a flat category structure, or rely on a set of pre-computed visual features, hence may not be appreciated for dealing with large numbers of categories. In this paper, we propose a novel dictionary learning method by taking advantage of hierarchical category correlation. For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity. Moreover, the dictionaries in lower levels also inherit the dictionary of ancestor nodes, so that categories in lower levels are described with multi-scale visual information using our dictionary learning approach. Experiments on ImageNet object data subset and SUN397 scene dataset demonstrate that our approach achieves promising performance on data with large numbers of classes compared with some state-of-the-art methods, and is more efficient in processing large numbers of categories.
4 0.94632965 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases
Author: Wongun Choi, Yu-Wei Chao, Caroline Pantofaru, Silvio Savarese
Abstract: Visual scene understanding is a difficult problem interleaving object detection, geometric reasoning and scene classification. We present a hierarchical scene model for learning and reasoning about complex indoor scenes which is computationally tractable, can be learned from a reasonable amount of training data, and avoids oversimplification. At the core of this approach is the 3D Geometric Phrase Model which captures the semantic and geometric relationships between objects which frequently co-occur in the same 3D spatial configuration. Experiments show that this model effectively explains scene semantics, geometry and object groupings from a single image, while also improving individual object detections.
5 0.94522089 427 cvpr-2013-Texture Enhanced Image Denoising via Gradient Histogram Preservation
Author: Wangmeng Zuo, Lei Zhang, Chunwei Song, David Zhang
Abstract: Image denoising is a classical yet fundamental problem in low level vision, as well as an ideal test bed to evaluate various statistical image modeling methods. One of the most challenging problems in image denoising is how to preserve the fine scale texture structures while removing noise. Various natural image priors, such as gradient based prior, nonlocal self-similarity prior, and sparsity prior, have been extensively exploited for noise removal. The denoising algorithms based on these priors, however, tend to smooth the detailed image textures, degrading the image visual quality. To address this problem, in this paper we propose a texture enhanced image denoising (TEID) method by enforcing the gradient distribution of the denoised image to be close to the estimated gradient distribution of the original image. A novel gradient histogram preservation (GHP) algorithm is developed to enhance the texture structures while removing noise. Our experimental results demonstrate that the proposed GHP based TEID can well preserve the texture features of the denoised images, making them look more natural.
6 0.9444021 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection
7 0.94429541 94 cvpr-2013-Context-Aware Modeling and Recognition of Activities in Video
8 0.9441458 256 cvpr-2013-Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning
9 0.94365203 329 cvpr-2013-Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images
10 0.94340146 132 cvpr-2013-Discriminative Re-ranking of Diverse Segmentations
11 0.9430896 36 cvpr-2013-Adding Unlabeled Samples to Categories by Learned Attributes
12 0.94280565 372 cvpr-2013-SLAM++: Simultaneous Localisation and Mapping at the Level of Objects
13 0.94278461 318 cvpr-2013-Optimized Pedestrian Detection for Multiple and Occluded People
14 0.94233936 82 cvpr-2013-Class Generative Models Based on Feature Regression for Pose Estimation of Object Categories
15 0.94203079 416 cvpr-2013-Studying Relationships between Human Gaze, Description, and Computer Vision
16 0.94163382 61 cvpr-2013-Beyond Point Clouds: Scene Understanding by Reasoning Geometry and Physics
17 0.94139349 417 cvpr-2013-Subcategory-Aware Object Classification
18 0.94135058 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection
19 0.94134921 242 cvpr-2013-Label Propagation from ImageNet to 3D Point Clouds
20 0.94134349 284 cvpr-2013-Mesh Based Semantic Modelling for Indoor and Outdoor Scenes