iccv iccv2013 iccv2013-276 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Chen-Kuo Chiang, Te-Feng Su, Chih Yen, Shang-Hong Lai
Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. Then, an objective function is presented to learn categorydependent dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with correct dictionary) and label-consistent (encouraging the labels of dictionary atoms to be similar). We have demonstrated our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm.
Reference: text
sentIndex sentText sentNum sentScore
1 Multi-Attributed Dictionary Learning for Sparse Coding. Chen-Kuo Chiang, Te-Feng Su, Chih Yen and Shang-Hong Lai, National Tsing Hua University, Hsinchu, 300, Taiwan. Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. [sent-1, score-0.615]
2 Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. [sent-2, score-0.348]
3 Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm. [sent-5, score-0.55]
4 Lately, learning the dictionary instead of using predefined bases has been shown to improve signal reconstruction significantly. [sent-10, score-0.645]
5 Dictionary learning for sparse representation aims to find the optimal dictionary that leads to the lowest reconstruction error with a set of sparse coefficients. [sent-11, score-0.775]
6 [18] exploited the entire training set as the dictionary and proposed the sparse representation classification (SRC) for robust face recognition. [sent-13, score-0.711]
7 [12] assumed a correct dictionary associated with one class. [sent-16, score-0.561]
8 Example of utilizing multiple attributes in dictionary learning for sparse representation with attributes of facial expressions, pose variations and lighting conditions. [sent-20, score-1.114]
9 The K-SVD algorithm [1] learns an over-complete dictionary from a set of signals. [sent-23, score-0.511]
10 Since it focuses on the representation power of the dictionary without considering the discrimination capability, the Discriminative K-SVD algorithm (D-KSVD) [20] achieved the representational and discriminative dictionary learning in a unified process. [sent-25, score-1.215]
11 Submodular dictionary learning [9] models the selection of the dictionary columns and the sparse representation of signals as a joint combinatorial optimization problem. [sent-27, score-1.153]
12 Later, a compact and discriminative submodular dictionary learning was proposed by a greedy-based approach [6]. [sent-28, score-0.69]
13 Dictionary selection by considering data connectivity and attribute similarity. [sent-30, score-0.286]
14 A label consistent K-SVD (LC-KSVD) algorithm [5] associated the class labels with each dictionary atom to enforce discrimination in sparse codes. [sent-32, score-0.783]
15 A recent work [16] learned a context-aware dictionary from a set of labeled training images to predict the presence of objects in images. [sent-34, score-0.547]
16 In the existing methods, only a single attribute or class label is considered in the dictionary learning problem. [sent-35, score-0.835]
17 The compact term favors close dictionary atoms by combining both data distance and attribute similarity into one unified distance measure. [sent-44, score-1.092]
18 The reconstruction term introduces the representative ability by selecting dictionary atoms with minimal reconstruction errors. [sent-45, score-0.882]
19 Last, the label term enforces label-consistent dictionary atoms from multi-attributed training samples. [sent-46, score-0.772]
20 The transition probability of the graph is utilized to measure a new distance capturing simultaneously how close the sample pair is and how many attributes they share. [sent-48, score-0.374]
21 We present an objective function for dictionary learning that considers the data representation capability, the discrimination power, and the label consistency of multiple attributes in a unified framework. [sent-49, score-0.941]
22 Problem Statement Given a signal x in Rm, a sparse approximation over a dictionary D in Rm×k is to find a linear combination of a few atoms from D that is close to the signal x, where the k columns selected from D are referred to as dictionary atoms. [sent-53, score-1.232]
23 Therefore, dictionary D can be represented as D = [D(1), . . . , D(K)]. [sent-61, score-0.511]
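To make the sparse approximation concrete, here is a minimal numpy sketch (our illustration, not the authors' code) of greedy orthogonal matching pursuit, which picks a few atoms of D and least-squares fits their coefficients; the function name sparse_code and the n_nonzero budget are our own choices:

    import numpy as np

    def sparse_code(x, D, n_nonzero=5):
        # Greedy OMP: approximate signal x with a few unit-norm atoms (columns) of D.
        m, k = D.shape
        alpha = np.zeros(k)
        residual = x.copy()
        support = []
        for _ in range(n_nonzero):
            # pick the atom most correlated with the current residual
            j = int(np.argmax(np.abs(D.T @ residual)))
            if j not in support:
                support.append(j)
            # refit the selected atoms by least squares
            coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
            residual = x - D[:, support] @ coef
        alpha[support] = coef
        return alpha

With a learned D, the returned coefficient vector alpha is the sparse code used throughout the rest of the paper.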
24 For the face recognition problem, the attributes could be facial expression, face pose or a lighting condition, etc. [sent-74, score-0.457]
25 For example, the attribute for facial expression may be the smile, angry or screaming type. [sent-80, score-0.375]
26 The types in attribute ai are defined as ai = [ai1, . . . , aini]. [sent-81, score-0.328]
27 Considering data distance and attribute similarity in the dictionary learning problem, we can combine these two terms with appropriate weighting. [sent-85, score-0.83]
28 However, it is difficult to tune the weighting coefficients to achieve optimal performance as the number of attributes increases. [sent-86, score-0.247]
29 To learn the dictionary automatically and deterministically, we model the dictionary learning as a clustering problem. [sent-87, score-1.086]
30 A new distance measuring the pairwise relationship is proposed by considering both the Euclidean distance and the shared attributes between a pair of data points. [sent-89, score-0.314]
31 Then, the dictionaries are learned by partitioning the graph into K clusters by minimizing the objective function, which enforces the dictionaries to be compact, reconstructive, and label-consistent. [sent-90, score-0.751]
32 Distance Measure of Data and Attributes. To illustrate how the dictionary can be selected by graph clustering based on data connectivity (the k-nearest-neighbor relationship) and shared attributes, a simple example is depicted in Figure 2. [sent-92, score-0.618]
33 (p1, smile, 90◦, dark) represents that the image is from person 1, with a smile expression and a 90◦ face pose, captured under a dark lighting condition. [sent-95, score-0.267]
34 In Figure 2 (b), dictionary selection considers only data connectivity. [sent-96, score-0.538]
35 We argue that dictionary selection can be better achieved based on both data connectivity and multiple attributes, as illustrated in Figure 2 (d). [sent-98, score-0.58]
36 In this paper, we integrate the data distance and attribute similarity into a unified framework based on the construction of an augmented graph. [sent-99, score-0.36]
37 In addition to data vertices, we also add vertices for attributes (Figure 3). [sent-104, score-0.378]
38 The attribute set [a1, . . . , ar] contains a total of r attributes, which are associated with vertices in V. [sent-110, score-0.378]
39 Attribute vertices Va can be defined to associate with type j, j = 1, . . . [sent-115, score-0.222]
40 An edge between a data vertex and an attribute vertex (vi, vaji) ∈ Ea is constructed if the data vertex vi has the attribute ai with the type aji. [sent-119, score-0.646]
41 We can define an augmented graph with attributes Ga = (V ∪ Va, E ∪ Ea). [sent-122, score-0.255]
42 For brevity, the data vertices are called D nodes and the attribute vertices are called A nodes for the rest of this paper. [sent-130, score-0.733]
43 Disconnected vertices, or vertices with only a few paths between them, are considered to be far apart. [sent-135, score-0.481]
44 The entry Pql indicates the probability of traveling from vertex vq to vertex vl. [sent-138, score-0.774]
45 The transition probability matrix for traveling between two vertices in s steps is defined as: P^(1) = P, P^(s) = P^(s−1) P = P^s. [sent-139, score-0.344]
46 The transition probability measures the chance of reaching vl in 1, . . . , s steps. [sent-145, score-0.254]
47 Next, we give the definition of the transition probability between D nodes and A nodes. [sent-150, score-0.222]
48 The transition probability from a vertex vq in the D nodes to another vertex vl in the D nodes is defined by: P^(1)(vq, vl) = 1/(|Ω(vq)| + r) if (vq, vl) ∈ E, and 0 otherwise. (3) [sent-151, score-1.117]
49 It means vq can reach its neighbor vl in one step with the probability 1/(|Ω(vq)| + r) if vq and vl are connected by an edge. [sent-153, score-1.089]
50 This is intuitive since vq has |Ω(vq)| + r edges to other vertices. [sent-154, score-0.421]
51 Similarly, the transition probability from a vertex in the D nodes to a vertex in the A nodes is given by: P^(1)(vq, vaji) = 1/(|Ω(vq)| + r) if (vq, vaji) ∈ Ea, and 0 otherwise. (4) [sent-157, score-0.596]
52 The transition probability from a vertex in the A nodes to a vertex in the D nodes is given by: P^(1)(vaji, vq) = 1/|Ω(vaji)| if (vaji, vq) ∈ Ea, and 0 otherwise. (5) [sent-160, score-0.596]
53 Since there is no edge between any two A nodes, the transition probability between two vertices in the A nodes is zero: P^(1)(vaji, vast) = 0, ∀ vaji, vast ∈ Va. (6) From the transition probabilities defined in Eqs. (3)-(6), the s-step matrix P^(s) can then be computed. [sent-163, score-0.451]
54 In Figure 3, we use only the pose attribute as an example. [sent-166, score-0.244]
55 An example of the transition probability matrix P^(1) for nine vertices in the D nodes and two vertices in the A nodes was given here (the matrix itself, with entries such as 1/3 and 1/5, did not survive extraction). [sent-168, score-0.67]
56 By defining the transition probability between vertices in the augmented graph, we can note that for two vertices vq and vl with the same connectivity in graph Ga, if a vertex vt shares more attributes (Figure 4) [sent-173, score-1.56]
57 with vq than vl, the distance d(vt, vq) is less than d(vt, vl). [sent-175, score-0.484]
58 Vertex v1 shares the same attributes (90◦, angry) with v3 and shares only one attribute 90◦ with v2. [sent-178, score-0.463]
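The construction of Eqs. (3)-(6) can be sketched in a few lines of numpy; this is our reading, and the uniform row-normalization from A nodes in Eq. (5) is an assumption, since that formula did not survive extraction intact:

    import numpy as np

    def augmented_transition(knn_adj, attr_edges, s=3):
        # knn_adj: (n, n) 0/1 adjacency among data (D) vertices.
        # attr_edges: (n, t) 0/1 incidence between data vertices and the t
        # attribute-type (A) vertices; each row sums to r (one type per attribute).
        n, t = attr_edges.shape
        P = np.zeros((n + t, n + t))
        deg_d = knn_adj.sum(axis=1) + attr_edges.sum(axis=1)  # |Omega(vq)| + r
        P[:n, :n] = knn_adj / deg_d[:, None]                  # D -> D, Eq. (3)
        P[:n, n:] = attr_edges / deg_d[:, None]               # D -> A, Eq. (4)
        deg_a = np.maximum(attr_edges.sum(axis=0), 1)         # |Omega(vaji)|, guard empty types
        P[n:, :n] = attr_edges.T / deg_a[:, None]             # A -> D, assumed uniform, Eq. (5)
        # A -> A block stays zero, Eq. (6)
        return np.linalg.matrix_power(P, s)                   # s-step matrix P^(s)

Walking through A nodes is what gives attribute-sharing pairs extra s-step probability mass, matching the d(vt, vq) < d(vt, vl) behavior in the Figure 4 example.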
59 Instead of learning a single dictionary for the entire dataset, we learn K category-dependent sub-dictionaries D(1), . . . , D(K). [sent-184, score-0.55]
60 Dictionary learning aims for dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with the correct dictionary), and label-consistent (encouraging the labels of dictionary atoms to be similar). [sent-192, score-1.859]
61 In the following, we formulate the multi-attributed dictionary learning problem and describe the novel objective function for the optimization. [sent-193, score-0.58]
62 Compact Term. We use the compact term to favor dictionary atoms that are close to the centroid in data distance or share more attributes with it. [sent-196, score-1.01]
63 Denote by ¯v(k) the centroid of the atoms in dictionary D(k). [sent-199, score-0.729]
64 In the dictionary selection process, an atom vq is assigned to dictionary D(k∗) if it satisfies: k∗ = arg min_k d(vq, ¯v(k)). (9) [sent-206, score-1.509]
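A minimal sketch of the compact-term assignment of Eq. (9), given the learned pairwise distance matrix; the medoid-as-centroid choice follows the K-Medoids procedure described later in the text:

    import numpy as np

    def assign_by_compactness(dist, medoids):
        # dist: (n, n) learned distance matrix; medoids: indices of the K centroids.
        # Eq. (9): each atom vq goes to the dictionary with the nearest centroid.
        return np.argmin(dist[:, medoids], axis=1)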
65 Reconstruction Term. It is critical to learn a dictionary which is representative, i.e. [sent-208, score-0.511]
66 with low reconstruction error, since the discrimination power relies on low reconstruction error for representing a data sample using the correct dictionary. [sent-210, score-0.302]
67 A reconstruction term is introduced to encourage dictionary selection with minimal reconstruction error during the training process. [sent-211, score-0.8]
68 An atom vq is assigned to dictionary D(k∗) if it satisfies: k∗ = arg min_k ek(vq), where ek(vq) is the reconstruction error of vq under D(k). [sent-220, score-0.971]
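The reconstruction-term assignment can be sketched by scoring each sample against every sub-dictionary, reusing the sparse_code sketch above; again our illustration, not the released code:

    import numpy as np

    def recon_errors(x, dictionaries, n_nonzero=5):
        # e(x) = [e_1(x), ..., e_K(x)]: reconstruction error of x under each D(k).
        errs = []
        for D in dictionaries:
            alpha = sparse_code(x, D, n_nonzero)
            errs.append(np.linalg.norm(x - D @ alpha))
        return np.array(errs)  # assign x to k* = argmin of this vector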
69 Label Term. In the dictionary learning based on multiple data attributes, the attribute labels within a sub-dictionary are encouraged to be consistent, as suggested by [5, 6]. [sent-230, score-0.791]
70 Denote by N^k_{i,j} the number of labels with type j of attribute ai in dictionary k. [sent-231, score-0.823]
71 The label consistency can be evaluated by counting, for each attribute, the number of samples belonging to the maximal type, across all classes. [sent-232, score-0.324]
72 The label-consistency score (Eq. 12) is based on the normalized count N^k_{i,j∗}, where j∗ = arg max_j N^k_{i,j} is the label type for attribute i in dictionary k with the maximal number. [sent-236, score-0.834]
73 Let N̂^k_{i,j} be the number of labels with type j of attribute ai in dictionary k after adding one sample vq into this dictionary. [sent-239, score-1.28]
74 A training sample vq is assigned to dictionary k∗ if it satisfies: k∗ = arg max_k (1/r) Σ_{i=1..r} N̂^k_{i,j∗} / Σ_j N̂^k_{i,j}. [sent-240, score-1.004]
75 One can expect that, if all attributes of a sample fall in the categories with the maximal number, the function returns the value 1 (after normalization by r). [sent-248, score-0.257]
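Our hedged reading of the label term as a code sketch; the exact formula did not survive extraction, so the per-attribute majority fraction below is an assumption consistent with the normalization-by-r remark above:

    def label_score(type_counts):
        # type_counts: for one dictionary k, a list over the r attributes, each a
        # dict mapping type j -> N^k_{i,j} after tentatively adding sample vq.
        # Returns 1 when every attribute of vq falls in its majority type.
        fracs = [max(c.values()) / sum(c.values()) for c in type_counts]
        return sum(fracs) / len(fracs)  # normalization by r; pick k* = argmax over k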
76 Optimization of MADL. The objective function of multi-attributed dictionary learning combines the compact term, the reconstruction term, and the label term. [sent-251, score-0.822]
77 A K-Medoids clustering method [7] is exploited to find the solution iteratively: the most centrally located data sample in a cluster is selected as a centroid according to the learned distance matrix. [sent-253, score-0.223]
78 Distance between a pair of samples: We can note that both the compact term and the label term range from 0 to 1, whereas the reconstruction term does not. [sent-263, score-0.314]
79 We define e(vi) = [e1(vi), . . . , eK(vi)], where ek(vi) is the reconstruction error of sparse coding vi using dictionary D(k). [sent-268, score-0.704]
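One plausible way to put the reconstruction term on the same [0, 1] scale as the other two terms is a min-max rescaling of e(vi); the paper's exact normalization is not recoverable here, so treat this as an assumption:

    import numpy as np

    def normalize_recon(e):
        # e: vector [e_1(v), ..., e_K(v)] of reconstruction errors for one sample.
        e = np.asarray(e, dtype=float)
        span = e.max() - e.min()
        return (e - e.min()) / span if span > 0 else np.zeros_like(e)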
80 Cluster update: After new centroids are decided for all clusters, data points are assigned to new clusters according to their nearest centroids, based on the summed distance over the three terms. [sent-274, score-0.253]
81 Algorithm (steps 2-12): repeat: for each data point vq, calculate the compact and label terms by Eq. [sent-286, score-0.568]
82 13; solve the sparse coefficients for vq using D(1), . . . , D(K); [sent-288, score-0.516]
83 calculate drecon(vq, ¯v(k)) using e(vq); compute the distance of vq to each centroid ¯v(k); switch vq to a new cluster if the distance decreases; end for; update the centroid ¯v(k) of each D(k), k = 1, . . . , K. [sent-291, score-1.158]
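Putting the three terms together, the iterative procedure reads as a K-Medoids loop over the learned distance; a sketch under the assumption that recon_dist and label_dist return the normalized per-dictionary distances for a sample (both names are ours):

    import numpy as np

    def madl_kmedoids(dist, recon_dist, label_dist, K, n_iter=20, seed=0):
        # dist: (n, n) learned compact distance; recon_dist(q, k) and label_dist(q, k)
        # give the normalized reconstruction/label distances of sample q w.r.t. dictionary k.
        rng = np.random.default_rng(seed)
        n = dist.shape[0]
        medoids = rng.choice(n, size=K, replace=False)
        assign = np.zeros(n, dtype=int)
        for _ in range(n_iter):
            for q in range(n):  # switch vq to a new cluster if the distance decreases
                total = [dist[q, medoids[k]] + recon_dist(q, k) + label_dist(q, k)
                         for k in range(K)]
                assign[q] = int(np.argmin(total))
            for k in range(K):  # centroid update: most central member under dist
                members = np.flatnonzero(assign == k)
                if members.size:
                    sub = dist[np.ix_(members, members)]
                    medoids[k] = members[int(np.argmin(sub.sum(axis=1)))]
        return assign, medoids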
84 The class label of a test sample is decided by majority vote over the labels of the atoms with non-zero coefficients, using the dictionary with the minimal reconstruction error. [sent-305, score-0.84]
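The test-time rule, as a sketch reusing recon_errors and sparse_code from above (atom_labels is our hypothetical per-dictionary list of class labels attached to atoms):

    import numpy as np

    def classify(x, dictionaries, atom_labels, n_nonzero=5):
        # Pick the sub-dictionary with minimal reconstruction error, then
        # majority-vote the class labels of the atoms with non-zero coefficients.
        k = int(np.argmin(recon_errors(x, dictionaries, n_nonzero)))
        alpha = sparse_code(x, dictionaries[k], n_nonzero)
        votes = [atom_labels[k][j] for j in np.flatnonzero(alpha)]
        return max(set(votes), key=votes.count)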
85 We compared our results with K-Means, SRC [18] and other dictionary learning algorithms: SPAMS [11] and FDDL. [sent-312, score-0.55]
86 Among the dictionary learning methods, SPAMS learns the dictionary by matrix factorization in an online learning manner. [sent-340, score-1.1]
87 FDDL adopts the Fisher discrimination criterion in dictionary learning and also learns class-specific dictionaries. [sent-341, score-0.587]
88 We also compare our method with LCSVD, which uses class labels (a single attribute only) in its formulation to learn dictionaries. [sent-342, score-0.265]
89 Two attributes are exploited in this dataset: action and angle. [sent-356, score-0.299]
90 In the attribute of action, there are eleven types of actions. [sent-357, score-0.328]
91 So, there are 100 types in the attribute of identity. [sent-381, score-0.25]
92 We can note that pure clustering for dictionary learning based only on the data distance gave the worst results among all the methods. [sent-413, score-0.638]
93 After the termination of our optimization process, we found that some classes might have only 2 to 3 samples as their dictionary atoms. [sent-422, score-0.57]
94 We did not constrain the number of dictionary atoms; we will address this issue in our future work. [sent-424, score-0.511]
95 The recognition accuracies of the proposed MADL and other dictionary learning methods are listed in Table 4. [sent-459, score-0.585]
96 Conclusion. We presented a novel multi-attributed dictionary learning algorithm for sparse coding in this paper. [sent-464, score-0.648]
97 In order to take both data and the associated multiple attributes into consideration, we first proposed a joint distance matrix. [sent-465, score-0.251]
98 Experimental results have shown improved performance by using the proposed algorithm over the previous dictionary learning methods through the action classification and face recognition experiments. [sent-467, score-0.708]
99 Learning a discriminative dictionary for sparse coding via label consistent K-SVD. [sent-501, score-0.678]
100 Classification and clustering via dictionary learning with structured incoherence and shared features. [sent-560, score-0.575]
wordName wordTfidf (topN-words)
[('dictionary', 0.511), ('vq', 0.421), ('madl', 0.243), ('attribute', 0.217), ('vertices', 0.19), ('attributes', 0.188), ('vertex', 0.153), ('atoms', 0.145), ('ixmas', 0.13), ('lcsvd', 0.11), ('transition', 0.107), ('vl', 0.1), ('dictionaries', 0.098), ('spams', 0.097), ('reconstruction', 0.095), ('smile', 0.092), ('pie', 0.09), ('action', 0.085), ('fddl', 0.078), ('eleven', 0.078), ('discrimination', 0.076), ('src', 0.074), ('face', 0.073), ('centroid', 0.073), ('expression', 0.073), ('cmu', 0.069), ('node', 0.068), ('ar', 0.068), ('compact', 0.067), ('centroids', 0.067), ('ctionbydat', 0.066), ('dictonarysel', 0.066), ('ectiv', 0.066), ('sparse', 0.065), ('distance', 0.063), ('mhi', 0.059), ('cvx', 0.059), ('expressions', 0.054), ('unified', 0.053), ('vi', 0.052), ('facial', 0.049), ('glasses', 0.048), ('submodular', 0.048), ('probability', 0.047), ('lighting', 0.047), ('totally', 0.047), ('scarf', 0.045), ('aini', 0.044), ('categorydependent', 0.044), ('drecon', 0.044), ('vjai', 0.044), ('label', 0.044), ('reconstructive', 0.042), ('connectivity', 0.042), ('graph', 0.04), ('ai', 0.039), ('atom', 0.039), ('learning', 0.039), ('va', 0.039), ('paths', 0.038), ('training', 0.036), ('term', 0.036), ('angry', 0.036), ('sample', 0.036), ('accuracies', 0.035), ('actions', 0.035), ('mairal', 0.034), ('database', 0.034), ('coding', 0.033), ('ea', 0.033), ('types', 0.033), ('closeness', 0.032), ('type', 0.032), ('samples', 0.032), ('maximal', 0.03), ('calculate', 0.03), ('objective', 0.03), ('clusters', 0.03), ('angles', 0.03), ('coefficients', 0.03), ('shares', 0.029), ('person', 0.028), ('terminates', 0.027), ('walk', 0.027), ('pose', 0.027), ('augmented', 0.027), ('termination', 0.027), ('selection', 0.027), ('exploited', 0.026), ('splits', 0.026), ('vt', 0.026), ('lai', 0.026), ('decided', 0.026), ('capability', 0.025), ('clustering', 0.025), ('discriminative', 0.025), ('illumination', 0.025), ('june', 0.024), ('class', 0.024), ('labels', 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
Author: Chen-Kuo Chiang, Te-Feng Su, Chih Yen, Shang-Hong Lai
Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. Then, an objective function is presented to learn categorydependent dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with correct dictionary) and label-consistent (encouraging the labels of dictionary atoms to be similar). We have demonstrated our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm.
2 0.43417934 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
3 0.42517713 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization ... make the learning tasks easier to deal with and reduce the computational cost. For example, in image tagging, instead of using the raw pixel-wise features, semi-local or patch-based features, such as SIFT and geometric blur, are usually more desirable to achieve better performance. In practice, finding a set of compact features bases, also referred to as dictionary, with enhanced representative and discriminative power, plays a significant role in building a successful computer vision system. In this paper, we explore this important problem by proposing a novel formulation and its solution for learning Semi-Supervised Robust Dictionary (SSRD), where we examine the challenges in dictionary learning, and seek opportunities to overcome them and improve the dictionary qualities. 1.1. Challenges in Dictionary Learning ... to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method.
4 0.32239234 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
Author: Jiajia Luo, Wei Wang, Hairong Qi
Abstract: Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions, may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. Especially, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state of the art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.
5 0.30820233 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
Author: Jingjing Zheng, Zhuolin Jiang
Abstract: We present an approach to jointly learn a set of view-specific dictionaries and a common dictionary for cross-view action recognition. The set of view-specific dictionaries is learned for specific views while the common dictionary is shared across different views. Our approach represents videos in each view using both the corresponding view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from different views of the same action to have similar sparse representations. In this way, we can align view-specific features in the sparse feature spaces spanned by the view-specific dictionary set and transfer the view-shared features in the sparse feature space spanned by the common dictionary. Meanwhile, the incoherence between the common dictionary and the view-specific dictionary set enables us to exploit the discrimination information encoded in view-specific features and view-shared features separately. In addition, the learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labels exist in the target view. Extensive experiments using the multi-view IXMAS dataset demonstrate that our approach outperforms many recent approaches for cross-view action recognition.
6 0.28880337 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
7 0.28721523 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
8 0.28402883 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
9 0.26379973 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
10 0.2423629 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
11 0.22173204 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
12 0.21262643 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
13 0.1972187 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors
14 0.194593 52 iccv-2013-Attribute Adaptation for Personalized Image Search
15 0.19077711 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
16 0.17811944 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
17 0.17019834 96 iccv-2013-Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition
18 0.16900796 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
19 0.16801246 399 iccv-2013-Spoken Attributes: Mixing Binary and Relative Attributes to Say the Right Thing
20 0.15053941 53 iccv-2013-Attribute Dominance: What Pops Out?
topicId topicWeight
[(0, 0.249), (1, 0.279), (2, -0.14), (3, -0.059), (4, -0.305), (5, -0.237), (6, -0.199), (7, -0.199), (8, 0.052), (9, 0.093), (10, -0.03), (11, 0.167), (12, 0.059), (13, 0.103), (14, -0.091), (15, 0.106), (16, 0.017), (17, 0.004), (18, -0.019), (19, 0.017), (20, -0.006), (21, -0.041), (22, 0.072), (23, 0.047), (24, -0.003), (25, -0.068), (26, -0.076), (27, -0.042), (28, 0.051), (29, -0.003), (30, -0.02), (31, 0.057), (32, -0.034), (33, -0.08), (34, -0.024), (35, 0.007), (36, 0.048), (37, -0.026), (38, -0.009), (39, 0.024), (40, 0.021), (41, 0.031), (42, 0.015), (43, 0.043), (44, -0.003), (45, -0.027), (46, 0.001), (47, -0.041), (48, 0.051), (49, -0.009)]
simIndex simValue paperId paperTitle
same-paper 1 0.968633 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
Author: Chen-Kuo Chiang, Te-Feng Su, Chih Yen, Shang-Hong Lai
Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. Then, an objective function is presented to learn categorydependent dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with correct dictionary) and label-consistent (encouraging the labels of dictionary atoms to be similar). We have demonstrated our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm.
2 0.84079885 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
3 0.82693464 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization ... make the learning tasks easier to deal with and reduce the computational cost. For example, in image tagging, instead of using the raw pixel-wise features, semi-local or patch-based features, such as SIFT and geometric blur, are usually more desirable to achieve better performance. In practice, finding a set of compact features bases, also referred to as dictionary, with enhanced representative and discriminative power, plays a significant role in building a successful computer vision system. In this paper, we explore this important problem by proposing a novel formulation and its solution for learning Semi-Supervised Robust Dictionary (SSRD), where we examine the challenges in dictionary learning, and seek opportunities to overcome them and improve the dictionary qualities. 1.1. Challenges in Dictionary Learning ... to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method.
4 0.77483708 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in testing phase, how to handle corruption in training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the “partial” dictionary learning approach, updating only part of the prototypes (or informative codewords) with remaining (or noisy codewords) fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
5 0.76788211 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
Author: Zhaowen Wang, Jianchao Yang, Nasser Nasrabadi, Thomas Huang
Abstract: Sparse Representation-based Classification (SRC) is a powerful tool in distinguishing signal categories which lie on different subspaces. Despite its wide application to visual recognition tasks, current understanding of SRC is solely based on a reconstructive perspective, which neither offers any guarantee on its classification performance nor provides any insight on how to design a discriminative dictionary for SRC. In this paper, we present a novel perspective towards SRC and interpret it as a margin classifier. The decision boundary and margin of SRC are analyzed in local regions where the support of sparse code is stable. Based on the derived margin, we propose a hinge loss function as the gauge for the classification performance of SRC. A stochastic gradient descent algorithm is implemented to maximize the margin of SRC and obtain more discriminative dictionaries. Experiments validate the effectiveness of the proposed approach in predicting classification performance and improving dictionary quality over reconstructive ones. Classification results competitive with other state-ofthe-art sparse coding methods are reported on several data sets.
6 0.7109918 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
7 0.70021015 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
8 0.69693393 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
9 0.6588971 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
10 0.65740967 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
11 0.65432847 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
12 0.6198414 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
14 0.48998168 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
15 0.48527369 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
16 0.48280075 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
17 0.47495335 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
18 0.47203392 31 iccv-2013-A Unified Probabilistic Approach Modeling Relationships between Attributes and Objects
19 0.4714663 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
20 0.46815264 19 iccv-2013-A Learning-Based Approach to Reduce JPEG Artifacts in Image Matting
topicId topicWeight
[(2, 0.085), (7, 0.011), (13, 0.02), (26, 0.107), (31, 0.028), (42, 0.127), (48, 0.011), (64, 0.053), (73, 0.02), (78, 0.263), (89, 0.151), (97, 0.02)]
simIndex simValue paperId paperTitle
1 0.86850339 69 iccv-2013-Capturing Global Semantic Relationships for Facial Action Unit Recognition
Author: Ziheng Wang, Yongqiang Li, Shangfei Wang, Qiang Ji
Abstract: In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines the bottom-level image features and the top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike methods that can only capture local pair-wise AU dependencies, our model is developed upon the restricted Boltzmann machine and therefore can exploit the global relationships among AUs. 2) Although AU relationships are influenced by many related factors such as facial expressions, these factors are generally ignored by the current methods. Our model, however, can successfully capture them to more accurately characterize the AU relationships. Efficient learning and inference algorithms of the proposed model are also developed. Experimental results on benchmark databases demonstrate the effectiveness of the proposed approach in modelling complex AU relationships as well as its superior AU recognition performance over existing approaches.
2 0.84668517 290 iccv-2013-New Graph Structured Sparsity Model for Multi-label Image Annotations
Author: Xiao Cai, Feiping Nie, Weidong Cai, Heng Huang
Abstract: In multi-label image annotations, because each image is associated to multiple categories, the semantic terms (label classes) are not mutually exclusive. Previous research showed that such label correlations can largely boost the annotation accuracy. However, all existing methods only directly apply the label correlation matrix to enhance the label inference and assignment without further learning the structural information among classes. In this paper, we model the label correlations using the relational graph, and propose a novel graph structured sparse learning model to incorporate the topological constraints of relation graph in multi-label classifications. As a result, our new method will capture and utilize the hidden class structures in relational graph to improve the annotation results. In proposed objective, a large number of structured sparsity-inducing norms are utilized, thus the optimization becomes difficult. To solve this problem, we derive an efficient optimization algorithm with proved convergence. We perform extensive experiments on six multi-label image annotation benchmark data sets. In all empirical results, our new method shows better annotation results than the state-of-the-art approaches.
3 0.8466258 344 iccv-2013-Recognising Human-Object Interaction via Exemplar Based Modelling
Author: Jian-Fang Hu, Wei-Shi Zheng, Jianhuang Lai, Shaogang Gong, Tao Xiang
Abstract: Human action can be recognised from a single still image by modelling Human-object interaction (HOI), which infers the mutual spatial structure information between human and object as well as their appearance. Existing approaches rely heavily on accurate detection of human and object, and estimation of human pose. They are thus sensitive to large variations of human poses, occlusion and unsatisfactory detection of small size objects. To overcome this limitation, a novel exemplar based approach is proposed in this work. Our approach learns a set of spatial pose-object interaction exemplars, which are density functions describing how a person is interacting with a manipulated object for different activities spatially in a probabilistic way. A representation based on our HOI exemplar thus has great potential for being robust to the errors in human/object detection and pose estimation. A new framework consists of a proposed exemplar based HOI descriptor and an activity specific matching model that learns the parameters is formulated for robust human activity recog- nition. Experiments on two benchmark activity datasets demonstrate that the proposed approach obtains state-ofthe-art performance.
4 0.82177013 252 iccv-2013-Line Assisted Light Field Triangulation and Stereo Matching
Author: Zhan Yu, Xinqing Guo, Haibing Lin, Andrew Lumsdaine, Jingyi Yu
Abstract: Light fields are image-based representations that use densely sampled rays as a scene description. In this paper, we explore geometric structures of 3D lines in ray space for improving light field triangulation and stereo matching. The triangulation problem aims to fill in the ray space with continuous and non-overlapping simplices anchored at sampled points (rays). Such a triangulation provides a piecewise-linear interpolant useful for light field superresolution. We show that the light field space is largely bilinear due to 3D line segments in the scene, and direct triangulation of these bilinear subspaces leads to large errors. We instead present a simple but effective algorithm to first map bilinear subspaces to line constraints and then apply Constrained Delaunay Triangulation (CDT). Based on our analysis, we further develop a novel line-assisted graphcut (LAGC) algorithm that effectively encodes 3D line constraints into light field stereo matching. Experiments on synthetic and real data show that both our triangulation and LAGC algorithms outperform state-of-the-art solutions in accuracy and visual quality.
same-paper 5 0.80935729 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
Author: Chen-Kuo Chiang, Te-Feng Su, Chih Yen, Shang-Hong Lai
Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. Then, an objective function is presented to learn categorydependent dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with correct dictionary) and label-consistent (encouraging the labels of dictionary atoms to be similar). We have demonstrated our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm.
6 0.78149378 175 iccv-2013-From Actemes to Action: A Strongly-Supervised Representation for Detailed Action Understanding
7 0.73373437 155 iccv-2013-Facial Action Unit Event Detection by Cascade of Tasks
8 0.72373879 150 iccv-2013-Exemplar Cut
9 0.70077276 194 iccv-2013-Heterogeneous Image Features Integration via Multi-modal Semi-supervised Learning Model
10 0.69431949 179 iccv-2013-From Subcategories to Visual Composites: A Multi-level Framework for Object Detection
11 0.69048715 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
12 0.68556082 277 iccv-2013-Multi-channel Correlation Filters
13 0.68286121 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
14 0.68255091 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
15 0.68185061 149 iccv-2013-Exemplar-Based Graph Matching for Robust Facial Landmark Localization
16 0.68080086 59 iccv-2013-Bayesian Joint Topic Modelling for Weakly Supervised Object Localisation
17 0.68057799 268 iccv-2013-Modeling 4D Human-Object Interactions for Event and Object Recognition
18 0.67980748 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
19 0.67578179 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
20 0.67564499 43 iccv-2013-Active Visual Recognition with Expertise Estimation in Crowdsourcing