iccv iccv2013 iccv2013-354 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, updating only part of the prototypes (or informative codewords) with the remaining (or noisy) codewords fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
Reference: text
sentIndex sentText sentNum sentScore
1 Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. [sent-4, score-0.599]
2 In practice, both training and testing data may be corrupted and contain noises and outliers. [sent-5, score-0.377]
3 Although recent studies attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase still remains a very difficult problem. [sent-6, score-0.296]
4 In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. [sent-7, score-0.742]
5 We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. [sent-8, score-0.889]
6 In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, updating only part of the prototypes (or informative codewords) with the remaining (or noisy) codewords fixed. [sent-9, score-0.596]
7 Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach. [sent-10, score-0.629]
8 Introduction. With the development of harmonic analysis [4, 3], sparse models have received a lot of attention in recent years. [sent-12, score-0.169]
9 The universal sparsity in real applications enables us to achieve good performance in many areas such as compressive sensing [3], image recovery [6] and classification [29]. [sent-13, score-0.118]
10 Specifically, learning a sparse prototype model (or "dictionary") [15, 21, 6] to represent the training data set is often applied as a first step. [sent-15, score-0.211]
11 The advantages of dictionary learning over pre-defined fixed bases, such as DCT and FFT, have been shown in many applications [8, 23, 6]. [sent-16, score-0.472]
12 Recent studies [26] also provided theoretical support for exact recovery of all codewords under the condition of sufficient sparsity. [sent-17, score-0.338]
13 Most sparse coding methods [27, 15, 6, 17] make a basic assumption that the observed signals consist of a sparse linear combination of codewords plus dense Gaussian noises of small variance. [sent-21, score-0.952]
14 However, though working well generally, this assumption does not hold in the case of large corruptions and outliers, which are common in practice. [sent-22, score-0.145]
15 For example, in face recognition, a sample face image can be considered as corrupted if the person accidentally wears sunglasses. [sent-23, score-0.287]
16 As shown in [29], if the training data is clean, corrupted testing data can be handled by using sparse residual. [sent-24, score-0.314]
17 This robust method demonstrated very encouraging face recognition results [29, 31, 12]. [sent-25, score-0.143]
18 In practice, it may be inevitable to include corrupted samples and outliers in addition to dense Gaussian noises in the training data. [sent-26, score-0.497]
19 For example, if xkA is person A accidentally wearing sunglasses, then it can be very ambiguous to recognize a corrupted input. [sent-40, score-0.241]
20 It is clear that noisy and corrupted training data will largely result in a low-quality dictionary if learned by existing methods. [sent-43, score-0.61]
21 As the data noise comes from multiple sources with different characteristics, we call this issue the residual modality problem. [sent-44, score-0.425]
22 This also emerges in many other vision tasks, such as removing salt and pepper noises, and handling artificially added texts and other outliers in images. [sent-45, score-0.261]
23 In order to address this issue, we propose a robust dictionary learning approach based on the decomposition of the reconstructive residual into two modalities: one for dense small Gaussian noises and the other for large sparse outliers. [sent-46, score-1.473]
24 We can have different residual penalties for different modalities. [sent-47, score-0.389]
25 This paper provides a coordinate descent solution for robust dictionary learning, an online acceleration method, and its convergence property. [sent-48, score-0.621]
26 This new approach allows us to learn a robust dictionary and identify outlier training data. [sent-49, score-0.553]
27 In addition, our further study reveals a very interesting connection between this source decomposition approach and the "partial dictionary update" approach. [sent-50, score-0.597]
28 This residual decomposition method is an explicit way to handle corrupted data in dictionary learning. [sent-51, score-1.002]
29 Moreover, we also propose an alternative that uses robust functions on the reconstructive residual, which is an implicit means of handling corrupted data. [sent-52, score-0.318]
30 Experiments on a synthetic dataset, texture synthesis, and image denoising show that our model is able to achieve quite satisfactory results without using many heuristics. [sent-54, score-0.208]
31 We aim to learn a dictionary D_{n×m} = {d1, d2, . . . , dm}. [sent-60, score-0.43]
32 The framework of sparse dictionary learning was first proposed by Olshausen and Field [21] based on the human perceptual system. [sent-74, score-0.641]
33 In the equation, the first term measures the residual (typically φ(·) is a quadratic penalty). [sent-78, score-0.344]
34 In sparse coding, an L1-norm is always applied for ψ [15, 28, 17]. [sent-83, score-0.169]
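To make the model in Eqn (1) concrete, here is a minimal sparse coding sketch with a quadratic residual penalty φ and an L1-norm ψ, solved by ISTA. This is a generic numpy illustration under our own naming and step-size choice, not the paper's implementation:

    import numpy as np

    def soft_threshold(v, t):
        # Proximal operator of t*||.||_1 (elementwise shrinkage).
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def sparse_code_ista(x, D, lam, n_iter=200):
        # Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.
        a = np.zeros(D.shape[1])
        L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
        for _ in range(n_iter):
            grad = D.T @ (D @ a - x)
            a = soft_threshold(a - grad / L, lam / L)
        return a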
35 Recently, a lot of work has been done to improve the traditional dictionary learning model in Eqn (1) for specific tasks. [sent-84, score-0.502]
36 Mairal [17] uses ||Λ(X − Dα)||2F to penalize different dimensions differently with a diagonal matrix Λ; Zhao [32] and Lu [16] assume that the residual observes a Laplacian distribution and use a pure L1-norm. [sent-92, score-0.344]
37 Zhou [33] studies the influence of residual modality parameter settings and suggests that a good estimation of noise level can enhance the performance of sparse coding. [sent-93, score-0.565]
38 In contrast to these methods, we propose to decompose the residual into two sources rather than one Gaussian or Laplacian. [sent-94, score-0.373]
39 The empirical residual distribution and its Gaussian and Laplacian fits are shown in blue, red, and green, respectively. [sent-104, score-0.379]
40 We can see clearly that the true residual has a smoother p.d.f. around zero than a Laplacian and heavier tails than a Gaussian. [sent-105, score-0.383]
41 Sparse/Non-sparse Residual Decomposition. Rather than fitting one universal Gaussian or Laplacian model, we assume that the residual Res = X − Dα contains two components: [sent-116, score-0.387]
42 Res(x) = Ξ(x) for x ∈ Ω and Res(x) = N(x) for x ∈ D \ Ω (2), where Ω denotes the corrupted region. [sent-118, score-0.145]
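For fixed D and α, the split in Eqn (2) has a closed form when the dense part N carries a squared (Gaussian) penalty and the sparse part Ξ an L1 penalty: Ξ is an elementwise soft-thresholding of the residual. A minimal numpy sketch under that assumption (the λ/2 threshold follows the standard L1 proximal derivation, not necessarily the paper's exact constants):

    import numpy as np

    def split_residual(X, D, A, lam):
        # Res = X - D A; Xi minimizes ||Res - Xi||_F^2 + lam*||Xi||_1,
        # which is solved by elementwise soft-thresholding of Res.
        Res = X - D @ A
        Xi = np.sign(Res) * np.maximum(np.abs(Res) - lam / 2.0, 0.0)
        N = Res - Xi  # dense, small, Gaussian-like component
        return N, Xi

Residual entries larger than λ/2 in magnitude are absorbed into Ξ, which keeps large outliers out of the Gaussian term.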
43 A simple illustration of our idea is given in Figure-2: we propose to learn a set of robust codewords {d1, d2} to sparsely represent data points (diamonds and triangles) and ignore the outlier (the red diamond corrupted in the z coordinate). [sent-120, score-0.566]
44 A typical L2-norm residual penalty only obtains a compromised result {d′1, d′2}. [sent-121, score-0.389]
45 As discussed above, we seek to estimate a dictionary, sparse coefficients, and corruptions by minimizing the number of nonzero elements of α and Ξ, as well as the negative log-likelihood of the Gaussian residual N. (Figure-2 caption: data points denoted by triangles and diamonds, with one outlier marked in red.) [sent-124, score-0.806]
46 Ideally, two green codewords d1, d2 are desired, while the outlier brings d1 to d′1. [sent-125, score-0.314]
47 Gaussian and Laplacian noises have been carefully studied and the analytical forms of their p.d.f.s are well known. [sent-140, score-0.232]
48 The informative codewords DInfo = {d1, . . . , dm} are updated, while the noisy codewords with natural basis DNoise = {e1, . . . , en} are fixed. [sent-149, score-0.298]
49 The non-convexity of the dictionary learning method in Eqn (3) requires a good initialization; fixing DNoise reasonably avoids local minima and enables us to obtain a better numerical solution. [sent-154, score-0.507]
50 (2) Fixing the sparse coefficients Ξ and α, we update D: D̂ = argmin_D ||X − Ξ − Dα||2F, s.t. D ∈ C, the set of dictionaries with bounded codeword norms. [sent-160, score-0.248]
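A sketch of this dictionary update: a least-squares fit to the outlier-cleaned data X − Ξ, followed by projecting every codeword onto the unit ball. This is a common approximation for illustration; block coordinate descent over codewords, as in [17], matches the surrogate-function analysis more closely:

    import numpy as np

    def update_dictionary(X, Xi, A):
        # Solve min_D ||(X - Xi) - D A||_F^2, then project each codeword
        # onto {||d_j|| <= 1}, a standard choice for the constraint set C.
        D = (X - Xi) @ A.T @ np.linalg.pinv(A @ A.T)
        col_norms = np.maximum(np.linalg.norm(D, axis=0), 1.0)
        return D / col_norms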
51 Three reasonable assumptions have been made in [17]: (A) compact support; (B) strictly convex quadratic surrogate functions; (C) unique sparse coding solution. [sent-174, score-0.362]
52 We keep (A)(B) unchanged and modify (C) slightly as: (C') Unique Sparse Solution: the informative codewords {d1, d2, . . . , dm} admit a unique sparse coding solution. [sent-175, score-0.302]
53 Accordingly, with f(D) strictly convex and the sparse solution αi well defined, we have Proposition 1 (Convergence of Dt): under assumptions (A)(B)(C'), the distance between the informative Dt and the set of stationary points converges almost surely to 0 as t → ∞. [sent-191, score-0.247]
54 Dictionary Learning by Robust Penalty. The above residual decomposition approach models the residual explicitly. [sent-195, score-0.771]
55 In this paper, we also propose an alternative that handles the residual implicitly. [sent-196, score-0.344]
56 The p.d.f. of the residual should: (1) be smoother around Res = 0 than a Laplacian; (2) have heavier tails than a Gaussian. [sent-201, score-0.478]
57 Accordingly, we propose to take outliers into consideration implicitly, by replacing the quadratic residual term with a robust function φ(X − Dα). [sent-202, score-0.12]
58 In robust statistics [11], various forms of robust functions have been proposed, such as the Charbonnier penalty φ(s) = √(s² + ε²). [sent-211, score-0.189]
59 If we further regard the error source decomposition model as φ(s) = inf_ξ (s − ξ)² + λ|ξ|, then the shape of φ(s) is very similar to the shape of the robust function. [sent-213, score-0.189]
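The two penalties can be compared directly: the Charbonnier function is smooth everywhere, while φ(s) = inf_ξ (s − ξ)² + λ|ξ| evaluates, after minimizing out ξ by soft-thresholding, to a Huber-like function that is quadratic for |s| ≤ λ/2 and linear beyond. A small sketch under our own naming:

    import numpy as np

    def charbonnier(s, eps=1e-3):
        # phi(s) = sqrt(s^2 + eps^2): smooth near 0, linear tails.
        return np.sqrt(s * s + eps * eps)

    def decomposition_penalty(s, lam):
        # phi(s) = inf_xi (s - xi)^2 + lam*|xi|; the minimizing xi is a
        # soft-thresholding of s, so phi(s) = s^2 for |s| <= lam/2 and
        # lam*|s| - lam^2/4 beyond (a Huber-like robust function).
        xi = np.sign(s) * np.maximum(np.abs(s) - lam / 2.0, 0.0)
        return (s - xi) ** 2 + lam * np.abs(xi)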
60 Similar online optimization and convergence analysis can also be extended to the robust influence function models. [sent-216, score-0.157]
61 We apply a stochastic gradient method for the dictionary update as: Dt = ΠC(Dt−1 − (ρ/t) ∇D ℓ(xi, Dt−1)), where ΠC is the projection onto the constraint set C. [sent-217, score-0.52]
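A minimal sketch of one such projected stochastic step, for the decomposed loss ℓ(x, D) = ||x − Dα − ξ||² with α and ξ held fixed from the sparse coding stage; the ρ/t schedule and all names are our illustrative assumptions:

    import numpy as np

    def project_C(D):
        # Projection onto C = {D : ||d_j|| <= 1 for every codeword d_j}.
        return D / np.maximum(np.linalg.norm(D, axis=0), 1.0)

    def sgd_dictionary_step(D, x, a, xi, rho, t):
        # Gradient of ||x - D a - xi||^2 w.r.t. D is -2 (x - D a - xi) a^T.
        grad = -2.0 * np.outer(x - D @ a - xi, a)
        return project_C(D - (rho / t) * grad)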
62 Generally speaking, both the error source decomposition method and the robust penalty method perform well, but the former outperforms the latter in speed. [sent-227, score-0.234]
63 To make the comparison "fair", we shift the phase transition line of the Gaussian prior to the left (green), since more bases are implicitly used in the other two methods. [sent-236, score-0.187]
64 We generate a data matrix with sparse coefficient vectors; n1 is a dense noise matrix with Gaussian noises ∼ N(0, σ1) of small variance, and n2 is a sparse corruption matrix with large Gaussian noise for the nonzero entries. [sent-265, score-0.267]
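A sketch of how such synthetic data can be generated; every size, sparsity level, and variance below is an illustrative placeholder, not the paper's setting:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k = 20, 30, 1000          # signal dim, codewords, samples (assumed)
    sigma1, sigma2, p = 0.05, 1.0, 0.01

    D_true = rng.standard_normal((n, m))
    D_true /= np.linalg.norm(D_true, axis=0)               # unit-norm codewords
    A = rng.standard_normal((m, k)) * (rng.random((m, k)) < 0.1)  # sparse codes
    n1 = sigma1 * rng.standard_normal((n, k))              # small dense noise
    mask = rng.random((n, k)) < p                          # sparse outlier support
    n2 = sigma2 * rng.standard_normal((n, k)) * mask       # large sparse corruption
    X = D_true @ A + n1 + n2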
65 The noise parameters are (σ1, σ2); we train an over-complete dictionary D of size m′. [sent-272, score-0.43]
66 In Figure-3, we compare the performance of traditional dictionary learning with a Gaussian prior [15] and with a Laplacian prior [32] against our model. [sent-284, score-0.502]
67 We use m′ = 60 potential codewords for a true dictionary of size m = 30, so the ratio is m′/m = 2. [sent-288, score-0.693]
68 The vertical axis is the variance of the sparse noises n2. [sent-289, score-0.401]
69 We can see clearly that our robust model (blue) has more tolerance to mixed heavy-tail noises than both [15] (green and red, with/without self-taught bases) and [32] (purple lines). [sent-291, score-0.339]
70 As shown in Figure-4, we train a dictionary D on the SparseNet image dataset [21] with small Gaussian noises (5dB) and sparse large outliers (red characters) added. [sent-299, score-0.951]
71 A visual comparison of traditional dictionary learning [15] and our algorithm is shown in Figure-5(a)(b), respectively. [sent-301, score-0.472]
72 22% of our bases contain red patches, in comparison with 2. [sent-305, score-0.121]
73 Close scrutiny of Ξ coefficients reveals that a good initialization of DNoise absorbs the corruptions and keeps DInfo away from sparse red outliers. [sent-307, score-0.433]
74 We also compare with [2] and total-variation [24] on the denoising benchmark [7] with random sparse corruptions added. [sent-330, score-0.314]
75 Synthetic Gaussian noises of σ = 20 and sparse outliers of large σ are added. [sent-345, score-0.521]
76 Some denoised results are shown in Figure-6, from which we can see that the “dotted” salt and pepper corruptions are eliminated successfully. [sent-348, score-0.291]
77 Besides Gaussian noises with σ = {5, 10, 15}, we corrupt 1% of pixels with Gaussian noise of σ = 25. [sent-350, score-0.232]
78 We exploit the self-similarity of textures with outlier removal by integrating our model into image quilting [5]: (1) Robust Dictionary Learning: given a textured image, we first learn D: {D, α} = argmin_{D,α} ||X − Dα − Ξ||2F + λ||Ξ||L1, s.t. the sparsity constraint on α. [sent-359, score-0.146]
79 (2) Robust Patch Processing: for a new patch y to be added "agreeing" with the neighbors based on the criteria in [5], we decide whether it is also consistent with the learned codewords D by: f(y) = min_{α,ξ} ||y − Dα − ξ||² + λ|ξ|, s.t. the sparsity constraint on α. [sent-362, score-0.263]
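A sketch of this patch-consistency test: alternate between fitting the code and shrinking the per-patch outlier term, then report the objective value as f(y). For brevity it uses a plain least-squares fit for α instead of a sparsity-constrained solver, so treat it as an illustration of the alternation, not the paper's exact criterion:

    import numpy as np

    def patch_score(y, D, lam, n_iter=50):
        # f(y) = min_{a, xi} ||y - D a - xi||^2 + lam*||xi||_1;
        # a low score means the patch agrees with the learned codewords.
        a = np.zeros(D.shape[1])
        xi = np.zeros_like(y)
        D_pinv = np.linalg.pinv(D)
        for _ in range(n_iter):
            a = D_pinv @ (y - xi)  # fit the code (least squares, for brevity)
            r = y - D @ a
            xi = np.sign(r) * np.maximum(np.abs(r) - lam / 2.0, 0.0)  # shrink
        return np.sum((y - D @ a - xi) ** 2) + lam * np.sum(np.abs(xi))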
80 In Figure-7, we randomly add some outliers to original patches, and the synthesized textures are shown in Figure-8. [sent-368, score-0.221]
81 To remove the artificially added outliers (the black line), we eliminate some infrequent patterns in the input. [sent-377, score-0.154]
82 We have also carried out a complete evaluation on the CMU-NRT Database with sparse noises added. [sent-379, score-0.401]
83 We show a failure case in Figure-9: the internal patterns need to be more frequent than the outliers to be synthesized, and our algorithm sometimes achieves over-uniform textures during step (2). [sent-381, score-0.159]
84 Robust Discriminative Dictionary Learning. Finally, we propose to learn a robust dictionary for classification. [sent-384, score-0.502]
85 There has been some work on discriminative models [13, 18, 23], relying either on the reconstructive residual or on the discriminative ability of sparse coding coefficients. [sent-385, score-0.459]
86 We seek representations {Dα1, . . . , Dαk} for each class satisfying the following two conditions: (1) given xi ∈ cj, we have xi ≈ Dαi, dominated by its class sub-dictionary Dcj; (2) the within-class scatter is small, while the between-class scatter is large. [sent-403, score-0.296]
87 The between-class scatter is SB = Σi (mci − m)(mci − m)T, where mci and m are the means of Xci and X. [sent-419, score-0.157]
88 We apply the error source decomposition to the discriminative fidelity term as: r(X, D, α) = ||X − Dα − Ξ||2F + λ||Ξ||L1. [sent-420, score-0.152]
89 Then, we iteratively alternate between the sparse coding update for α∗ and the dictionary update for D. [sent-437, score-0.808]
90 We test our robust dictionary learning on the Extended Yale B benchmark [9], consisting of 2,414 frontal-face images from 38 individuals under different lighting conditions. [sent-439, score-0.544]
91 The comparison is shown in Table 3, which reveals that adding robustness can enhance the performance of discriminative dictionary learning. [sent-441, score-0.515]
92 Conclusion. In this work, we introduce a novel generalized residual separation approach in robust dictionary learning to handle corruptions and outliers in training data. [sent-443, score-1.153]
93 By exploiting the statistics of the reconstructive residual, we observe that it comes from two sources: a large sparse corruption component and a small dense Gaussian component. [sent-444, score-0.339]
94 Accordingly, we formulate a novel regularization to model the residual modality. [sent-445, score-0.344]
95 Image denoising via learned dictionaries and sparse representation. [sent-484, score-0.284]
96 Group sparse coding with a Laplacian scale mixture prior. [sent-497, score-0.457]
97 Learning a discriminative dictionary for sparse coding via label-consistent K-SVD. [sent-533, score-0.753]
98 Discriminative sparse image models for class-specific edge detection and image interpretation. [sent-577, score-0.169]
99 Sparse coding with an overcomplete basis set: A strategy employed by V1? [sent-589, score-0.119]
100 Classification and clustering via dictionary learning with structured incoherence and shared features. [sent-599, score-0.472]
wordName wordTfidf (topN-words)
[('dictionary', 0.43), ('residual', 0.344), ('codewords', 0.263), ('noises', 0.232), ('laplacian', 0.169), ('sparse', 0.169), ('mci', 0.157), ('corruptions', 0.145), ('corrupted', 0.145), ('eqn', 0.139), ('outliers', 0.12), ('coding', 0.119), ('reconstructive', 0.101), ('charbonnier', 0.094), ('dinfo', 0.094), ('dnoise', 0.094), ('dt', 0.09), ('denoising', 0.087), ('bases', 0.086), ('decomposition', 0.083), ('mairal', 0.081), ('res', 0.078), ('gaussian', 0.075), ('robust', 0.072), ('corruption', 0.069), ('wj', 0.069), ('xi', 0.067), ('dcj', 0.063), ('pepper', 0.063), ('xka', 0.063), ('codeword', 0.06), ('yale', 0.057), ('cj', 0.056), ('nou', 0.056), ('quilting', 0.056), ('accidentally', 0.056), ('sapiro', 0.055), ('phase', 0.054), ('scatter', 0.053), ('tit', 0.052), ('xdi', 0.052), ('modality', 0.052), ('outlier', 0.051), ('reveals', 0.05), ('olshausen', 0.049), ('tails', 0.049), ('transition', 0.047), ('clean', 0.047), ('heavier', 0.046), ('stochastic', 0.045), ('accordingly', 0.045), ('penalty', 0.045), ('update', 0.045), ('salt', 0.044), ('dirty', 0.044), ('satisfactory', 0.044), ('di', 0.044), ('universal', 0.043), ('readers', 0.043), ('dj', 0.043), ('face', 0.043), ('convergence', 0.043), ('online', 0.042), ('learning', 0.042), ('thr', 0.041), ('synthetic', 0.041), ('recognize', 0.04), ('ps', 0.039), ('strictly', 0.039), ('denoised', 0.039), ('smoother', 0.039), ('textures', 0.039), ('informative', 0.039), ('recovery', 0.039), ('synthesis', 0.038), ('contaminated', 0.037), ('texture', 0.036), ('bach', 0.036), ('sparsity', 0.036), ('discriminative', 0.035), ('red', 0.035), ('fixing', 0.035), ('noisy', 0.035), ('surrogate', 0.035), ('source', 0.034), ('triangles', 0.034), ('artificially', 0.034), ('acceleration', 0.034), ('coefficients', 0.034), ('add', 0.033), ('shrinkage', 0.033), ('traditional', 0.03), ('omit', 0.03), ('ponce', 0.029), ('sources', 0.029), ('sb', 0.029), ('synthesized', 0.029), ('nonzero', 0.029), ('encouraging', 0.028), ('dictionaries', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, updating only part of the prototypes (or informative codewords) with the remaining (or noisy) codewords fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
2 0.40774879 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
3 0.39561161 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method. (From the paper's introduction: semi-local or patch-based features, such as SIFT and geometric blur, usually make the learning tasks easier to deal with and reduce the computational cost compared with raw pixel-wise features, e.g., in image tagging; in practice, finding a set of compact feature bases, also referred to as a dictionary, with enhanced representative and discriminative power plays a significant role in building a successful computer vision system; the paper explores this problem by proposing a novel formulation and its solution for learning a Semi-Supervised Robust Dictionary (SSR-D), examining the challenges in dictionary learning and seeking opportunities to overcome them and improve the dictionary qualities.)
4 0.28880337 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
Author: Chen-Kuo Chiang, Te-Feng Su, Chih Yen, Shang-Hong Lai
Abstract: We present a multi-attributed dictionary learning algorithm for sparse coding. Considering training samples with multiple attributes, a new distance matrix is proposed by jointly incorporating data and attribute similarities. Then, an objective function is presented to learn category-dependent dictionaries that are compact (closeness of dictionary atoms based on data distance and attribute similarity), reconstructive (low reconstruction error with correct dictionary) and label-consistent (encouraging the labels of dictionary atoms to be similar). We have demonstrated our algorithm on action classification and face recognition tasks on several publicly available datasets. Experimental results with improved performance over previous dictionary learning methods are shown to validate the effectiveness of the proposed algorithm.
5 0.25884262 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
Author: Hans Lobel, René Vidal, Alvaro Soto
Abstract: Currently, Bag-of-Visual-Words (BoVW) and part-based methods are the most popular approaches for visual recognition. In both cases, a mid-level representation is built on top of low-level image descriptors and top-level classifiers use this mid-level representation to achieve visual recognition. While in current part-based approaches, mid- and top-level representations are usually jointly trained, this is not the usual case for BoVW schemes. A main reason for this is the complex data association problem related to the usual large dictionary size needed by BoVW approaches. As a further observation, typical solutions based on BoVW and part-based representations are usually limited to extensions of binary classification schemes, a strategy that ignores relevant correlations among classes. In this work we propose a novel hierarchical approach to visual recognition based on a BoVW scheme that jointly learns suitable mid- and top-level representations. Furthermore, using a max-margin learning framework, the proposed approach directly handles the multiclass case at both levels of abstraction. We test our proposed method using several popular benchmark datasets. As our main result, we demonstrate that, by coupling learning of mid- and top-level representations, the proposed approach fosters sharing of discriminative visual words among target classes, being able to achieve state-of-the-art recognition performance using far fewer visual words than previous approaches.
6 0.24678719 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
7 0.23019235 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
8 0.22487682 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
9 0.22044803 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
10 0.18949932 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
11 0.17644432 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
12 0.15531504 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
13 0.15131669 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
14 0.1509991 221 iccv-2013-Joint Inverted Indexing
15 0.14903443 96 iccv-2013-Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition
16 0.11859807 298 iccv-2013-Online Robust Non-negative Dictionary Learning for Visual Tracking
17 0.1161482 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
18 0.11499932 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
19 0.10755304 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
20 0.10514958 204 iccv-2013-Human Attribute Recognition by Rich Appearance Dictionary
topicId topicWeight
[(0, 0.226), (1, 0.119), (2, -0.134), (3, -0.03), (4, -0.397), (5, -0.142), (6, -0.188), (7, -0.041), (8, -0.062), (9, -0.034), (10, 0.026), (11, 0.058), (12, 0.016), (13, 0.067), (14, -0.089), (15, 0.061), (16, -0.01), (17, 0.033), (18, 0.02), (19, 0.051), (20, -0.008), (21, -0.046), (22, 0.015), (23, 0.033), (24, 0.05), (25, -0.02), (26, 0.061), (27, -0.015), (28, 0.039), (29, -0.055), (30, 0.006), (31, 0.012), (32, 0.018), (33, 0.016), (34, 0.017), (35, 0.011), (36, 0.036), (37, 0.018), (38, 0.031), (39, 0.026), (40, -0.049), (41, 0.074), (42, -0.047), (43, -0.01), (44, 0.036), (45, 0.052), (46, 0.038), (47, -0.031), (48, -0.01), (49, 0.004)]
simIndex simValue paperId paperTitle
same-paper 1 0.97637653 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, updating only part of the prototypes (or informative codewords) with the remaining (or noisy) codewords fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
2 0.96939015 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
Author: Chenglong Bao, Jian-Feng Cai, Hui Ji
Abstract: In recent years, how to learn a dictionary from input images for sparse modelling has been one very active topic in image processing and recognition. Most existing dictionary learning methods consider an over-complete dictionary, e.g. the K-SVD method. Often they require solving some minimization problem that is very challenging in terms of computational feasibility and efficiency. However, if the correlations among dictionary atoms are not well constrained, the redundancy of the dictionary does not necessarily improve the performance of sparse coding. This paper proposed a fast orthogonal dictionary learning method for sparse image representation. With comparable performance on several image restoration tasks, the proposed method is much more computationally efficient than the over-complete dictionary based learning methods.
3 0.92935735 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
Author: Hua Wang, Feiping Nie, Weidong Cai, Heng Huang
Abstract: Representing the raw input of a data set by a set of relevant codes is crucial to many computer vision applications. Due to the intrinsic sparse property of real-world data, dictionary learning, in which the linear decomposition of a data point uses a set of learned dictionary bases, i.e., codes, has demonstrated state-of-the-art performance. However, traditional dictionary learning methods suffer from three weaknesses: sensitivity to noisy and outlier samples, difficulty to determine the optimal dictionary size, and incapability to incorporate supervision information. In this paper, we address these weaknesses by learning a Semi-Supervised Robust Dictionary (SSR-D). Specifically, we use the ℓ2,0+ norm as the loss function to improve the robustness against outliers, and develop a new structured sparse regularization to incorporate the supervision information in dictionary learning, without incurring additional parameters. Moreover, the optimal dictionary size is automatically learned from the input data. Minimizing the derived objective function is challenging because it involves many non-smooth ℓ2,0+ -norm terms. We present an efficient algorithm to solve the problem with a rigorous proof of the convergence of the algorithm. Extensive experiments are presented to show the superior performance of the proposed method. (From the paper's introduction: semi-local or patch-based features, such as SIFT and geometric blur, usually make the learning tasks easier to deal with and reduce the computational cost compared with raw pixel-wise features, e.g., in image tagging; in practice, finding a set of compact feature bases, also referred to as a dictionary, with enhanced representative and discriminative power plays a significant role in building a successful computer vision system; the paper explores this problem by proposing a novel formulation and its solution for learning a Semi-Supervised Robust Dictionary (SSR-D), examining the challenges in dictionary learning and seeking opportunities to overcome them and improve the dictionary qualities.)
4 0.83498657 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
Author: Zhaowen Wang, Jianchao Yang, Nasser Nasrabadi, Thomas Huang
Abstract: Sparse Representation-based Classification (SRC) is a powerful tool in distinguishing signal categories which lie on different subspaces. Despite its wide application to visual recognition tasks, current understanding of SRC is solely based on a reconstructive perspective, which neither offers any guarantee on its classification performance nor provides any insight on how to design a discriminative dictionary for SRC. In this paper, we present a novel perspective towards SRC and interpret it as a margin classifier. The decision boundary and margin of SRC are analyzed in local regions where the support of sparse code is stable. Based on the derived margin, we propose a hinge loss function as the gauge for the classification performance of SRC. A stochastic gradient descent algorithm is implemented to maximize the margin of SRC and obtain more discriminative dictionaries. Experiments validate the effectiveness of the proposed approach in predicting classification performance and improving dictionary quality over reconstructive ones. Classification results competitive with other state-of-the-art sparse coding methods are reported on several data sets.
5 0.8168363 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
Author: Radu Timofte, Vincent De_Smet, Luc Van_Gool
Abstract: Recently there have been significant advances in image upscaling or image super-resolution based on a dictionary of low and high resolution exemplars. The running time of the methods is often ignored despite the fact that it is a critical factor for real applications. This paper proposes fast super-resolution methods while making no compromise on quality. First, we support the use of sparse learned dictionaries in combination with neighbor embedding methods. In this case, the nearest neighbors are computed using the correlation with the dictionary atoms rather than the Euclidean distance. Moreover, we show that most of the current approaches reach top performance for the right parameters. Second, we show that using global collaborative coding has considerable speed advantages, reducing the super-resolution mapping to a precomputed projective matrix. Third, we propose the anchored neighborhood regression. That is to anchor the neighborhood embedding of a low resolution patch to the nearest atom in the dictionary and to precompute the corresponding embedding matrix. These proposals are contrasted with current state-of-the-art methods on standard images. We obtain similar or improved quality and one or two orders of magnitude speed improvements.
6 0.79771531 276 iccv-2013-Multi-attributed Dictionary Learning for Sparse Coding
7 0.76332539 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
8 0.75666749 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
9 0.75358945 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
10 0.7138254 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
11 0.68874341 398 iccv-2013-Sparse Variation Dictionary Learning for Face Recognition with a Single Training Sample per Person
12 0.67245221 19 iccv-2013-A Learning-Based Approach to Reduce JPEG Artifacts in Image Matting
13 0.66217804 96 iccv-2013-Coupled Dictionary and Feature Space Learning with Applications to Cross-Domain Image Synthesis and Recognition
14 0.66080981 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
15 0.63933969 14 iccv-2013-A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding
16 0.60915542 244 iccv-2013-Learning View-Invariant Sparse Representations for Cross-View Action Recognition
17 0.59796143 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
18 0.57319468 359 iccv-2013-Robust Object Tracking with Online Multi-lifespan Dictionary Learning
19 0.53246105 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
20 0.5155158 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology
topicId topicWeight
[(2, 0.074), (7, 0.021), (16, 0.016), (26, 0.098), (27, 0.012), (31, 0.072), (42, 0.115), (48, 0.255), (64, 0.04), (73, 0.047), (78, 0.02), (89, 0.117), (98, 0.012)]
simIndex simValue paperId paperTitle
1 0.88556546 331 iccv-2013-Pyramid Coding for Functional Scene Element Recognition in Video Scenes
Author: Eran Swears, Anthony Hoogs, Kim Boyer
Abstract: Recognizing functional scene elements in video scenes based on the behaviors of moving objects that interact with them is an emerging problem of interest. Existing approaches have a limited ability to characterize elements such as cross-walks, intersections, and buildings that have low activity, are multi-modal, or have indirect evidence. Our approach recognizes the low activity and multi-modal elements (crosswalks/intersections) by introducing a hierarchy of descriptive clusters to form a pyramid of codebooks that is sparse in the number of clusters and dense in content. The incorporation of local behavioral context such as person-enter-building and vehicle-parking nearby enables the detection of elements that do not have direct motion-based evidence, e.g. buildings. These two contributions significantly improve scene element recognition when compared against three state-of-the-art approaches. Results are shown on typical ground level surveillance video and for the first time on the more complex Wide Area Motion Imagery.
2 0.8643657 311 iccv-2013-Pedestrian Parsing via Deep Decompositional Network
Author: Ping Luo, Xiaogang Wang, Xiaoou Tang
Abstract: We propose a new Deep Decompositional Network (DDN) for parsing pedestrian images into semantic regions, such as hair, head, body, arms, and legs, where the pedestrians can be heavily occluded. Unlike existing methods based on template matching or Bayesian inference, our approach directly maps low-level visual features to the label maps of body parts with DDN, which is able to accurately estimate complex pose variations with good robustness to occlusions and background clutters. DDN jointly estimates occluded regions and segments body parts by stacking three types of hidden layers: occlusion estimation layers, completion layers, and decomposition layers. The occlusion estimation layers estimate a binary mask, indicating which part of a pedestrian is invisible. The completion layers synthesize low-level features of the invisible part from the original features and the occlusion mask. The decomposition layers directly transform the synthesized visual features to label maps. We devise a new strategy to pre-train these hidden layers, and then fine-tune the entire network using the stochastic gradient descent. Experimental results show that our approach achieves better segmentation accuracy than the state-of-the-art methods on pedestrian images with or without occlusions. Another important contribution of this paper is that it provides a large scale benchmark human parsing dataset that includes 3,673 annotated samples collected from 171 surveillance videos. It is 20 times larger than existing public datasets.
3 0.85577786 63 iccv-2013-Bounded Labeling Function for Global Segmentation of Multi-part Objects with Geometric Constraints
Author: Masoud S. Nosrati, Shawn Andrews, Ghassan Hamarneh
Abstract: The inclusion of shape and appearance priors has proven useful for obtaining more accurate and plausible segmentations, especially for complex objects with multiple parts. In this paper, we augment the popular Mumford-Shah model to incorporate two important geometrical constraints, termed containment and detachment, between different regions with a specified minimum distance between their boundaries. Our method is able to handle multiple instances of multi-part objects defined by these geometrical constraints using a single labeling function while maintaining global optimality. (Figure 1: The inside vs. outside ambiguity under a standard labeling function setting in (a) is resolved by our containment constraint in (b).) We demonstrate the utility and advantages of these two constraints and show that the proposed convex continuous method is superior to other state-of-the-art methods, including its discrete counterpart, in terms of memory usage and metrication errors.
same-paper 4 0.80026501 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in the testing phase, how to handle corruption in the training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the "partial" dictionary learning approach, updating only part of the prototypes (or informative codewords) with the remaining (or noisy) codewords fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
5 0.79262412 320 iccv-2013-Pose-Configurable Generic Tracking of Elongated Objects
Author: Daniel Wesierski, Patrick Horain
Abstract: Elongated objects have various shapes and can shift, rotate, change scale, and be rigid or deform by flexing, articulating, and vibrating, with examples as varied as a glass bottle, a robotic arm, a surgical suture, a finger pair, a tram, and a guitar string. This generally makes tracking of poses of elongated objects very challenging. We describe a unified, configurable framework for tracking the pose of elongated objects, which move in the image plane and extend over the image region. Our method strives for simplicity, versatility, and efficiency. The object is decomposed into a chained assembly of segments of multiple parts that are arranged under a hierarchy of tailored spatio-temporal constraints. In this hierarchy, segments can rescale independently while their elasticity is controlled with global orientations and local distances. While the trend in tracking is to design complex, structure-free algorithms that update object appearance online, we show that our tracker, with the novel but remarkably simple, structured organization of parts with constant appearance, reaches or improves state-of-the-art performance. Most importantly, our model can be easily configured to track exact pose of arbitrary, elongated objects in the image plane. The tracker can run up to 100 fps on a desktop PC, yet the computation time scales linearly with the number of object parts. To our knowledge, this is the first approach to generic tracking of elongated objects.
6 0.75238049 207 iccv-2013-Illuminant Chromaticity from Image Sequences
7 0.6865797 220 iccv-2013-Joint Deep Learning for Pedestrian Detection
8 0.67440391 279 iccv-2013-Multi-stage Contextual Deep Learning for Pedestrian Detection
9 0.67368853 206 iccv-2013-Hybrid Deep Learning for Face Verification
10 0.66576505 7 iccv-2013-A Deep Sum-Product Architecture for Robust Facial Attributes Analysis
11 0.66433609 106 iccv-2013-Deep Learning Identity-Preserving Face Space
12 0.65325749 208 iccv-2013-Image Co-segmentation via Consistent Functional Maps
13 0.65053189 312 iccv-2013-Perceptual Fidelity Aware Mean Squared Error
14 0.64639592 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
15 0.64603162 61 iccv-2013-Beyond Hard Negative Mining: Efficient Detector Learning via Block-Circulant Decomposition
16 0.64560223 80 iccv-2013-Collaborative Active Learning of a Kernel Machine Ensemble for Recognition
17 0.64216411 376 iccv-2013-Scene Text Localization and Recognition with Oriented Stroke Detection
18 0.63952285 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
19 0.63892591 126 iccv-2013-Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
20 0.6385839 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning