iccv iccv2013 iccv2013-258 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Tianzhu Zhang, Bernard Ghanem, Si Liu, Changsheng Xu, Narendra Ahuja
Abstract: In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36].
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. [sent-3, score-0.591]
2 LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. [sent-4, score-0.22]
3 As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. [sent-5, score-0.46]
4 (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. [sent-7, score-0.523]
5 We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. [sent-9, score-0.373]
6 Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36]. [sent-10, score-0.689]
7 The conventional BoW pipeline for classification consists of five stages: feature extraction and description, codebook design, feature coding, feature pooling, and classifier construction. [sent-14, score-0.403]
8 Given an image, features, such as SIFT [27], HOG [7] and SURF [2], can be densely extracted and encoded with a codebook constructed using K-means clustering. [sent-17, score-0.222]
9 After computing codes for local features, they need to be pooled together to form equal-sized feature vectors, each representing one image in a dataset. [sent-19, score-0.242]
10 To include the spatial layout of local features in an image, Spatial Pyramid Matching (SPM) [22] is usually performed to obtain an image-level representation that can be used to discriminate different categories of objects, scenes, or actions. [sent-23, score-0.189]
11 The earliest method is hard-assignment coding (vector quantization) [22], a voting scheme that is simple yet highly sensitive to the selection of codebook. [sent-26, score-0.394]
12 A more robust voting approach is soft-assignment coding [33], which assigns a code coefficient for a particular local feature to each visual word according to their pairwise distance. [sent-27, score-0.541]
13 To improve hard and soft-assignment coding, sparsity is enforced on local feature codes via sparse learning techniques [36]. [sent-28, score-0.42]
14 However, sparse coding is time consuming and usually leads to non-consistent codes [18, 11], i. [sent-29, score-0.625]
15 local features with similar descriptors tend to have different sparse codes. [sent-31, score-0.266]
16 To alleviate inconsistency, authors in [37] introduce another coding property, called locality, which encourages that visual words used to represent a local feature be similar to the feature’s descriptor itself. [sent-32, score-0.554]
17 This is usually ensured by constructing a feature’s codebook from its nearest neighbors in the universal codebook. [sent-33, score-0.235]
18 In fact, several implementa... [Figure 1 caption, partly recovered: SIFT descriptors densely sampled in (a), concatenated in matrix form; (d) SIFT descriptors from the local region depicted in red in (a). From panel (b), we see that SIFT descriptors in the image tend to be sparse and low-rank.] [sent-34, score-0.363]
19 The work in [15] rebrands the locality property as codebook ‘saliency’. [sent-41, score-0.335]
20 However, all the aforementioned coding schemes encode local features independently. [sent-42, score-0.492]
21 The property of spatial consistency encourages local features, which are spatially close in an image, to have similar sparse codes and similar supports. [sent-44, score-0.484]
22 The latter implication encourages code consistency among features and suggests that the same visual words represent each local feature in a spatial neighborhood. [sent-45, score-0.364]
23 In fact, only recently, the spatial layout of local features has been used to select ‘optimal’ visual words for each feature in an image [31]. [sent-47, score-0.273]
24 Despite its awareness of spatial layout, the spatial consistency property is only invoked in the codebook selection process, which is done independently from the coding itself. [sent-49, score-0.824]
25 Although features in a spatial neighborhood are encouraged to have the same set of visual words representing them, their sparse codes are not directly encouraged to be similar or have similar supports w. [sent-50, score-0.427]
26 As stated before, maintaining spatial consistency among feature codes enables a more faithful representation of an image and has been shown to improve classification performance. [sent-53, score-0.365]
27 By dividing the image into superpixels as shown in Figure 1(a), we observe that the matrix of descriptors for SIFT points in a particular superpixel (denoted in red) is also sparse and low-rank, as shown in Figure 1(d). [sent-63, score-0.221]
28 Inspired by the observation and prior work on feature coding, we propose a low-rank sparse coding (LRSC) method that encourages both sparsity and spatial consistency in the coding step of the BoW model. [sent-67, score-1.121]
29 Here, the joint coding of features in a local region is viewed as a low-rank sparse learning problem. [sent-68, score-0.592]
30 Unlike previous methods, we exploit similarities among local features lying in the same spatial neighborhood and, therefore, seek an accurate joint representation of these local features w. [sent-69, score-0.255]
31 In LRSC, the codes of local features are sparse and low-rank, which encourages that only a few (but the same) visual words are used to represent all features in a local region. [sent-73, score-0.513]
32 As opposed to sparse coding based image classification methods [36, 18] that handle local features independently, our use of sparse low-rank learning realizes the benefits of a sparse feature representation, while respecting the underlying spatial relationship among local features. [sent-74, score-1.011]
33 Feature codes are computed by solving a sparse low-rank optimization problem, which comprises a sequence of closed form update steps made possible by the Inexact Augmented Lagrange Multiplier (IALM) that guarantees fast convergence. [sent-75, score-0.276]
34 (1) We propose a low-rank sparse learning method for feature coding, which is a robust sparse coding method that mines correlations among different local features to obtain better coding results than learning each feature individually. [sent-77, score-1.168]
35 (2) We show that popular sparse coding methods [36, 18] are a special case of our LRSC formulation. [sent-79, score-0.481]
36 (3) We learn local feature codes jointly with an efficient IALM method. [sent-80, score-0.262]
37 As a result, LRSC outperforms state-of-the-art coding methods in general, while remaining computationally attractive. [sent-81, score-0.399]
38 Related Work In this section, we survey commonly used coding schemes. [sent-83, score-0.373]
39 Matrix B denotes a visual codebook, i.e., a set of basis vectors. [sent-94, score-0.188]
40 Hard-assignment coding (HC) [22]: For a local feature xi, there is one and only one nonzero coding coefficient. [sent-102, score-0.844]
41 Soft-assignment coding (SC∗) [33]: The j-th coding coefficient represents the degree of membership of a local feature xi to the j-th visual word, where α is the smoothing factor controlling the softness of the assignment. [sent-111, score-0.895]
42 Localized soft-assignment coding (LSC) [25]: The basic idea is to adopt the k visual words in the neighborhood of a local feature to refine the soft-assignment coding [33]. [sent-123, score-0.88]
43 Sparse coding (SCSPM) [36]: It represents a local feature xi by a linear combination of a sparse set of basis vectors in the codebook. [sent-136, score-0.236]
44 Locality-constrained linear coding (LLC) [18]: Unlike sparse coding, LLC enforces codebook locality instead of sparsity. [sent-145, score-0.783]
45 Laplacian sparse coding (LScSPM) [11]: It is the first method that improves the consistency of sparse coding, by encouraging similar local features in the dataset to have similar sparse codes. [sent-174, score-0.853]
46 Due to the extremely large number of features in a dataset, constructing the Laplacian matrix and learning sparse codes simultaneously is computationally infeasible. [sent-192, score-0.317]
47 Salient coding (SC) [15]: This is an alternative to sparse coding. [sent-194, score-0.481]
48 It exploits codebook locality by setting the code to a “saliency” degree based on the nearest codebook bases to xi. [sent-195, score-0.575]
49 Locality-constrained and spatially regularized coding (LCSRC) [31]: The spatial layout of local features in the same image is used to select “optimal” bases for each local feature. [sent-215, score-0.68]
50 It assumes that a local feature should have bases similar to those of its nearest neighbors. This is done by solving a pairwise multi-label MRF problem. [sent-216, score-0.183]
51 Once bases are selected for local features, their codes can be computed by using any of the previous coding methods. [sent-217, score-0.608]
52 Most of the aforementioned coding schemes (except for LScSPM) produce feature codes independently. [sent-218, score-0.595]
53 Although LScSPM [11] adopts a global similarity between local features, it ignores local spatial contextual information [16, 39, 40] and is computationally expensive. [sent-219, score-0.206]
54 LCSRC [31] makes use of the spatial layout of local features in the same image. [sent-220, score-0.189]
55 However, it only does so to constrain codebook selection. [sent-221, score-0.188]
56 It fails to directly enforce consistency on codes themselves. [sent-222, score-0.19]
57 To the best of our knowledge, the proposed low-rank sparse coding (LRSC) method is the first to introduce spatial consistency and joint feature coding explicitly in the coding step of the BoW model. [sent-224, score-1.377]
58 Low-Rank Sparse Coding (LRSC) Here, we give a detailed description of our local feature coding method that makes use of low-rank sparse learning. [sent-227, score-0.579]
59 Low-Rank Sparse Representation As seen in Figure 1, SIFT descriptors tend to be collectively sparse and low-rank across natural images and specifically in spatial neighborhoods of the same image. [sent-230, score-0.255]
60 In this paper, we formulate local feature coding as a low-rank sparse learning problem, which encourages sparsity and low-rankness locally among features in the image. [sent-232, score-0.777]
61 Since the low-rank sparsity property is more evident locally, we apply low-rank sparse learning to code features in the same region of an image, by dividing an image into homogeneous superpixels. [sent-233, score-0.293]
62 Following many coding methods [22, 36, 18, 25, 15], LRSC densely samples SIFT features in an image. [sent-239, score-0.446]
63 as a linear combination zi of elements forming the codebook D, such that X = DZ. [sent-252, score-0.24]
64 In fact, sparse feature coding has been shown to be quite helpful in image classification [36, 26, 18]. [sent-267, score-0.579]
65 (a) Image partition results; (b) All SIFT descriptors in the local region depicted in red in (a); (c) and (d) are coding results produced by SCSPM [36] and LRSC. [sent-270, score-0.522]
66 their features are similar but their codes and the supports of their codes are not. [sent-273, score-0.327]
67 This is because SCSPM solves the coding problem for each feature independently. [sent-274, score-0.44]
68 a few (but the same) visual words are used to represent all the local features together, which renders the codes consistent and more robust to noise. [sent-277, score-0.269]
69 Discussion As stated earlier, many feature coding schemes exist in the literature. [sent-308, score-0.451]
70 In LLC [18], LSC [25], and SC [15], locality in codebook selection is adopted and better performance is obtained. [sent-313, score-0.345]
71 In LScSPM [11], a global similarity among features is adopted to consider the relationship among feature points in feature space. [sent-314, score-0.199]
72 In Figure 2, we show an example of how LRSC compares with traditional sparse coding (SCSPM) [36]. [sent-320, score-0.481]
73 The following three observations explain how LRSC is related to other coding schemes. [sent-326, score-0.373]
74 Similar to LLC and LSC, we construct D from elements in the universal codebook that are nearest to each local feature. [sent-333, score-0.285]
75 In comparison, LCSRC [31] incorporates spatial consistency in selecting an ‘optimal’ codebook for each feature separately and then computes feature codes independently in the image. [sent-335, score-0.56]
76 complexity of O(m2nd), which is significantly slower than our coding method. [sent-449, score-0.373]
77 The effectiveness and efficiency of LRSC are validated by a comparison with 7 popular coding methods and other state-of-the-art approaches where applicable. [sent-453, score-0.373]
78 feature extraction and classification) are kept the same and only the coding stage is varied. [sent-458, score-0.421]
79 Implementation Details: For fair comparison with type (1) methods, we fix all stages of the BoW classification pipeline except for the feature coding stage. [sent-464, score-0.511]
80 (3) Codebook: The universal codebook is obtained using K-means on a randomly selected subset of SIFT descriptors (200K) in the training set. [sent-475, score-0.257]
81 As in [31], the codebook size depends on the size of the dataset: 1024 for Scene-13, Caltech-101, and UIUC 8-Sport, and 4096 for Caltech-256. [sent-476, score-0.188]
82 As discussed in [36, 18, 6], increasing the codebook size can improve the performance. [sent-477, score-0.188]
83 2, our algorithm will incur a slightly higher computational cost to find the nearest neighbors in codebook for each feature point. [sent-479, score-0.259]
84 Therefore, our algorithm can retain a good performance level even for large codebook sizes. [sent-480, score-0.188]
85 The classification accuracy is reported in Table 1, which shows the average (and standard deviation) results of the state-of-the-art coding approaches and the proposed LRSC method. [sent-496, score-0.423]
86 Results of LScSPM show that the relationships among features in their d-dimensional feature space improve classification further. [sent-499, score-0.179]
87 In LCSRC and our LRSC, adding local spatial information improves classification accuracy significantly as compared to the first six methods, and our LRSC has a moderate improvement over the state-of-the-art feature coding methods. [sent-508, score-0.598]
88 In Table 3, the runtime for all coding methods on the same image is reported. [sent-529, score-0.396]
89 When all features are coded with a 1024 codebook, LRSC is computationally much faster than SCSPM [36] because our LRSC encodes local features jointly, which is much more efficient than SCSPM encoding features independently (6984 ? [sent-532, score-0.206]
90 Runtime of different coding methods on a 300 × 400 image with 6984 SIFT descriptors. [sent-544, score-0.373]
91 LRSC performs best among all the feature coding methods and has about 2% improvement. [sent-556, score-0.442]
92 As compared to the sparse coding methods SCSPM and LLC, LRSC’s performance is much better, since it makes a 3% improvement. [sent-568, score-0.481]
93 As such, we conclude that exploiting spatial consistency directly in the coding stage improves classification performances by 3% on average. [sent-570, score-0.546]
94 From this table, we see that our LRSC method outperforms the other coding methods on this data set, and makes about 3% improvement. [sent-629, score-0.373]
95 Conclusion In this paper, we present a new coding technique for local features that employs low-rank sparse learning. [sent-650, score-0.57]
96 This method exploits sparsity in individual codes, locality in codebook selection, and low-rankness in constraining sparse codes belonging to the same spatial neighborhood. [sent-651, score-0.68]
97 For future work, we will systematically study how image partition can be combined with low-rank sparse coding in one unified framework. [sent-653, score-0.513]
98 Local features are not lonely - laplacian sparse coding for image classification. [sent-730, score-0.56]
99 A two-layer sparse coding model learns simple and complex cell receptive fields and topography from natural images. [sent-759, score-0.481]
100 Linear spatial pyramid matching using sparse coding for image classification. [sent-895, score-0.537]
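The extracted sentences mention a "sequence of closed form update steps" (sentence 33) but do not reproduce them, so the following is only a sketch under stated assumptions. In IALM-style solvers, low-rankness is commonly promoted with a nuclear-norm penalty and sparsity with an l1 penalty, and each has a closed-form proximal step: singular-value thresholding and elementwise soft-thresholding. The function names, the toy matrix, and the threshold values below are illustrative and are not taken from the paper.

```python
import numpy as np


def soft_threshold(Z, tau):
    """Elementwise shrinkage: proximal operator of tau * ||Z||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)


def singular_value_threshold(Z, tau):
    """Shrink singular values: proximal operator of tau * ||Z||_* (nuclear norm)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Scale each column of U by its shrunken singular value, then recombine.
    return (U * np.maximum(s - tau, 0.0)) @ Vt


# Toy code matrix (assumed shape: codewords x neighboring features).
# Columns are nearly proportional, mimicking features from one region.
rng = np.random.default_rng(0)
Z = np.outer(rng.normal(size=4), rng.normal(size=5))
Z += 0.05 * rng.normal(size=(4, 5))

Z_low_rank = singular_value_threshold(Z, tau=0.5)  # suppresses small singular values
Z_sparse = soft_threshold(Z, tau=0.1)              # zeroes small entries

print(np.linalg.matrix_rank(Z_low_rank), np.count_nonzero(Z_sparse))
```

Applying the singular-value step to the whole code matrix of a region, rather than to each column separately, is what couples the codes of neighboring features; coding each feature independently has no counterpart to that step.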
wordName wordTfidf (topN-words)
[('lrsc', 0.686), ('coding', 0.373), ('codebook', 0.188), ('scspm', 0.167), ('lcsrc', 0.157), ('lscspm', 0.157), ('codes', 0.144), ('ialm', 0.139), ('locality', 0.114), ('sparse', 0.108), ('llc', 0.098), ('lsc', 0.094), ('uiuc', 0.086), ('eq', 0.086), ('sift', 0.075), ('sparsity', 0.07), ('bow', 0.07), ('rank', 0.067), ('zij', 0.064), ('sc', 0.063), ('hc', 0.062), ('slic', 0.059), ('dist', 0.057), ('spatial', 0.056), ('regionsize', 0.052), ('zi', 0.052), ('classification', 0.05), ('local', 0.05), ('feature', 0.048), ('encourages', 0.047), ('superpixels', 0.046), ('consistency', 0.046), ('descriptors', 0.045), ('layout', 0.044), ('inexact', 0.043), ('bases', 0.041), ('laplacian', 0.04), ('features', 0.039), ('words', 0.036), ('concatentated', 0.035), ('desc', 0.035), ('lscmclscerpt', 0.035), ('minregionsize', 0.035), ('narendra', 0.035), ('pthm', 0.035), ('densely', 0.034), ('property', 0.033), ('partition', 0.032), ('singapore', 0.031), ('thm', 0.031), ('lagrange', 0.031), ('pooling', 0.031), ('codebooks', 0.03), ('independently', 0.03), ('schemes', 0.03), ('xi', 0.03), ('dtd', 0.029), ('aij', 0.028), ('word', 0.028), ('splits', 0.027), ('xib', 0.027), ('regularized', 0.027), ('thousands', 0.026), ('svd', 0.026), ('computationally', 0.026), ('ignores', 0.024), ('closed', 0.024), ('tend', 0.024), ('universal', 0.024), ('illinois', 0.024), ('ghanem', 0.024), ('nearest', 0.023), ('todorovic', 0.023), ('runtime', 0.023), ('coded', 0.022), ('collectively', 0.022), ('dz', 0.022), ('defense', 0.022), ('accuracies', 0.022), ('encouraged', 0.022), ('lowrank', 0.022), ('boureau', 0.022), ('region', 0.022), ('adopted', 0.022), ('locally', 0.021), ('tr', 0.021), ('code', 0.021), ('selection', 0.021), ('pipeline', 0.021), ('coefficient', 0.021), ('done', 0.021), ('improves', 0.021), ('among', 0.021), ('jointly', 0.02), ('regularizer', 0.02), ('mrf', 0.02), ('clearly', 0.02), ('regularizers', 0.019), ('type', 0.019), ('solves', 0.019)]
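The word list above is written as a Python literal (a list of `(word, weight)` tuples), so it can be read without a custom parser. The snippet below is a minimal sketch of that idea using a shortened copy of the list; `parse_weights` and `top_words` are hypothetical helper names introduced here for illustration.

```python
import ast

# Shortened copy of the (word, weight) list above, kept as raw text.
raw = "[('lrsc', 0.686), ('coding', 0.373), ('codebook', 0.188), ('scspm', 0.167)]"


def parse_weights(text):
    """Safely evaluate the literal and return a word -> weight mapping."""
    return dict(ast.literal_eval(text))


def top_words(weights, k):
    """Return the k words with the largest weights, highest first."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


weights = parse_weights(raw)
print(top_words(weights, 2))
```

`ast.literal_eval` only accepts literals, which makes it a safer choice than `eval` for text scraped from a page.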
simIndex simValue paperId paperTitle
same-paper 1 1.0000007 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
Author: Tianzhu Zhang, Bernard Ghanem, Si Liu, Changsheng Xu, Narendra Ahuja
Abstract: In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36].
2 0.13524927 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
Author: Yu-Tseh Chi, Mohsen Ali, Muhammad Rushdi, Jeffrey Ho
Abstract: This paper proposes a novel approach for sparse coding that further improves upon the sparse representation-based classification (SRC) framework. The proposed framework, Affine-Constrained Group Sparse Coding (ACGSC), extends the current SRC framework to classification problems with multiple input samples. Geometrically, the affine-constrained group sparse coding essentially searches for the vector in the convex hull spanned by the input vectors that can best be sparse coded using the given dictionary. The resulting objective function is still convex and can be efficiently optimized using an iterative block-coordinate descent scheme that is guaranteed to converge. Furthermore, we provide a form of sparse recovery result that guarantees, at least theoretically, that the classification performance of the constrained group sparse coding should be at least as good as the group sparse coding. We have evaluated the proposed approach using three different recognition experiments that involve illumination variation of faces and textures, and face recognition under occlusions. Preliminary experiments have demonstrated the effectiveness of the proposed approach, and in particular, the results from the recognition/occlusion experiment are surprisingly accurate and robust.
3 0.13197613 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
Author: Jiajia Luo, Wei Wang, Hairong Qi
Abstract: Human action recognition based on the depth information provided by commodity depth sensors is an important yet challenging task. The noisy depth maps, different lengths of action sequences, and free styles in performing actions may cause large intra-class variations. In this paper, a new framework based on sparse coding and temporal pyramid matching (TPM) is proposed for depth-based human action recognition. In particular, a discriminative class-specific dictionary learning algorithm is proposed for sparse coding. By adding the group sparsity and geometry constraints, features can be well reconstructed by the sub-dictionary belonging to the same class, and the geometry relationships among features are also kept in the calculated coefficients. The proposed approach is evaluated on two benchmark datasets captured by depth cameras. Experimental results show that the proposed algorithm repeatedly achieves superior performance to the state-of-the-art algorithms. Moreover, the proposed dictionary learning method also outperforms classic dictionary learning approaches.
4 0.12768529 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
Author: Hans Lobel, René Vidal, Alvaro Soto
Abstract: Currently, Bag-of-Visual-Words (BoVW) and part-based methods are the most popular approaches for visual recognition. In both cases, a mid-level representation is built on top of low-level image descriptors and top-level classifiers use this mid-level representation to achieve visual recognition. While in current part-based approaches, mid- and top-level representations are usually jointly trained, this is not the usual case for BoVW schemes. A main reason for this is the complex data association problem related to the usual large dictionary size needed by BoVW approaches. As a further observation, typical solutions based on BoVW and part-based representations are usually limited to extensions of binary classification schemes, a strategy that ignores relevant correlations among classes. In this work we propose a novel hierarchical approach to visual recognition based on a BoVW scheme that jointly learns suitable mid- and top-level representations. Furthermore, using a max-margin learning framework, the proposed approach directly handles the multiclass case at both levels of abstraction. We test our proposed method using several popular benchmark datasets. As our main result, we demonstrate that, by coupling learning of mid- and top-level representations, the proposed approach fosters sharing of discriminative visual words among target classes, being able to achieve state-of-the-art recognition performance using far fewer visual words than previous approaches.
5 0.10755304 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
Author: Zhuoyuan Chen, Ying Wu
Abstract: Sparsity models have recently shown great promise in many vision tasks. Using a learned dictionary in sparsity models can in general outperform predefined bases in clean data. In practice, both training and testing data may be corrupted and contain noises and outliers. Although recent studies attempted to cope with corrupted data and achieved encouraging results in testing phase, how to handle corruption in training phase still remains a very difficult problem. In contrast to most existing methods that learn the dictionary from clean data, this paper is targeted at handling corruptions and outliers in training data for dictionary learning. We propose a general method to decompose the reconstructive residual into two components: a non-sparse component for small universal noises and a sparse component for large outliers, respectively. In addition, further analysis reveals the connection between our approach and the “partial” dictionary learning approach, updating only part of the prototypes (or informative codewords) with remaining (or noisy codewords) fixed. Experiments on synthetic data as well as real applications have shown satisfactory performance of this new robust dictionary learning approach.
7 0.09657263 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors
8 0.090912305 161 iccv-2013-Fast Sparsity-Based Orthogonal Dictionary Learning for Image Restoration
9 0.089682817 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
10 0.088470645 232 iccv-2013-Latent Space Sparse Subspace Clustering
11 0.07587862 331 iccv-2013-Pyramid Coding for Functional Scene Element Recognition in Video Scenes
12 0.075748473 249 iccv-2013-Learning to Share Latent Tasks for Action Recognition
13 0.074992098 236 iccv-2013-Learning Discriminative Part Detectors for Image Classification and Cosegmentation
14 0.074263625 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
15 0.074234165 412 iccv-2013-Synergistic Clustering of Image and Segment Descriptors for Unsupervised Scene Understanding
16 0.072763234 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology
17 0.068220563 29 iccv-2013-A Scalable Unsupervised Feature Merging Approach to Efficient Dimensionality Reduction of High-Dimensional Visual Data
18 0.067478232 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
19 0.065325633 390 iccv-2013-Shufflets: Shared Mid-level Parts for Fast Object Detection
20 0.062723897 114 iccv-2013-Dictionary Learning and Sparse Coding on Grassmann Manifolds: An Extrinsic Solution
topicId topicWeight
[(0, 0.151), (1, 0.066), (2, -0.015), (3, -0.037), (4, -0.117), (5, 0.028), (6, -0.059), (7, 0.006), (8, -0.042), (9, -0.061), (10, 0.017), (11, 0.029), (12, -0.017), (13, -0.006), (14, -0.045), (15, -0.015), (16, 0.0), (17, 0.012), (18, 0.015), (19, -0.005), (20, 0.006), (21, -0.031), (22, -0.001), (23, -0.01), (24, -0.003), (25, -0.026), (26, 0.055), (27, 0.028), (28, 0.033), (29, 0.079), (30, 0.063), (31, -0.03), (32, -0.065), (33, 0.075), (34, 0.034), (35, 0.028), (36, 0.05), (37, -0.086), (38, 0.055), (39, 0.04), (40, -0.017), (41, 0.037), (42, -0.043), (43, 0.061), (44, -0.14), (45, -0.048), (46, -0.03), (47, -0.01), (48, 0.032), (49, 0.031)]
simIndex simValue paperId paperTitle
same-paper 1 0.9306764 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
Author: Tianzhu Zhang, Bernard Ghanem, Si Liu, Changsheng Xu, Narendra Ahuja
Abstract: In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36].
2 0.76744694 287 iccv-2013-Neighbor-to-Neighbor Search for Fast Coding of Feature Vectors
Author: Nakamasa Inoue, Koichi Shinoda
Abstract: Assigning a visual code to a low-level image descriptor, which we call code assignment, is the most computationally expensive part of image classification algorithms based on the bag of visual word (BoW) framework. This paper proposes a fast computation method, Neighbor-to-Neighbor (NTN) search, for this code assignment. Based on the fact that image features from an adjacent region are usually similar to each other, this algorithm effectively reduces the cost of calculating the distance between a codeword and a feature vector. This method can be applied not only to a hard codebook constructed by vector quantization (NTN-VQ), but also to a soft codebook, a Gaussian mixture model (NTN-GMM). We evaluated this method on the PASCAL VOC 2007 classification challenge task. NTN-VQ reduced the assignment cost by 77.4% in super-vector coding, and NTN-GMM reduced it by 89.3% in Fisher-vector coding, without any significant degradation in classification performance.
Author: Lingqiao Liu, Lei Wang
Abstract: To achieve a good trade-off between recognition accuracy and computational efficiency, it is often necessary to reduce high-dimensional visual data to medium-dimensional data. For this task, even applying a simple full-matrix-based linear projection causes significant computation and memory use. When the amount of visual data is large, how to efficiently learn such a projection could even become a problem. The recent feature merging approach offers an efficient way to reduce the dimensionality, which only requires a single scan of features to perform reduction. However, existing merging algorithms do not scale well with high-dimensional data, especially in the unsupervised case. To address this problem, we formulate unsupervised feature merging as a PCA problem imposed with a special structure constraint. By exploiting its connection with k-means, we transform this constrained PCA problem into a feature clustering problem. Moreover, we employ the hashing technique to improve its scalability. These produce a scalable feature merging algorithm for our dimensionality reduction task. In addition, we develop an extension of this method by leveraging the neighborhood structure in the data to further improve dimensionality reduction performance. Furthermore, we explore the incorporation of bipolar merging, a variant of the merging function that allows the subtraction operation, into our algorithms. Through three applications in visual recognition, we demonstrate that our methods can not only achieve good dimensionality reduction performance with little computational cost but also help to create more powerful representations at both the image level and the local feature level.
4 0.66183245 45 iccv-2013-Affine-Constrained Group Sparse Coding and Its Application to Image-Based Classifications
Author: Yu-Tseh Chi, Mohsen Ali, Muhammad Rushdi, Jeffrey Ho
Abstract: This paper proposes a novel approach for sparse coding that further improves upon the sparse representation-based classification (SRC) framework. The proposed framework, Affine-Constrained Group Sparse Coding (ACGSC), extends the current SRC framework to classification problems with multiple input samples. Geometrically, the affine-constrained group sparse coding essentially searches for the vector in the convex hull spanned by the input vectors that can best be sparse coded using the given dictionary. The resulting objective function is still convex and can be efficiently optimized using an iterative block-coordinate descent scheme that is guaranteed to converge. Furthermore, we provide a form of sparse recovery result that guarantees, at least theoretically, that the classification performance of the constrained group sparse coding should be at least as good as the group sparse coding. We have evaluated the proposed approach using three different recognition experiments that involve illumination variation of faces and textures, and face recognition under occlusions. Preliminary experiments have demonstrated the effectiveness of the proposed approach, and in particular, the results from the recognition/occlusion experiment are surprisingly accurate and robust.
5 0.65700483 401 iccv-2013-Stacked Predictive Sparse Coding for Classification of Distinct Regions in Tumor Histopathology
Author: Hang Chang, Yin Zhou, Paul Spellman, Bahram Parvin
Abstract: Image-based classification of histology sections, in terms of distinct components (e.g., tumor, stroma, normal), provides a series of indices for tumor composition. Furthermore, aggregation of these indices, from each whole slide image (WSI) in a large cohort, can provide predictive models of the clinical outcome. However, performance of the existing techniques is hindered as a result of large technical variations and biological heterogeneities that are always present in a large cohort. We propose a system that automatically learns a series of basis functions for representing the underlying spatial distribution using stacked predictive sparse decomposition (PSD). The learned representation is then fed into the spatial pyramid matching framework (SPM) with a linear SVM classifier. The system has been evaluated for classification of (a) distinct histological components for two cohorts of tumor types, and (b) colony organization of normal and malignant cell lines in 3D cell culture models. Throughput has been increased through the use of a graphics processing unit (GPU), and evaluation indicates superior performance compared with previous research.
6 0.64746195 197 iccv-2013-Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition
7 0.62898135 365 iccv-2013-SIFTpack: A Compact Representation for Efficient SIFT Matching
8 0.61167294 20 iccv-2013-A Max-Margin Perspective on Sparse Representation-Based Classification
9 0.60625118 14 iccv-2013-A Generalized Iterated Shrinkage Algorithm for Non-convex Sparse Coding
10 0.58265764 188 iccv-2013-Group Sparsity and Geometry Constrained Dictionary Learning for Action Recognition from Depth Maps
11 0.57939178 34 iccv-2013-Abnormal Event Detection at 150 FPS in MATLAB
12 0.56897503 331 iccv-2013-Pyramid Coding for Functional Scene Element Recognition in Video Scenes
13 0.55487132 354 iccv-2013-Robust Dictionary Learning by Error Source Decomposition
14 0.53571737 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
16 0.53280348 384 iccv-2013-Semi-supervised Robust Dictionary Learning via Efficient l-Norms Minimization
17 0.530141 288 iccv-2013-Nested Shape Descriptors
18 0.52740556 55 iccv-2013-Automatic Kronecker Product Model Based Detection of Repeated Patterns in 2D Urban Images
19 0.52692056 253 iccv-2013-Linear Sequence Discriminant Analysis: A Model-Based Dimensionality Reduction Method for Vector Sequences
20 0.52541536 193 iccv-2013-Heterogeneous Auto-similarities of Characteristics (HASC): Exploiting Relational Information for Classification
topicId topicWeight
[(2, 0.078), (4, 0.017), (7, 0.018), (12, 0.017), (23, 0.177), (26, 0.148), (31, 0.065), (40, 0.012), (42, 0.1), (48, 0.011), (64, 0.032), (73, 0.023), (89, 0.152), (93, 0.012), (98, 0.025)]
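The listing does not state how the simValue scores are derived from topic weights such as those above. As a rough illustration only, the sketch below assumes a sparse topic vector is compared with another by cosine similarity; the function names and the second vector's role are hypothetical and not taken from the source.

```python
import ast
import math


def parse_topic_vector(text):
    """Turn a '[(topicId, weight), ...]' listing into a {topicId: weight} dict."""
    return dict(ast.literal_eval(text))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts (assumed metric)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# Topic weights copied from the listing above.
paper_topics = parse_topic_vector(
    "[(2, 0.078), (4, 0.017), (7, 0.018), (12, 0.017), (23, 0.177), "
    "(26, 0.148), (31, 0.065), (40, 0.012), (42, 0.1), (48, 0.011), "
    "(64, 0.032), (73, 0.023), (89, 0.152), (93, 0.012), (98, 0.025)]"
)
```

Under this assumption, papers sharing heavily weighted topics (here 23, 26 and 89) would score close to 1, while papers with no topics in common would score 0.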
simIndex simValue paperId paperTitle
1 0.83389425 364 iccv-2013-SGTD: Structure Gradient and Texture Decorrelating Regularization for Image Decomposition
Author: Qiegen Liu, Jianbo Liu, Pei Dong, Dong Liang
Abstract: This paper presents a novel structure gradient and texture decorrelating regularization (SGTD) for image decomposition. The idea is motivated by the assumption that the structure gradient and texture components should be properly decorrelated for a successful decomposition. The proposed model consists of the data fidelity term, total variation regularization and the SGTD regularization. An augmented Lagrangian method is proposed to solve this optimization problem, by first transforming the unconstrained problem to an equivalent constrained problem and then applying an alternating direction method to iteratively solve the subproblems. Experimental results demonstrate that the proposed method achieves performance better than or comparable to that of state-of-the-art methods.
same-paper 2 0.82053977 258 iccv-2013-Low-Rank Sparse Coding for Image Classification
Author: Tianzhu Zhang, Bernard Ghanem, Si Liu, Changsheng Xu, Narendra Ahuja
Abstract: In this paper, we propose a low-rank sparse coding (LRSC) method that exploits local structure information among features in an image for the purpose of image-level classification. LRSC represents densely sampled SIFT descriptors, in a spatial neighborhood, collectively as low-rank, sparse linear combinations of codewords. As such, it casts the feature coding problem as a low-rank matrix learning problem, which is different from previous methods that encode features independently. This LRSC has a number of attractive properties. (1) It encourages sparsity in feature codes, locality in codebook construction, and low-rankness for spatial consistency. (2) LRSC encodes local features jointly by considering their low-rank structure information, and is computationally attractive. We evaluate the LRSC by comparing its performance on a set of challenging benchmarks with that of 7 popular coding and other state-of-the-art methods. Our experiments show that by representing local features jointly, LRSC not only outperforms the state-of-the-art in classification accuracy but also improves the time complexity of methods that use a similar sparse linear representation model for feature coding [36].
3 0.80189449 289 iccv-2013-Network Principles for SfM: Disambiguating Repeated Structures with Local Context
Author: Kyle Wilson, Noah Snavely
Abstract: Repeated features are common in urban scenes. Many objects, such as clock towers with nearly identical sides, or domes with strong radial symmetries, pose challenges for structure from motion. When similar but distinct features are mistakenly equated, the resulting 3D reconstructions can have errors ranging from phantom walls and superimposed structures to a complete failure to reconstruct. We present a new approach to solving such problems by considering the local visibility structure of such repeated features. Drawing upon network theory, we present a new way of scoring features using a measure of local clustering. Our model leads to a simple, fast, and highly scalable technique for disambiguating repeated features based on an analysis of an underlying visibility graph, without relying on explicit geometric reasoning. We demonstrate our method on several very large datasets drawn from Internet photo collections, and compare it to a more traditional geometry-based disambiguation technique.
4 0.7961992 156 iccv-2013-Fast Direct Super-Resolution by Simple Functions
Author: Chih-Yuan Yang, Ming-Hsuan Yang
Abstract: The goal of single-image super-resolution is to generate a high-quality high-resolution image based on a given low-resolution input. It is an ill-posed problem which requires exemplars or priors to better reconstruct the missing high-resolution image details. In this paper, we propose to split the feature space into numerous subspaces and collect exemplars to learn priors for each subspace, thereby creating effective mapping functions. The use of split input space facilitates both feasibility of using simple functions for super-resolution, and efficiency of generating high-resolution results. High-quality high-resolution images are reconstructed based on the effective learned priors. Experimental results demonstrate that the proposed algorithm performs efficiently and effectively over state-of-the-art methods.
5 0.79422355 125 iccv-2013-Drosophila Embryo Stage Annotation Using Label Propagation
Author: Tomáš Kazmar, Evgeny Z. Kvon, Alexander Stark, Christoph H. Lampert
Abstract: In this work we propose a system for automatic classification of Drosophila embryos into developmental stages. While the system is designed to solve an actual problem in biological research, we believe that the principle underlying it is interesting not only for biologists, but also for researchers in computer vision. The main idea is to combine two orthogonal sources of information: one is a classifier trained on strongly invariant features, which makes it applicable to images of very different conditions, but also leads to rather noisy predictions. The other is a label propagation step based on a more powerful similarity measure that however is only consistent within specific subsets of the data at a time. In our biological setup, the information sources are the shape and the staining patterns of embryo images. We show experimentally that while neither of the methods can be used by itself to achieve satisfactory results, their combination achieves prediction quality comparable to human performance.
6 0.79403633 316 iccv-2013-Pictorial Human Spaces: How Well Do Humans Perceive a 3D Articulated Pose?
7 0.79093754 198 iccv-2013-Hierarchical Part Matching for Fine-Grained Visual Categorization
8 0.78786302 326 iccv-2013-Predicting Sufficient Annotation Strength for Interactive Foreground Segmentation
9 0.78735417 295 iccv-2013-On One-Shot Similarity Kernels: Explicit Feature Maps and Properties
10 0.7866227 414 iccv-2013-Temporally Consistent Superpixels
11 0.78572965 8 iccv-2013-A Deformable Mixture Parsing Model with Parselets
12 0.78440624 102 iccv-2013-Data-Driven 3D Primitives for Single Image Understanding
13 0.78097159 150 iccv-2013-Exemplar Cut
14 0.78074968 180 iccv-2013-From Where and How to What We See
15 0.7804873 282 iccv-2013-Multi-view Object Segmentation in Space and Time
16 0.7793116 395 iccv-2013-Slice Sampling Particle Belief Propagation
17 0.77746439 95 iccv-2013-Cosegmentation and Cosketch by Unsupervised Learning
18 0.77679461 51 iccv-2013-Anchored Neighborhood Regression for Fast Example-Based Super-Resolution
19 0.77460551 391 iccv-2013-Sieving Regression Forest Votes for Facial Feature Detection in the Wild
20 0.77277327 245 iccv-2013-Learning a Dictionary of Shape Epitomes with Applications to Image Labeling