cvpr cvpr2013 cvpr2013-164 knowledge-graph by maker-knowledge-mining

164 cvpr-2013-Fast Convolutional Sparse Coding


Source: pdf

Author: Hilton Bristow, Anders Eriksson, Simon Lucey

Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. [sent-7, score-0.462]

2 For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. [sent-9, score-0.437]

3 Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. [sent-12, score-0.619]

4 In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence. [sent-13, score-0.163]

5 $\arg\min_{D,z} \sum_{n=1}^{N} \tfrac{1}{2}\|x_n - Dz_n\|_2^2 + \beta\|z_n\|_1$ subject to $\|d_k\|_2^2 \leq 1$ for $k = 1 \ldots K$ (1) [sent-20, score-0.05]

6 where β controls the L1 penalty, and the inequality constraint on the columns of D prevents the dictionary from absorbing all of the system’s energy. [sent-23, score-0.084]
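
As a concrete reference point, the following is a minimal numpy sketch that evaluates the canonical objective of Equation (1) for a batch of patches; the function name and array layout are illustrative assumptions, not the authors' code.

```python
import numpy as np

def sparse_coding_objective(X, D, Z, beta):
    """Evaluate the patch-based objective of Equation (1).

    X    : (P, N) matrix whose columns are the N vectorized patches x_n.
    D    : (P, K) dictionary whose columns d_k satisfy ||d_k||_2^2 <= 1.
    Z    : (K, N) matrix whose columns are the sparse codes z_n.
    beta : weight of the L1 penalty.
    """
    residual = X - D @ Z                     # x_n - D z_n for every patch
    data_term = 0.5 * np.sum(residual ** 2)  # sum_n (1/2)||x_n - D z_n||_2^2
    l1_term = beta * np.sum(np.abs(Z))       # beta * sum_n ||z_n||_1
    return data_term + l1_term
```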

7 A selection of filters learned from an unaligned set of lions. [sent-26, score-0.164]

8 The spatially invariant algorithm produces generic Gabor-like filters as well as specialized domain-specific filters, such as the highlighted “eye”. [sent-27, score-0.185]

9 Sparse coding has a fundamental drawback however, as it assumes the ensemble of input vectors $\{x_n\}_{n=1}^{N}$ are independent of one another. [sent-28, score-0.231]

10 This independence assumption, when applied to natural images, leads to many basis elements that are translated versions of each other. [sent-31, score-0.046]

11 Convolutional sparse coding attempts to remedy this shortcoming by modelling shift invariance directly within the objective, $\arg\min_{d,z} \tfrac{1}{2}\|x - \sum_{k=1}^{K} d_k * z_k\|_2^2 + \beta\sum_{k=1}^{K}\|z_k\|_1$ subject to $\|d_k\|_2^2 \leq 1$ for $k = 1 \ldots K$ (2) [sent-32, score-0.45]

12 Now $z_k$ takes the role of a sparse feature map which, when convolved with a filter $d_k$ and added over all $k$, should approximate the input signal $x$: a full signal (e.g. [sent-38, score-0.72]

13 image or audio sequence) rather than independent patches as in the objective of Equation (1). [sent-40, score-0.13]
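
For contrast with Equation (1), here is a minimal sketch of the convolutional objective of Equation (2) for a 1D signal. The truncation of each convolution to the signal length (mode="same") and all names are assumptions of this sketch, since boundary handling is treated separately by the paper.

```python
import numpy as np
from scipy.signal import fftconvolve

def conv_sparse_coding_objective(x, d, z, beta):
    """Evaluate the convolutional objective of Equation (2), 1D case.

    x    : (D,) full input signal.
    d    : (K, M) small-support filters d_k.
    z    : (K, D) sparse feature maps z_k.
    beta : weight of the L1 penalty.
    """
    K = d.shape[0]
    # Sum of convolutions d_k * z_k; 'same' keeps the output at length D.
    recon = sum(fftconvolve(z[k], d[k], mode="same") for k in range(K))
    data_term = 0.5 * np.sum((x - recon) ** 2)
    l1_term = beta * np.sum(np.abs(z))
    return data_term + l1_term
```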

14 Like traditional sparse coding, the estimated sparse basis $\{d_k\}_{k=1}^{K}$ will be of a fixed spatial support. [sent-41, score-0.541]

15 However, unlike traditional sparse coding, the input signal $x$ and the sparse feature maps $\{z_k\}_{k=1}^{K}$ are of a different and usually much larger dimensionality. [sent-42, score-0.587]

16 Note also that we assume there is only a single signal x in our formulation in Equation (2) ; it is trivial in our proposed formulation to handle multiple signals each of varying length. [sent-43, score-0.166]

17 We persist with the single signal assumption throughout the derivation of our approach for the sake of clarity and brevity. [sent-44, score-0.092]

18 A 2D convolution operation is represented as the $*$ operator. [sent-52, score-0.124]

19 We have chosen to use a Fourier representation in this paper due to its particularly useful ability to represent convolutions as a Hadamard product in the Fourier domain. [sent-55, score-0.042]
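
As a quick numerical check of this property, the snippet below verifies that circular convolution coincides with the inverse DFT of the Hadamard product of DFTs, with the small-support filter zero-padded to the signal length; this is the standard identity being invoked, not code from the paper.

```python
import numpy as np

D = 64
rng = np.random.default_rng(0)
a = rng.standard_normal(D)          # "signal"
b = np.zeros(D)
b[:5] = rng.standard_normal(5)      # small-support filter, zero-padded to D

# Hadamard product in the Fourier domain ...
fourier = np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
# ... equals circular convolution in the spatial domain.
circular = np.array([sum(a[m] * b[(n - m) % D] for m in range(D))
                     for n in range(D)])
assert np.allclose(fourier, circular)
```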

20 $\odot$ represents the Hadamard product, and $\mathrm{diag}()$ is an operator that transforms a $D$ dimensional vector into a $D \times D$ dimensional diagonal matrix. [sent-58, score-0.119]

21 Commutativity holds with this property, such that the roles of filter and signal $\hat{a}$ can be interchanged. [sent-59, score-0.147]

22 Any transpose operator T on a complex vector or matrix in this paper additionally takes the complex conjugate in a similar fashion to the Hermitian adjoint [17] . [sent-60, score-0.095]

23 With respect to Equation (2), the objective of central interest in this paper, we will often omit filter indices (e.g. [sent-61, score-0.145]

24 $d_k$ refers to the kth filter and $z_k$ refers to the kth filter response) when referring to the variables being optimized. [sent-63, score-0.493]

25 Prior Art: The idea that sparse part-based representations form the computational foundations of visual systems has been supported by Olshausen and Field [15, 16] and many other studies [8, 20, 21]. [sent-75, score-0.165]

26 Neurons in the inferotemporal cortex respond to moderately complex features which are invariant to the position of stimulus within the visual field. [sent-76, score-0.109]

27 The resulting features were complex patterns rather than the Gabor-like features obtained by sparse coding. [sent-78, score-0.132]

28 The idea of truly shift-invariant or “convolutional” sparse coding was first proposed by Lewicki and Sejnowski for discrete 1D time-varying signals [12], and later generalized to images by Mørup et al. [sent-80, score-0.437]

29 Zeiler et al.’s work in convolutional sparse coding was motivated by the study of deconvolutional networks [26, 27], which are closely related to the seminal works of LeCun on convolutional networks [9, 10]. [sent-83, score-1.027]

30 Zeiler et al. proposed to solve the objective in Equation (2) through an alternation strategy where one solves a sequence of convex subproblems until convergence. [sent-85, score-0.209]

31 The approach alternates between solving the subproblem d given a fixed z, and the subproblem z given a fixed d. [sent-86, score-0.586]

32 A drawback to this strategy, however, is the computational overhead associated with both subproblems. [sent-87, score-0.048]

33 The introduction of convolution necessitates the use of gradient solvers for each subproblem, whose linear convergence dramatically affects the convergence of the overall algorithm. [sent-88, score-0.415]

34 Zeiler et al. further introduced an auxiliary variable, t, to separate the convolution from the L1 regularization (allowing for an explicit and efficient solution for t using soft thresholding). [sent-89, score-0.197]

35 Instead of enforcing the equality constraint $z = t$ explicitly, the authors add a quadratic term $\tfrac{\mu}{2}\|z - t\|_2^2$ to penalize violations. [sent-90, score-0.108]

36 This quadratic penalty can be reinterpreted as a trust region constraint $\|z - t\|_2^2 \leq \epsilon$. [sent-91, score-0.131]

37 In order to strictly satisfy the equality constraint, $\mu$ must tend to infinity. [sent-94, score-0.066]

38 The optimal value of subproblem d requires the solution of a QCQP (quadratically constrained quadratic program). [sent-97, score-0.293]

39 To avoid this added computational burden, Zeiler normalizes the solution to an unconstrained minimization, and while this tends to work in practice, it is an approximation not guaranteed to converge to the global minimum of the original constrained objective (see Figure 2). [sent-98, score-0.159]

40 Similar to Zeiler’s method is FISTA, from the family of proximal gradient methods [1] . [sent-99, score-0.038]

41 It is a well known iterative method capable of solving L1 regularized least squares problems with quadratic convergence properties. [sent-100, score-0.132]

42 Augmented Lagrange methods, such as the Alternating Direction Methods of Multipliers (ADMMs) we use, have similar quadratic convergence properties under more modest conditions [2, 25] and, through their capacity to compose functions, present fast, scalable and distributed solvers. [sent-103, score-0.17]

43 We argue that an algorithmic speedup can be obtained by applying an ADMM approach to the objective as a whole rather than the L1 subproblem alone. [sent-105, score-0.383]

44 • We demonstrate that the convolution subproblem can be solved efficiently and explicitly in the Fourier domain, outperforming conventional gradient solvers that use spatial convolution. [sent-106, score-0.159]

45 • By incorporating this approach within an ADMM optimization strategy, the inequality constraints on the norm of the dictionary elements can be satisfied exactly by scaling the solution to an isotropic problem through the introduction of an additional auxiliary variable. [sent-107, score-0.167]

46 • We propose a quad-decomposition of the objective into subproblems that are convex and can be solved efficiently and explicitly without the need for gradient or sparse solvers. [sent-108, score-0.132]

47 As a result, we demonstrate an improvement in the computational efficiency of convolutional sparse coding over canonical methods (i.e. [sent-109, score-0.635]

48 Finally, we present a convolutional sparse coding library, which can plug-and-play directly into existing image and audio coding applications: hiltonbristow. [sent-113, score-0.906]

49 Problem Reformulation: Our proposed approach to solving convolutional sparse coding involves the introduction of two auxiliary variables t and s, as well as the posing of the convolutional portion of the objective in the Fourier domain, $\arg\min_{d,s,z,t} \tfrac{1}{2D}\|\hat{x} - \sum_{k=1}^{K} \hat{d}_k \odot \hat{z}_k\|_2^2 + \beta\|t\|_1$ subject to $\|s_k\|_2^2 \leq 1$ for $k = 1 \ldots K$, $s = [\Phi^T \otimes I_K]\hat{d}$, $z = t$ (3) [sent-116, score-1.154]

50 $\Phi$ is a $D \times M$ submatrix of the Fourier matrix $F = [\Phi, \Phi_\perp]$ that corresponds to the small spatial support of the filter, where $M \ll D$. [sent-129, score-0.055]

51 Fourier convolution is not exactly equivalent to spatial convolution, due to boundary effects. [sent-131, score-0.124]

52 Ignoring these for the moment (see Boundary Effects on mitigating these differences), the objective in Equation (3) is equivalent to the original objective in Equation (2). [sent-134, score-0.18]

53 In our proposed Fourier formulation $\hat{d}_k$ is a $D$ dimensional vector like $\hat{x}$ and $\hat{z}_k$, whereas in the original spatial formulation in Equation (2) $d_k \in \mathbb{R}^M$ is of a significantly smaller dimensionality, $M \ll D$. [sent-135, score-0.142]

54 We enforce the smaller spatial support constraint through the auxiliary variable s, which now becomes separable (in terms of variables) from the convolutional component of the objective. [sent-138, score-0.345]

55 As in Zeiler et al.’s original approach, we also separate the L1 penalty term from the convolutional component of the objective using the auxiliary variable t. [sent-140, score-0.524]

56 Augmented Lagrangian: In this paper we propose to handle the introduced equality constraints through an augmented Lagrangian approach [3]. [sent-141, score-0.146]

57 The augmented Lagrangian of our proposed objective can be formed as $\mathcal{L}(d, s, z, t, \lambda_s, \lambda_t) = \tfrac{1}{2D}\|\hat{x} - \sum_{k=1}^{K} \hat{d}_k \odot \hat{z}_k\|_2^2 + \beta\|t\|_1$ [sent-142, score-0.17]

58 $+\, \lambda_s^T(s - [\Phi^T \otimes I_K]\hat{d}) + \lambda_t^T(z - t) + \tfrac{\mu_s}{2}\|s - [\Phi^T \otimes I_K]\hat{d}\|_2^2 + \tfrac{\mu_t}{2}\|z - t\|_2^2$ subject to $\|s_k\|_2^2 \leq 1$ for $k = 1 \ldots K$ (4) [sent-144, score-0.05]

59 where $\lambda_p$ and $\mu_p$ denote the Lagrange multiplier and penalty weighting for the two auxiliary variables $p \in \{s, t\}$ respectively. [sent-148, score-0.288]

60 A full description of ADMMs is outside the scope of this paper (readers are encouraged to inspect [3] for a full treatment and review), but they can be loosely interpreted as applying a Gauss-Seidel optimization strategy to the augmented Lagrangian objective. [sent-151, score-0.128]

61 Such a strategy is advantageous as it often leads to extremely efficient subproblem decompositions. [sent-152, score-0.048]
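
As a roadmap for the derivations that follow, here is a minimal sketch of one Gauss-Seidel sweep over the four subproblems of Equation (4). The solver callables are hypothetical placeholders for Equations (5)-(19) (some are sketched below), and the state layout is an assumption.

```python
def admm_sweep(state, x_hat, beta, mu_s, mu_t,
               solve_z, solve_t, solve_d, solve_s):
    """One Gauss-Seidel sweep of the proposed ADMM (cf. Equation (4))."""
    d, s, z, t, lam_s, lam_t = state
    z = solve_z(d, t, lam_t, x_hat, mu_t)  # Fourier-domain linear systems
    t = solve_t(z, lam_t, beta, mu_t)      # element-wise soft thresholding
    d = solve_d(s, z, lam_s, x_hat, mu_s)  # Fourier-domain linear systems
    s = solve_s(d, lam_s, mu_s)            # crop + unit-ball projection
    lam_t = lam_t + mu_t * (z - t)         # Equation (21)
    lam_s = lam_s + mu_s * (d - s)         # Equation (22)
    return d, s, z, t, lam_s, lam_t
```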

62 We detail each of the subproblems in the following. Subproblem z: $z^* = \arg\min_z \mathcal{L}(z; d, s, t, \lambda_s, \lambda_t)$ (5) $= \mathcal{F}^{-1}\{\arg\min_{\hat{z}} \tfrac{1}{2}\|\hat{x} - \hat{D}\hat{z}\|_2^2 + \hat{\lambda}_t^T(\hat{z} - \hat{t}) + \tfrac{\mu_t}{2}\|\hat{z} - \hat{t}\|_2^2\}$ (6) $= \mathcal{F}^{-1}\{(\hat{D}^T\hat{D} + \mu_t I)^{-1}(\hat{D}^T\hat{x} + \mu_t\hat{t} - \hat{\lambda}_t)\}$ (7) where $\hat{D} = [\mathrm{diag}(\hat{d}_1), \ldots, \mathrm{diag}(\hat{d}_K)]$. [sent-159, score-0.114]

63 Although the size of the system matrix $\hat{D}^T\hat{D}$ is $KD \times KD$, it is sparse banded, and an efficient variable reordering exists (see Figure 3), such that the optimal $z^*$ can be found as the solution to $D$ independent $K \times K$ linear systems. [sent-163, score-0.132]
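
A sketch of this per-frequency solve follows; the (K, D) array layout and the dense K × K solve are illustrative assumptions (the rank-1-plus-μI structure of each system would also admit a cheaper Sherman-Morrison update). The same routine solves subproblem d below, with the roles of the filters and feature maps exchanged.

```python
import numpy as np

def solve_z_fourier(d_hat, x_hat, t_hat, lam_hat, mu):
    """Solve Equation (7) as D independent K x K systems, one per frequency.

    d_hat          : (K, D) DFTs of the (zero-padded) filters.
    x_hat          : (D,)   DFT of the input signal.
    t_hat, lam_hat : (K, D) DFTs of the auxiliary variable and multiplier.
    mu             : penalty weight mu_t.
    """
    K, D = d_hat.shape
    z_hat = np.empty((K, D), dtype=complex)
    for j in range(D):
        a = d_hat[:, j]                   # all K filter coefficients at bin j
        A = np.outer(a.conj(), a) + mu * np.eye(K)   # rank-1 plus mu*I
        b = a.conj() * x_hat[j] + mu * t_hat[:, j] - lam_hat[:, j]
        z_hat[:, j] = np.linalg.solve(A, b)
    return z_hat                          # apply F^{-1} row-wise to recover z
```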

64 Subproblem t: $t^* = \arg\min_t \mathcal{L}(t; d, s, z, \lambda_s, \lambda_t)$ (8) $= \arg\min_t \tfrac{\mu_t}{2}\|z - t\|_2^2 + \lambda_t^T(z - t) + \beta\|t\|_1$ (9) Unlike subproblem z, the solution to t cannot be efficiently computed in the Fourier domain, since the L1 norm is not rotation invariant. [sent-164, score-0.369]

65 Since the objective in Equation (9) does not contain any rotations of the data, each element of [sent-166, score-0.09]

66 $t = [t_1, \ldots, t_D]^T$ can be solved independently, $t^* = \arg\min_t \beta|t| + \lambda(z - t) + \tfrac{\mu}{2}(z - t)^2$ (10) where the optimal solution for each $t$ can be found efficiently using the shrinkage function, $t^* = \mathrm{sgn}(z + \lambda/\mu)\max(|z + \lambda/\mu| - \beta/\mu,\, 0)$ (11). [sent-169, score-0.04]
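
A sketch of the resulting element-wise update, with the soft-thresholding form obtained by minimizing Equation (10) over each scalar; numpy broadcasting applies it to the whole vector t at once.

```python
import numpy as np

def solve_t(z, lam, beta, mu):
    """Shrinkage solution of Equation (9): soft threshold v = z + lam/mu."""
    v = z + lam / mu
    return np.sign(v) * np.maximum(np.abs(v) - beta / mu, 0.0)
```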

67 Subproblem d: $d^* = \arg\min_d \mathcal{L}(d; s, z, t, \lambda_s, \lambda_t)$ (12) $= \mathcal{F}^{-1}\{\arg\min_{\hat{d}} \tfrac{1}{2}\|\hat{x} - \hat{Z}\hat{d}\|_2^2 + \hat{\lambda}_s^T(\hat{d} - \hat{s}) + \tfrac{\mu_s}{2}\|\hat{d} - \hat{s}\|_2^2\}$ (13) [sent-173, score-0.093]

68 In a similar fashion to subproblem z, even though the size of the matrix $\hat{Z}^T\hat{Z}$ is $KD \times KD$, a similar variable reordering exists such that finding the optimal $d^*$ involves the solution to $D$ independent $K \times K$ linear systems. [sent-178, score-0.366]

69 Subproblem s: $s^* = \arg\min_s \mathcal{L}(s; d, z, t, \lambda_s, \lambda_t)$ (15) $= \arg\min_s \tfrac{\mu_s}{2}\|\hat{d} - [\Phi^T \otimes I_K]s\|_2^2 + \hat{\lambda}_s^T(\hat{d} - [\Phi^T \otimes I_K]s)$ subject to $\|s_k\|_2^2 \leq 1$ for $k = 1 \ldots K$ (16) [sent-179, score-0.143]

70 In its general form, solving Equation (16) efficiently is problematic as it is a quadratically constrained quadratic programming (QCQP) problem. [sent-183, score-0.128]

71 Fortunately, due to the Kronecker product with the identity matrix $I_K$, it can be broken down into $K$ independent problems, $s_k^* = \arg\min_{s_k} \tfrac{\mu_s}{2}\|\hat{d}_k - \Phi^T s_k\|_2^2 + \hat{\lambda}_{s_k}^T(\hat{d}_k - \Phi^T s_k)$ subject to $\|s_k\|_2^2 \leq 1$ (17). [sent-184, score-0.05]

72 The solution is $s_k^* = \tilde{s}_k / \|\tilde{s}_k\|_2$ if $\|\tilde{s}_k\|_2 \geq 1$, otherwise $\tilde{s}_k$ (18), where $\tilde{s}_k = (\mu_s \Phi\Phi^T)^{-1}(\Phi\hat{d}_k + \Phi\hat{\lambda}_{s_k})$ (19). [sent-186, score-0.189]

73 As a result one never needs to actually construct the sub-matrix Φ in order to estimate sk . [sent-194, score-0.189]
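
A sketch of this Φ-free update for a single filter: applying $\Phi^T$ reduces to an inverse FFT followed by cropping to the M-sample support. The scale argument absorbing the DFT normalization (e.g. $\sqrt{D}$ for a unitary transform) and the function signature are assumptions of this sketch.

```python
import numpy as np

def solve_s_k(d_hat_k, lam_hat_k, mu, M, scale=1.0):
    """Subproblem s for one filter (Equations (17)-(19)), never building Phi."""
    # Unconstrained isotropic solution: crop(ifft(d_hat + lam_hat/mu)).
    s_tilde = scale * np.real(np.fft.ifft(d_hat_k + lam_hat_k / mu))[:M]
    norm = np.linalg.norm(s_tilde)
    # Satisfy ||s_k||_2^2 <= 1 exactly by rescaling when it is violated.
    return s_tilde / norm if norm >= 1.0 else s_tilde
```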

74 Lagrange Multiplier Update: $\lambda_t^{(i+1)} \leftarrow \lambda_t^{(i)} + \mu_t(z^{(i+1)} - t^{(i+1)})$ (21) and $\lambda_s^{(i+1)} \leftarrow \lambda_s^{(i)} + \mu_s(d^{(i+1)} - s^{(i+1)})$ (22). Penalty Update: Superlinear convergence of the ADMM may be achieved if $\mu^{(i)} \rightarrow \infty$ [18]. [sent-195, score-0.09]
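
A sketch of these updates with a geometric penalty growth schedule; the growth factor tau and the cap mu_max are common ADMM heuristics consistent with $\mu^{(i)} \rightarrow \infty$, not values taken from the paper.

```python
def dual_and_penalty_update(lam_t, lam_s, z, t, d, s, mu_t, mu_s,
                            tau=1.01, mu_max=1e5):
    """Equations (21)-(22) plus an assumed geometric schedule for mu."""
    lam_t = lam_t + mu_t * (z - t)     # Equation (21)
    lam_s = lam_s + mu_s * (d - s)     # Equation (22)
    mu_t = min(tau * mu_t, mu_max)     # drive mu upward for superlinear
    mu_s = min(tau * mu_s, mu_max)     # convergence (cf. [18])
    return lam_t, lam_s, mu_t, mu_s
```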

75 We use their “Fruit” dataset consisting of 10 images, apply local contrast normalization, and select random $50 \times 50$ subimages. [sent-205, score-0.114]

76 Figure 4: Time to convergence as a function of (left) the number of filters learned with a fixed number of input images, and (right) the number of input images with a fixed number of filters. [sent-206, score-0.204]

77 Comparison of objective when learning 64 filters from the experiments of Figure 4. [sent-208, score-0.204]

78 Our objective starts at a much larger value than Zeiler et al.’s. [sent-209, score-0.09]

79 This is due to the added Lagrange multipliers and penalty terms, but the objective quickly converges to a good solution. [sent-210, score-0.199]

80 We perform two experiments, first holding the number of training examples fixed whilst varying the number of filters learned, then vice versa. [sent-212, score-0.114]

81 A representative set of 450 filters learned on a laptop using a collection of whitened natural images. [sent-216, score-0.114]

82 The set contains Gabor-like components as well as more expressive centre-surround and cross-like components, which appear since the spatially-invariant learning strategy produces less spatially shifted versions of the filters. [sent-217, score-0.083]

83 Our method scales significantly better than Zeiler et al.’s method (red line) with respect to the number of filters being learned. [sent-219, score-0.114]

84 This is largely due to two contributing factors: (i) the direct methods we use to solve each subproblem have super-linear convergence properties, whereas the conjugate gradient method employed by Zeiler et al. [sent-220, score-0.432]

85 is limited to linear convergence, and (ii) convolution in the Fourier domain involves a simple Hadamard product whereas convolutions must be explicitly recomputed for each iteration of conjugate gradients. [sent-221, score-0.248]

86 Our method starts at a much larger objective value, due to the additional Lagrange multiplier and penalty terms. [sent-227, score-0.238]

87 The objective quickly decreases however, as these terms vanish and the equality constraints are satisfied. [sent-228, score-0.156]

88 The overall curve of our objective is typical of ADMMs: steep convergence to begin with, followed by flat-lining and minimal convergence beyond that point. [sent-229, score-0.27]

89 Applying sparse coding algorithms to the original Olshausen and Field dataset [15] has become a standard “sanity check” , to ensure that the method is capable of producing Gabor-like oriented edge filters. [sent-230, score-0.363]

90 While our convolutional algorithm produces some Gabor-like responses, it also has a greater expression of non-Gabor filters which are tailored more towards the semantics of the dataset. [sent-234, score-0.386]

91 Figure 1 shows a compelling example of this artifact, with one of the filters clearly synthesizing an “eye” feature from a set of unaligned lions. [sent-235, score-0.164]

92 Conclusions We presented a method for solving convolutional sparse coding problems in a fast manner, through quad-decomposition of the original objective into subproblems that have an efficient parameterization in the Fourier domain. [sent-237, score-0.796]

93 These components working in unison allow us to solve the rotation invariant L1 subproblem for each index independently using soft thresholding, and to transform the quadratically constrained filter update equation into an unconstrained isotropic system. [sent-238, score-0.573]

94 As filter support size increases, the appeal of Fourier convolution becomes more apparent, and where boundary effects are problematic the Fourier transform can be seamlessly replaced by the Discrete Cosine Transform. [sent-239, score-0.213]

95 Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. [sent-260, score-0.092]

96 Properties of basis functions generated by shift invariant sparse representations of natural images. [sent-288, score-0.253]

97 Shift invariant sparse coding of image and music data. [sent-329, score-0.401]

98 Emergence of simplecell receptive field properties by learning a sparse code for natural images. [sent-334, score-0.17]

99 Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. [sent-369, score-0.071]

100 Linear spatial pyramid matching using sparse coding for image classification. [sent-386, score-0.363]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('zeiler', 0.326), ('subproblem', 0.293), ('convolutional', 0.272), ('coding', 0.231), ('zk', 0.207), ('fourier', 0.201), ('sk', 0.189), ('dk', 0.142), ('lagrange', 0.134), ('sparse', 0.132), ('convolution', 0.124), ('filters', 0.114), ('admms', 0.11), ('olshausen', 0.1), ('diag', 0.097), ('signal', 0.092), ('objective', 0.09), ('convergence', 0.09), ('penalty', 0.089), ('admm', 0.081), ('augmented', 0.08), ('signals', 0.074), ('kd', 0.074), ('auxiliary', 0.073), ('alms', 0.073), ('argmtin', 0.073), ('hashimoto', 0.073), ('lewicki', 0.073), ('wersing', 0.073), ('multipliers', 0.073), ('ish', 0.073), ('subproblems', 0.071), ('cortex', 0.071), ('equality', 0.066), ('anders', 0.065), ('eggert', 0.065), ('fista', 0.065), ('equation', 0.064), ('australia', 0.06), ('tsk', 0.06), ('multiplier', 0.059), ('ik', 0.058), ('tt', 0.056), ('filter', 0.055), ('alternating', 0.054), ('monotone', 0.054), ('hilton', 0.052), ('quadratically', 0.052), ('lagrangian', 0.051), ('unaligned', 0.05), ('bue', 0.05), ('msin', 0.05), ('hadamard', 0.05), ('deconvolutional', 0.05), ('subject', 0.05), ('conjugate', 0.049), ('inequality', 0.048), ('strategy', 0.048), ('adelaide', 0.047), ('basis', 0.046), ('operator', 0.046), ('vectorized', 0.044), ('arg', 0.043), ('quadratic', 0.042), ('convolutions', 0.042), ('shrinkage', 0.04), ('del', 0.04), ('audio', 0.04), ('lecun', 0.038), ('invariant', 0.038), ('properties', 0.038), ('simon', 0.038), ('proximal', 0.038), ('converges', 0.037), ('shift', 0.037), ('unconstrained', 0.036), ('dictionary', 0.036), ('bilinear', 0.035), ('shifted', 0.035), ('award', 0.035), ('networks', 0.035), ('isotropic', 0.035), ('solvers', 0.035), ('problematic', 0.034), ('nn', 0.034), ('variables', 0.034), ('domain', 0.033), ('minima', 0.033), ('td', 0.033), ('foundations', 0.033), ('xne', 0.033), ('qcqp', 0.033), ('ofd', 0.033), ('romberg', 0.033), ('yres', 0.033), ('aslmlearll', 0.033), ('mtin', 0.033), ('oppenheim', 0.033), ('ottoe', 0.033), ('vthecet', 0.033)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000007 164 cvpr-2013-Fast Convolutional Sparse Coding

Author: Hilton Bristow, Anders Eriksson, Simon Lucey

Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.

2 0.202177 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

Author: Yi Sun, Xiaogang Wang, Xiaoou Tang

Abstract: We propose a new approach for estimation of the positions of facial keypoints with three-level carefully designed convolutional networks. At each level, the outputs of multiple networks are fused for robust and accurate estimation. Thanks to the deep structures of convolutional networks, global high-level features are extracted over the whole face region at the initialization stage, which help to locate high accuracy keypoints. There are two folds of advantage for this. First, the texture context information over the entire face is utilized to locate each keypoint. Second, since the networks are trained to predict all the keypoints simultaneously, the geometric constraints among keypoints are implicitly encoded. The method therefore can avoid local minimum caused by ambiguity and data corruption in difficult image samples due to occlusions, large pose variations, and extreme lightings. The networks at the following two levels are trained to locally refine initial predictions and their inputs are limited to small regions around the initial predictions. Several network structures critical for accurate and robust facial point detection are investigated. Extensive experiments show that our approach outperforms state-ofthe-art methods in both detection accuracy and reliability1.

3 0.19011118 255 cvpr-2013-Learning Separable Filters

Author: Roberto Rigamonti, Amos Sironi, Vincent Lepetit, Pascal Fua

Abstract: Learning filters to produce sparse image representations in terms of overcomplete dictionaries has emerged as a powerful way to create image features for many different purposes. Unfortunately, these filters are usually both numerous and non-separable, making their use computationally expensive. In this paper, we show that such filters can be computed as linear combinations of a smaller number of separable ones, thus greatly reducing the computational complexity at no cost in terms of performance. This makes filter learning approaches practical even for large images or 3D volumes, and we show that we significantly outperform state-of-theart methods on the linear structure extraction task, in terms ofboth accuracy and speed. Moreover, our approach is general and can be used on generic filter banks to reduce the complexity of the convolutions.

4 0.16835059 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

Author: Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann Lecun

Abstract: Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-theart and competitive results on all major pedestrian datasets with a convolutional network model. The model uses a few new twists, such as multi-stage features, connections that skip layers to integrate global shape information with local distinctive motif information, and an unsupervised method based on convolutional sparse coding to pre-train the filters at each stage.

5 0.14552425 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification

Author: Amirreza Shaban, Hamid R. Rabiee, Mehrdad Farajtabar, Marjan Ghazvininejad

Abstract: Bag of words models for feature extraction have demonstrated top-notch performance in image classification. These representations are usually accompanied by a coding method. Recently, methods that code a descriptor giving regard to its nearby bases have proved efficacious. These methods take into account the nonlinear structure of descriptors, since local similarities are a good approximation of global similarities. However, they confine their usage of the global similarities to nearby bases. In this paper, we propose a coding scheme that brings into focus the manifold structure of descriptors, and devise a method to compute the global similarities of descriptors to the bases. Given a local similarity measure between bases, a global measure is computed. Exploiting the local similarity of a descriptor and its nearby bases, a global measure of association of a descriptor to all the bases is computed. Unlike the locality-based and sparse coding methods, the proposed coding varies smoothly with respect to the underlying manifold. Experiments on benchmark image classification datasets substantiate the superiority oftheproposed method over its locality and sparsity based rivals.

6 0.11740869 419 cvpr-2013-Subspace Interpolation via Dictionary Learning for Unsupervised Domain Adaptation

7 0.10807315 53 cvpr-2013-BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification

8 0.10372517 388 cvpr-2013-Semi-supervised Learning of Feature Hierarchies for Object Detection in a Video

9 0.10207013 257 cvpr-2013-Learning Structured Low-Rank Representations for Image Classification

10 0.096097209 23 cvpr-2013-A Practical Rank-Constrained Eight-Point Algorithm for Fundamental Matrix Estimation

11 0.095820226 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation

12 0.095731243 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition

13 0.094620183 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning

14 0.093567133 66 cvpr-2013-Block and Group Regularized Sparse Modeling for Dictionary Learning

15 0.093057536 95 cvpr-2013-Continuous Inference in Graphical Models with Polynomial Energies

16 0.092599943 304 cvpr-2013-Multipath Sparse Coding Using Hierarchical Matching Pursuit

17 0.089568548 163 cvpr-2013-Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

18 0.086866915 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition

19 0.084510237 392 cvpr-2013-Separable Dictionary Learning

20 0.081759259 296 cvpr-2013-Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.177), (1, -0.021), (2, -0.101), (3, 0.125), (4, 0.001), (5, 0.0), (6, 0.022), (7, -0.031), (8, -0.055), (9, -0.024), (10, -0.005), (11, -0.025), (12, -0.015), (13, -0.076), (14, 0.02), (15, 0.071), (16, -0.035), (17, 0.036), (18, 0.142), (19, 0.045), (20, -0.047), (21, -0.033), (22, -0.023), (23, -0.087), (24, 0.01), (25, 0.141), (26, -0.04), (27, -0.024), (28, 0.001), (29, -0.064), (30, -0.015), (31, -0.044), (32, -0.036), (33, -0.098), (34, -0.117), (35, 0.083), (36, 0.011), (37, 0.099), (38, 0.094), (39, 0.016), (40, -0.082), (41, -0.068), (42, 0.021), (43, -0.021), (44, -0.102), (45, -0.05), (46, 0.073), (47, -0.034), (48, -0.019), (49, 0.038)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95823985 164 cvpr-2013-Fast Convolutional Sparse Coding

Author: Hilton Bristow, Anders Eriksson, Simon Lucey

Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.

2 0.81327093 255 cvpr-2013-Learning Separable Filters

Author: Roberto Rigamonti, Amos Sironi, Vincent Lepetit, Pascal Fua

Abstract: Learning filters to produce sparse image representations in terms of overcomplete dictionaries has emerged as a powerful way to create image features for many different purposes. Unfortunately, these filters are usually both numerous and non-separable, making their use computationally expensive. In this paper, we show that such filters can be computed as linear combinations of a smaller number of separable ones, thus greatly reducing the computational complexity at no cost in terms of performance. This makes filter learning approaches practical even for large images or 3D volumes, and we show that we significantly outperform state-of-theart methods on the linear structure extraction task, in terms ofboth accuracy and speed. Moreover, our approach is general and can be used on generic filter banks to reduce the complexity of the convolutions.

3 0.71354097 369 cvpr-2013-Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination

Author: Laurent Sifre, Stéphane Mallat

Abstract: An affine invariant representation is constructed with a cascade of invariants, which preserves information for classification. A joint translation and rotation invariant representation of image patches is calculated with a scattering transform. It is implemented with a deep convolution network, which computes successive wavelet transforms and modulus non-linearities. Invariants to scaling, shearing and small deformations are calculated with linear operators in the scattering domain. State-of-the-art classification results are obtained over texture databases with uncontrolled viewing conditions.

4 0.7092616 346 cvpr-2013-Real-Time No-Reference Image Quality Assessment Based on Filter Learning

Author: Peng Ye, Jayant Kumar, Le Kang, David Doermann

Abstract: This paper addresses the problem of general-purpose No-Reference Image Quality Assessment (NR-IQA) with the goal ofdeveloping a real-time, cross-domain model that can predict the quality of distorted images without prior knowledge of non-distorted reference images and types of distortions present in these images. The contributions of our work are two-fold: first, the proposed method is highly efficient. NR-IQA measures are often used in real-time imaging or communication systems, therefore it is important to have a fast NR-IQA algorithm that can be used in these real-time applications. Second, the proposed method has the potential to be used in multiple image domains. Previous work on NR-IQA focus primarily on predicting quality of natural scene image with respect to human perception, yet, in other image domains, the final receiver of a digital image may not be a human. The proposed method consists of the following components: (1) a local feature extractor; (2) a global feature extractor and (3) a regression model. While previous approaches usually treat local feature extraction and regres- sion model training independently, we propose a supervised method based on back-projection, which links the two steps by learning a compact set of filters which can be applied to local image patches to obtain discriminative local features. Using a small set of filters, the proposed method is extremely fast. We have tested this method on various natural scene and document image datasets and obtained stateof-the-art results.

5 0.64730388 304 cvpr-2013-Multipath Sparse Coding Using Hierarchical Matching Pursuit

Author: Liefeng Bo, Xiaofeng Ren, Dieter Fox

Abstract: Complex real-world signals, such as images, contain discriminative structures that differ in many aspects including scale, invariance, and data channel. While progress in deep learning shows the importance of learning features through multiple layers, it is equally important to learn features through multiple paths. We propose Multipath Hierarchical Matching Pursuit (M-HMP), a novel feature learning architecture that combines a collection of hierarchical sparse features for image classification to capture multiple aspects of discriminative structures. Our building blocks are MI-KSVD, a codebook learning algorithm that balances the reconstruction error and the mutual incoherence of the codebook, and batch orthogonal matching pursuit (OMP); we apply them recursively at varying layers and scales. The result is a highly discriminative image representation that leads to large improvements to the state-of-the-art on many standard benchmarks, e.g., Caltech-101, Caltech-256, MITScenes, Oxford-IIIT Pet and Caltech-UCSD Bird-200.

6 0.62812847 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

7 0.62783837 442 cvpr-2013-Transfer Sparse Coding for Robust Image Representation

8 0.61558437 421 cvpr-2013-Supervised Kernel Descriptors for Visual Recognition

9 0.60649604 178 cvpr-2013-From Local Similarity to Global Coding: An Application to Image Classification

10 0.59953195 328 cvpr-2013-Pedestrian Detection with Unsupervised Multi-stage Feature Learning

11 0.59763032 403 cvpr-2013-Sparse Output Coding for Large-Scale Visual Recognition

12 0.5890556 83 cvpr-2013-Classification of Tumor Histology via Morphometric Context

13 0.57020748 371 cvpr-2013-SCaLE: Supervised and Cascaded Laplacian Eigenmaps for Visual Object Recognition Based on Nearest Neighbors

14 0.55632043 53 cvpr-2013-BFO Meets HOG: Feature Extraction Based on Histograms of Oriented p.d.f. Gradients for Image Classification

15 0.54514211 251 cvpr-2013-Learning Discriminative Illumination and Filters for Raw Material Classification with Optimal Projections of Bidirectional Texture Functions

16 0.54234618 210 cvpr-2013-Illumination Estimation Based on Bilayer Sparse Coding

17 0.53727204 41 cvpr-2013-An Iterated L1 Algorithm for Non-smooth Non-convex Optimization in Computer Vision

18 0.52541 427 cvpr-2013-Texture Enhanced Image Denoising via Gradient Histogram Preservation

19 0.5171988 129 cvpr-2013-Discriminative Brain Effective Connectivity Analysis for Alzheimer's Disease: A Kernel Learning Approach upon Sparse Gaussian Bayesian Network

20 0.50819218 9 cvpr-2013-A Fast Semidefinite Approach to Solving Binary Quadratic Problems


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(10, 0.142), (16, 0.028), (26, 0.032), (27, 0.247), (28, 0.016), (33, 0.255), (57, 0.022), (67, 0.054), (69, 0.065), (80, 0.011), (87, 0.049)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.84628445 324 cvpr-2013-Part-Based Visual Tracking with Online Latent Structural Learning

Author: Rui Yao, Qinfeng Shi, Chunhua Shen, Yanning Zhang, Anton van_den_Hengel

Abstract: Despite many advances made in the area, deformable targets and partial occlusions continue to represent key problems in visual tracking. Structured learning has shown good results when applied to tracking whole targets, but applying this approach to a part-based target model is complicated by the need to model the relationships between parts, and to avoid lengthy initialisation processes. We thus propose a method which models the unknown parts using latent variables. In doing so we extend the online algorithm pegasos to the structured prediction case (i.e., predicting the location of the bounding boxes) with latent part variables. To better estimate the parts, and to avoid over-fitting caused by the extra model complexity/capacity introduced by theparts, wepropose a two-stage trainingprocess, based on the primal rather than the dual form. We then show that the method outperforms the state-of-the-art (linear and non-linear kernel) trackers.

same-paper 2 0.83195901 164 cvpr-2013-Fast Convolutional Sparse Coding

Author: Hilton Bristow, Anders Eriksson, Simon Lucey

Abstract: Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements (i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal subproblems and super-linear convergence.

3 0.81568617 451 cvpr-2013-Unsupervised Salience Learning for Person Re-identification

Author: Rui Zhao, Wanli Ouyang, Xiaogang Wang

Abstract: Human eyes can recognize person identities based on some small salient regions. However, such valuable salient information is often hidden when computing similarities of images with existing approaches. Moreover, many existing approaches learn discriminative features and handle drastic viewpoint change in a supervised way and require labeling new training data for a different pair of camera views. In this paper, we propose a novel perspective for person re-identification based on unsupervised salience learning. Distinctive features are extracted without requiring identity labels in the training procedure. First, we apply adjacency constrained patch matching to build dense correspondence between image pairs, which shows effectiveness in handling misalignment caused by large viewpoint and pose variations. Second, we learn human salience in an unsupervised manner. To improve the performance of person re-identification, human salience is incorporated in patch matching to find reliable and discriminative matched patches. The effectiveness of our approach is validated on the widely used VIPeR dataset and ETHZ dataset.

4 0.79636258 145 cvpr-2013-Efficient Object Detection and Segmentation for Fine-Grained Recognition

Author: Anelia Angelova, Shenghuo Zhu

Abstract: We propose a detection and segmentation algorithm for the purposes of fine-grained recognition. The algorithm first detects low-level regions that could potentially belong to the object and then performs a full-object segmentation through propagation. Apart from segmenting the object, we can also ‘zoom in ’ on the object, i.e. center it, normalize it for scale, and thus discount the effects of the background. We then show that combining this with a state-of-the-art classification algorithm leads to significant improvements in performance especially for datasets which are considered particularly hard for recognition, e.g. birds species. The proposed algorithm is much more efficient than other known methods in similar scenarios [4, 21]. Our method is also simpler and we apply it here to different classes of objects, e.g. birds, flowers, cats and dogs. We tested the algorithm on a number of benchmark datasets for fine-grained categorization. It outperforms all the known state-of-the-art methods on these datasets, sometimes by as much as 11%. It improves the performance of our baseline algorithm by 3-4%, consistently on all datasets. We also observed more than a 4% improvement in the recognition performance on a challenging largescale flower dataset, containing 578 species of flowers and 250,000 images.

5 0.77153552 248 cvpr-2013-Learning Collections of Part Models for Object Recognition

Author: Ian Endres, Kevin J. Shih, Johnston Jiaa, Derek Hoiem

Abstract: We propose a method to learn a diverse collection of discriminative parts from object bounding box annotations. Part detectors can be trained and applied individually, which simplifies learning and extension to new features or categories. We apply the parts to object category detection, pooling part detections within bottom-up proposed regions and using a boosted classifier with proposed sigmoid weak learners for scoring. On PASCAL VOC 2010, we evaluate the part detectors ’ ability to discriminate and localize annotated keypoints. Our detection system is competitive with the best-existing systems, outperforming other HOG-based detectors on the more deformable categories.

6 0.77055609 414 cvpr-2013-Structure Preserving Object Tracking

7 0.77042216 450 cvpr-2013-Unsupervised Joint Object Discovery and Segmentation in Internet Images

8 0.76879555 285 cvpr-2013-Minimum Uncertainty Gap for Robust Visual Tracking

9 0.76860362 325 cvpr-2013-Part Discovery from Partial Correspondence

10 0.76794982 225 cvpr-2013-Integrating Grammar and Segmentation for Human Pose Estimation

11 0.7678653 70 cvpr-2013-Bottom-Up Segmentation for Top-Down Detection

12 0.7675513 408 cvpr-2013-Spatiotemporal Deformable Part Models for Action Detection

13 0.76734066 445 cvpr-2013-Understanding Bayesian Rooms Using Composite 3D Object Models

14 0.76692641 446 cvpr-2013-Understanding Indoor Scenes Using 3D Geometric Phrases

15 0.76662219 457 cvpr-2013-Visual Tracking via Locality Sensitive Histograms

16 0.76628208 314 cvpr-2013-Online Object Tracking: A Benchmark

17 0.76588738 323 cvpr-2013-POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation

18 0.76575798 104 cvpr-2013-Deep Convolutional Network Cascade for Facial Point Detection

19 0.76555121 360 cvpr-2013-Robust Estimation of Nonrigid Transformation for Point Set Registration

20 0.76513916 221 cvpr-2013-Incorporating Structural Alternatives and Sharing into Hierarchy for Multiclass Object Recognition and Detection