nips nips2010 nips2010-109 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Pierre Garrigues, Bruno A. Olshausen
Abstract: We propose a class of sparse coding models that utilizes a Laplacian Scale Mixture (LSM) prior to model dependencies among coefficients. Each coefficient is modeled as a Laplacian distribution with a variable scale parameter, with a Gamma distribution prior over the scale parameter. We show that, due to the conjugacy of the Gamma prior, it is possible to derive efficient inference procedures for both the coefficients and the scale parameter. When the scale parameters of a group of coefficients are combined into a single variable, it is possible to describe the dependencies that occur due to common amplitude fluctuations among coefficients, which have been shown to constitute a large fraction of the redundancy in natural images [1]. We show that, as a consequence of this group sparse coding, the resulting inference of the coefficients follows a divisive normalization rule, and that this may be efficiently implemented in a network architecture similar to that which has been proposed to occur in primary visual cortex. We also demonstrate improvements in image coding and compressive sensing recovery using the LSM model. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose a class of sparse coding models that utilizes a Laplacian Scale Mixture (LSM) prior to model dependencies among coefficients. [sent-7, score-0.439]
2 Each coefficient is modeled as a Laplacian distribution with a variable scale parameter, with a Gamma distribution prior over the scale parameter. [sent-8, score-0.263]
3 We show that, due to the conjugacy of the Gamma prior, it is possible to derive efficient inference procedures for both the coefficients and the scale parameter. [sent-9, score-0.203]
4 When the scale parameters of a group of coefficients are combined into a single variable, it is possible to describe the dependencies that occur due to common amplitude fluctuations among coefficients, which have been shown to constitute a large fraction of the redundancy in natural images [1]. [sent-10, score-0.372]
5 We show that, as a consequence of this group sparse coding, the resulting inference of the coefficients follows a divisive normalization rule, and that this may be efficiently implemented in a network architecture similar to that which has been proposed to occur in primary visual cortex. [sent-11, score-0.491]
6 We also demonstrate improvements in image coding and compressive sensing recovery using the LSM model. [sent-12, score-0.516]
7 1 Introduction The concept of sparsity is widely used in the signal processing, machine learning and statistics communities for model fitting and solving inverse problems. [sent-13, score-0.19]
8 This approach has been applied to problems such as image coding, compressive sensing [4], or classification [5]. [sent-16, score-0.286]
9 Hence, the signal model assumed by BPDN (Basis Pursuit Denoising) is linear and generative, and the basis function coefficients are assumed to be independent. [sent-20, score-0.191]
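To make the BPDN baseline concrete, here is a minimal sketch of ℓ1 inference for a linear generative model, written in Python with NumPy. The function names (`soft_threshold`, `bpdn_ista`), the choice of ISTA as the solver, and the trade-off parameter `gamma` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def bpdn_ista(x, Phi, gamma, n_iter=200):
    """Minimize 0.5*||x - Phi s||_2^2 + gamma*||s||_1 by ISTA.

    This is the independent-Laplacian (BPDN) baseline that the LSM model
    generalizes; the step size is 1/L with L the squared spectral norm of Phi.
    """
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the quadratic term's gradient
    s = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ s - x)         # gradient of 0.5*||x - Phi s||^2
        s = soft_threshold(s - grad / L, gamma / L)
    return s
```

Under the independent Laplacian prior, `gamma` plays the role of a single fixed inverse scale shared by all coefficients; the LSM model described below replaces this constant with coefficient-specific, data-dependent weights.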
10 It has also been observed in the context of generative image models that the inferred sparse coefficients exhibit pronounced statistical dependencies [15, 16], and therefore the independence assumption is violated. [sent-22, score-0.358]
11 Block-ℓ1 methods have been proposed to account for dependencies among the coefficients by dividing them into subspaces, such that dependencies are allowed within subspaces but not across them [17]. [sent-23, score-0.311]
12 The coefficient prior is thus a mixture of Laplacian distributions, which we denote a “Laplacian Scale Mixture” (LSM) in analogy to the Gaussian scale mixture (GSM) [12]. [sent-28, score-0.29]
13 Higher-order dependencies of feedforward responses of wavelet coefficients [12] or basis functions learned using independent component analysis [14] have been captured using GSMs, and we extend this approach to a generative sparse coding model using LSMs. [sent-29, score-0.622]
14 We define the Laplacian scale mixture in Section 2, and we describe the inference algorithms in the resulting sparse coding models with an LSM prior on the coefficients in Section 3. [sent-30, score-0.536]
15 We present an example of a factorial LSM model in Section 4, and of a non-factorial LSM model in Section 5 that is particularly well suited to signals having the “group sparsity” property. [sent-31, score-0.198]
16 We show that the nonfactorial LSM results in a divisive normalization rule for inferring the coefficients. [sent-32, score-0.256]
17 When the groups are organized topographically and the basis is trained on natural images, the resulting model resembles the neighborhood divisive normalization that has been hypothesized to occur in visual cortex. [sent-33, score-0.564]
18 We also demonstrate that the proposed LSM inference algorithm provides superior performance in image coding and compressive sensing recovery. [sent-34, score-0.484]
19 2 The Laplacian Scale Mixture distribution A random variable si is a Laplacian scale mixture if it can be written $s_i = \lambda_i^{-1} u_i$, where ui has a Laplacian distribution with scale 1, i.e. $p(u_i) = \frac{1}{2} e^{-|u_i|}$. [sent-35, score-0.884]
20 Conditioned on the parameter λi, the coefficient si has a Laplacian distribution with inverse scale λi, i.e. $p(s_i \mid \lambda_i) = \frac{\lambda_i}{2} e^{-\lambda_i |s_i|}$. [sent-39, score-0.412]
21 The distribution over si is therefore a continuous mixture of Laplacian distributions with different inverse scales, and it can be computed by integrating out λi: $p(s_i) = \int_0^\infty p(s_i \mid \lambda_i)\, p(\lambda_i)\, d\lambda_i = \int_0^\infty \frac{\lambda_i}{2} e^{-\lambda_i |s_i|}\, p(\lambda_i)\, d\lambda_i$. [sent-42, score-0.421]
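As a quick illustration of the definition, the sketch below draws samples from an LSM by first drawing the inverse scale and then a unit Laplacian. The Gamma hyperparameters and the moment check are illustrative; the identity E|si| = β/(α − 1), valid for α > 1, follows from the mean of the inverse of a Gamma variable.

```python
import numpy as np

def sample_lsm(alpha, beta, size, rng=None):
    """Draw samples from a Laplacian scale mixture: lambda ~ Gamma(shape=alpha, rate=beta),
    u ~ Laplacian with scale 1, and s = u / lambda."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=size)  # NumPy's Gamma uses scale = 1/rate
    u = rng.laplace(loc=0.0, scale=1.0, size=size)             # unit-scale Laplacian
    return u / lam

samples = sample_lsm(alpha=2.5, beta=1.0, size=200000)
# E|s| = beta / (alpha - 1) for alpha > 1, i.e. about 0.667 for these hyperparameters
print("empirical E|s|:", np.mean(np.abs(samples)))
```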
22 3 Inference in a sparse coding model with LSM prior We propose the linear generative model $x = \Phi s + \nu = \sum_{i=1}^{m} s_i \varphi_i + \nu$ (2), where $x \in \mathbb{R}^n$. [sent-46, score-0.687]
23 $\Phi = [\varphi_1, \ldots, \varphi_m] \in \mathbb{R}^{n \times m}$ is an overcomplete transform or basis set, and the columns $\varphi_i$ are its basis functions. [sent-49, score-0.244]
24 2 Given a signal x, we wish to infer its sparse representation s in the dictionary Φ. [sent-53, score-0.263]
25 It is tractable if the prior on λ is factorial and each λi has a Gamma distribution, since the Laplacian and Gamma distributions are conjugate. [sent-73, score-0.29]
26 4 Sparse coding with a factorial LSM prior We propose in this section a sparse coding model where the distribution of the multipliers is factorial, and each multiplier has a Gamma distribution, i.e. $p(\lambda_i) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \lambda_i^{\alpha-1} e^{-\beta \lambda_i}$. [sent-76, score-0.782]
27 With this particular choice of a prior on the multiplier, we can compute the probability distribution of si analytically: $p(s_i) = \frac{\alpha \beta^{\alpha}}{2\,(\beta + |s_i|)^{\alpha+1}}$. [sent-79, score-0.384]
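The closed form above can be checked by numerically integrating the Laplacian against the Gamma density; the sketch below is only a sanity check, and the integration grid and its limits are arbitrary choices.

```python
import numpy as np
from math import gamma as gamma_fn

def lsm_marginal_closed_form(s, alpha, beta):
    """Closed-form LSM marginal with a Gamma(alpha, beta) prior on the inverse scale."""
    return alpha * beta**alpha / (2.0 * (beta + abs(s)) ** (alpha + 1))

def lsm_marginal_numeric(s, alpha, beta, lam_max=200.0, n=200000):
    """Numerically integrate p(s) = int_0^inf (lam/2) exp(-lam*|s|) Gamma(lam; alpha, beta) dlam."""
    lam = np.linspace(1e-8, lam_max, n)
    gamma_pdf = beta**alpha / gamma_fn(alpha) * lam ** (alpha - 1) * np.exp(-beta * lam)
    integrand = 0.5 * lam * np.exp(-lam * abs(s)) * gamma_pdf
    return np.sum(integrand) * (lam[1] - lam[0])   # simple Riemann sum on the grid

for s in (0.0, 0.5, 2.0):
    print(s, lsm_marginal_closed_form(s, 3.0, 1.0), lsm_marginal_numeric(s, 3.0, 1.0))
```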
28 By conjugacy, the posterior probability of λi given si is also a Gamma distribution when the prior over λi is a Gamma distribution and the conditional probability of si given λi is a Laplace distribution with inverse scale λi. [sent-85, score-0.823]
29 Hence, the posterior of λi given si is a Gamma distribution with parameters α + 1 and β + |si |. [sent-86, score-0.301]
30 In the factorial model we have $p(\lambda \mid s^{(t)}) = \prod_i p(\lambda_i \mid s_i^{(t)})$. [sent-89, score-0.427]
31 We saw that with the Gamma prior we can compute the distribution of si analytically, and therefore we can compute the gradient of log p(s | x) with respect to s. [sent-94, score-0.426]
32 Whereas they derive this rule from mathematical intuitions regarding the ℓ1 ball, we show that this update rule follows from Bayesian inference assuming a Gamma prior over λ. [sent-103, score-0.258]
33 It has also been shown that evidence maximization in a sparse coding model with an automatic relevance determination prior can be solved via a sequence of reweighted ℓ1 optimization problems [22]. [sent-104, score-0.399]
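A minimal sketch of this reweighted-ℓ1 view of inference in the factorial LSM model is given below: each outer iteration solves a weighted BPDN problem and then resets each weight to the posterior mean of its multiplier, E[λi | si] = (α + 1)/(β + |si|), consistent with the Gamma(α + 1, β + |si|) posterior above. The inner ISTA solver, the iteration counts, and the initialization of the weights at the prior mean α/β are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def weighted_ista(x, Phi, w, n_iter=200):
    """Solve min_s 0.5*||x - Phi s||^2 + sum_i w_i |s_i| (weighted BPDN) by ISTA."""
    L = np.linalg.norm(Phi, 2) ** 2
    s = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ s - x)
        z = s - grad / L
        s = np.sign(z) * np.maximum(np.abs(z) - w / L, 0.0)   # per-coefficient soft threshold
    return s

def rwbp_factorial(x, Phi, alpha, beta, n_outer=5):
    """Reweighted BP for the factorial LSM prior: alternate a weighted l1 solve with
    the multiplier update E[lambda_i | s_i] = (alpha + 1) / (beta + |s_i|)."""
    m = Phi.shape[1]
    w = np.full(m, alpha / beta)      # start from the prior mean of lambda_i (an assumption)
    for _ in range(n_outer):
        s = weighted_ista(x, Phi, w)
        w = (alpha + 1.0) / (beta + np.abs(s))
    return s
```

The inner solver is interchangeable with any standard BPDN/ℓ1 routine, which is the practical appeal of this EM-style scheme.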
34 4.2 Application to image coding It has been shown that the convex relaxation consisting of replacing the ℓ0 norm with the ℓ1 norm is able to identify the sparsest solution under some conditions on the dictionary of basis functions [23]. [sent-106, score-0.435]
35 For instance, it was observed in [16] that it is possible to infer sparser representations with a prior over the coefficients that is a mixture of a delta function at zero and a Gaussian distribution than with the Laplacian prior. [sent-108, score-0.243]
36 We show that our proposed inference algorithm also leads to representations that are more sparse, as the LSM prior with Gamma hyperprior has heavier tails than the Laplacian distribution. [sent-109, score-0.256]
37 We selected 1000 16 × 16 image patches at random, and computed their sparse representations in a dictionary with 256 basis functions using both the conventional Laplacian prior and our LSM prior. [sent-110, score-0.532]
38 The dictionary is learned from the statistics of natural images [24] using a Laplacian prior over the coefficients. [sent-111, score-0.285]
39 We can see in Figure 2 that the representations using the LSM prior are indeed more sparse by approximately a factor of 2. [sent-115, score-0.239]
40 Note that the computational complexity to compute these sparse representations is much lower than that of [16]. [sent-116, score-0.156]
41 [Figure 2 residue: histogram panel titled “Sparsity of the representation” with legend “LSM prior”; graphical-model nodes λ1 … λm, s1 … sm, x1 … xn and weights φij.] Figure 1: Graphical model representation of our proposed generative model where the multipliers distribution is factorial. [sent-117, score-0.259]
42 We show here that the LSM prior can be used to capture this group structure in natural images, and we propose an efficient inference algorithm for this case. [sent-124, score-0.284]
43 5.1 Group sparsity We consider a dictionary Φ such that the basis functions can be divided into a set of disjoint groups or neighborhoods indexed by Nk, i.e. the index set {1, . . . , m} is partitioned into the disjoint neighborhoods Nk. [sent-126, score-0.463]
44 A signal having the group sparsity property is such that the sparse coefficients occur in groups, i.e. the coefficients within a group tend to be either all active or all inactive. [sent-132, score-0.376]
45 The group sparsity structure can be captured with the LSM prior by having all the coefficients in a group share the same inverse scale parameter, i.e. λi = λ(k) for all i ∈ Nk. [sent-135, score-0.446]
46 This addresses the case where dependencies are allowed within groups but not across groups, as in the block-ℓ1 method [17]. [sent-139, score-0.18]
47 Note that for some types of dictionaries it is more natural to consider overlapping groups to avoid blocking artifacts. [sent-140, score-0.191]
48 [Figure 3 residue: multiplier nodes λ(k), λ(l) over coefficient nodes si−2 … si+3.] Figure 3: The two groups N(k) = {i − 2, i − 1, i} and N(l) = {i + 1, i + 2, i + 3} are nonoverlapping. [sent-143, score-0.338]
49 [Figure 4 residue: multiplier nodes λi−1 … λi+2 over coefficient nodes si−1 … si+3.] Figure 4: The basis function coefficients in the neighborhood defined by N(i) = {i−1, i, i+1} share the same multiplier λi. [sent-145, score-0.536]
50 Using the structure of the dependencies in the probabilistic model shown in Figure 3, we have $p(\lambda_i \mid s^{(t)}) = p(\lambda_{(k)} \mid s^{(t)}_{N_k})$ (11), where the index i is in the group Nk, and $s_{N_k} = (s_j)_{j \in N_k}$ is the vector containing all the coefficients in the group. [sent-149, score-0.193]
51 Using the conjugacy of the Laplacian and Gamma distributions, the distribution of λ(k) given all the coefficients in the neighborhood is a Gamma distribution with parameters $\alpha + |N_k|$ and $\beta + \sum_{j \in N_k} |s_j|$, where |Nk| denotes the size of the neighborhood. [sent-150, score-0.196]
52 $\lambda_{(k)}^{(t+1)} = \frac{\alpha + |N_k|}{\beta + \sum_{j \in N_k} |s_j^{(t)}|}$ (12). The resulting update rule is a form of divisive normalization. [sent-152, score-0.213]
53 We saw in Section 2 that we can write $s_k = \lambda_{(k)}^{-1} u_k$, where uk is a Laplacian random variable with scale 1, and thus after convergence we have $u_k^{(\infty)} = (\alpha + |N_k|)\, s_k^{(\infty)} / (\beta + \sum_{j \in N_k} |s_j^{(\infty)}|)$; divisive normalization operations of this form are thought to play an important role in the visual system. [sent-153, score-0.204]
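In code, the non-overlapping group update (12) simply assigns one shared weight per group. The sketch below assumes the groups are supplied as a list of index arrays, and the toy example uses a coefficient count that is a multiple of the group size.

```python
import numpy as np

def group_weight_update(s, groups, alpha, beta):
    """Divisive-normalization weight update (12): every coefficient in group N_k
    receives the shared weight (alpha + |N_k|) / (beta + sum_{j in N_k} |s_j|)."""
    w = np.empty_like(s)
    for Nk in groups:                 # groups: list of integer index arrays
        Nk = np.asarray(Nk)
        w[Nk] = (alpha + len(Nk)) / (beta + np.sum(np.abs(s[Nk])))
    return w

# toy example: 48 coefficients partitioned into disjoint triplets
groups = [np.arange(i, i + 3) for i in range(0, 48, 3)]
w = group_weight_update(np.random.default_rng(0).normal(size=48), groups, alpha=2.0, beta=0.5)
```

Inside the reweighted loop, these weights replace the per-coefficient weights of the factorial sketch given earlier.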
54 Let N (i) denote the indices of the neighborhood that is centered around si (see Figure 4 for an example). [sent-156, score-0.329]
55 A coefficient is indeed a member of many neighborhoods as shown in Figure 4, and the structure of the dependencies implies p(λi | s) = p(λi | sN (i) ). [sent-160, score-0.18]
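For overlapping one-dimensional neighborhoods centered on each coefficient, the same update can be written with a convolution. The truncated handling of the boundaries below is an assumption, since the text does not specify it.

```python
import numpy as np

def overlapping_weight_update(s, radius, alpha, beta):
    """Per-coefficient weight when each coefficient sits at the centre of its own
    neighborhood N(i) = {i - radius, ..., i + radius} (overlapping groups, 1-D case).
    Boundary neighborhoods are simply truncated (an assumption)."""
    kernel = np.ones(2 * radius + 1)
    abs_sum = np.convolve(np.abs(s), kernel, mode="same")      # sum of |s_j| over each neighborhood
    sizes = np.convolve(np.ones_like(s), kernel, mode="same")  # |N(i)|, smaller at the boundaries
    return (alpha + sizes) / (beta + abs_sum)
```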
56 The authors show improved coding efficiency in the context of natural images. [sent-164, score-0.191]
57 5.3 Compressive sensing recovery In compressed sensing, we observe a number n of random projections of a signal s0 ∈ Rm, and it is in principle impossible to recover s0 if n < m. [sent-166, score-0.294]
58 If the signal has structure beyond sparsity, one can in principle recover the signal with even fewer measurements using an algorithm that exploits this structure [19, 29]. [sent-171, score-0.188]
59 We denote by RWBP the algorithm with the factorial update (9), and RW3BP (resp. [sent-173, score-0.195]
60 RW5 BP) the algorithm with our proposed divisive normalization update (13) with group size 3 (resp. [sent-174, score-0.299]
61 We consider 50-dimensional signals that are sparse in the canonical basis and where the neighborhood size is 3. [sent-176, score-0.342]
62 To sample such a signal s ∈ R50 , we draw a number d of “centroids” i, and we sample three values for si−1 , si and si+1 using a normal distribution of variance 1. [sent-177, score-0.37]
63 A compressive sensing recovery problem is parameterized by (m, n, d). [sent-179, score-0.324]
64 We fix m = 50 and parameterize the phase plots using the indeterminacy of the system indexed by δ = n/m, and the approximate sparsity of the system indexed by ρ = 3d/m. [sent-181, score-0.25]
65 For a given value (δ, ρ) on the grid, we sample 10 sparse signals using the corresponding (m, n, d) parameters. [sent-185, score-0.165]
66 The underlying sparse signal is recovered using the three algorithms and we average the recovery error $\|\hat{s} - s_0\|_2 / \|s_0\|_2$ for each of them. [sent-186, score-0.274]
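The synthetic experiment can be reproduced in outline as follows. The Gaussian measurement matrix and its scaling, and the restriction of centroids away from the borders, are illustrative assumptions (the text does not specify them); `rwbp_factorial` refers to the factorial sketch given earlier, with illustrative hyperparameters.

```python
import numpy as np

def sample_group_sparse_signal(m=50, d=3, rng=None):
    """Synthetic signal from the experiment: draw d 'centroids' i and set
    (s_{i-1}, s_i, s_{i+1}) to independent N(0, 1) values; everything else is 0.
    Centroids are drawn away from the borders here, an assumption for simplicity."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.zeros(m)
    centroids = rng.choice(np.arange(1, m - 1), size=d, replace=False)
    for i in centroids:
        s[i - 1:i + 2] = rng.normal(0.0, 1.0, size=3)
    return s

def recovery_error(s_hat, s0):
    """Normalized recovery error ||s_hat - s0||_2 / ||s0||_2 used to score each algorithm."""
    return np.linalg.norm(s_hat - s0) / np.linalg.norm(s0)

rng = np.random.default_rng(0)
s0 = sample_group_sparse_signal(m=50, d=3, rng=rng)
n = 25                                        # number of random projections, delta = n/m = 0.5
Phi = rng.normal(size=(n, 50)) / np.sqrt(n)   # random measurement matrix (one common choice)
x = Phi @ s0
# s_hat = rwbp_factorial(x, Phi, alpha=2.0, beta=0.5)   # hyperparameters illustrative
# print(recovery_error(s_hat, s0))
```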
67 Figure 5: Compressive sensing recovery results using synthetic data. [sent-249, score-0.19]
68 Shown are the phase plots for a sequence of BP problems with the factorial update (RWBP), and a sequence of BP problems with the divisive normalization update with neighborhood size 3 (RW3 BP). [sent-250, score-0.524]
69 On the x-axis is the sparsity of the system indexed by ρ = 3d/m, and on the y-axis is the indeterminacy of the system indexed by δ = n/m. [sent-251, score-0.198]
70 However, the basis functions are learned under a probabilistic model in which the probability density over the basis function coefficients is factorial, whereas the sparse coefficients exhibit statistical dependencies [15, 16]. [sent-255, score-0.527]
71 Hence, a generative model with factorial LSM is not rich enough to capture the complex statistics of natural images. [sent-256, score-0.264]
72 We fix a topography where the basis function coefficients are arranged on a 2D grid, with overlapping neighborhoods of fixed size 3 × 3. [sent-258, score-0.311]
73 The corresponding inference algorithm uses the divisive normalization update (13). [sent-259, score-0.275]
74 We learn the optimal dictionary of basis functions Φ using the learning rule $\Delta\Phi = \eta\, \langle (x - \Phi \hat{s})\, \hat{s}^{T} \rangle$ as in [24], where η is the learning rate, $\hat{s}$ are the basis function coefficients inferred under the model (13), and the average is taken over a batch of size 100. [sent-260, score-0.441]
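One gradient step of this learning rule, with the batch average written explicitly, might look as follows. The column re-normalization of Φ is a common stabilizer added here as an assumption rather than a detail stated in the text.

```python
import numpy as np

def dictionary_update(Phi, X, S_hat, eta):
    """One step of Delta Phi = eta * <(x - Phi s_hat) s_hat^T>, averaged over a batch.

    X: (n, batch) data, S_hat: (m, batch) inferred coefficients, Phi: (n, m) dictionary."""
    batch = X.shape[1]
    residual = X - Phi @ S_hat                          # reconstruction error for each batch element
    Phi = Phi + eta * (residual @ S_hat.T) / batch      # batch-averaged Hebbian-style update
    # keep basis functions at unit norm (an assumption, commonly used to avoid degeneracy)
    Phi /= np.maximum(np.linalg.norm(Phi, axis=0, keepdims=True), 1e-12)
    return Phi
```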
75 The learned basis functions are shown in Figure 6. [sent-262, score-0.17]
76 We see here that the neighborhoods of size 3 × 3 group together basis functions of similar position, scale, and orientation. [sent-263, score-0.35]
77 The topography is similar to how neurons are arranged in the visual cortex, and is reminiscent of the results obtained in topographic ICA [13] and topographic mixture of experts models [31]. [sent-264, score-0.261]
78 An important difference is that our model is based on a generative sparse coding model in which both inference and learning can be implemented via local network interactions [7]. [sent-265, score-0.383]
79 Because of the topographic organization, we also obtain a neighborhood-based divisive normalization rule. [sent-266, score-0.236]
80 Does the proposed non-factorial model represent image structure more efficiently than those with factorial priors? [sent-267, score-0.225]
81 To answer this question we measured the models’ ability to recover sparse structure in the compressed sensing setting. [sent-268, score-0.285]
82 We note that the basis functions are learned such that they represent the sparse structure in images, as opposed to representing the images exactly (there is a noise term in the generative model (2)). [sent-269, score-0.438]
83 Hence, we design our experiment such that we measure the recovery of this sparse structure. [sent-270, score-0.205]
84 Using the basis functions shown in Figure 6, we first infer the sparse coefficients s0 of an image patch x such that $\|x - \Phi s_0\|_2 < \delta$ using the inference algorithm corresponding to the model. [sent-271, score-0.366]
85 We fix δ such that the SNR is 10, and thus the three sparse approximations for the three models contain the same amount of signal power. [sent-272, score-0.189]
86 We compare the recovery performance $\|\Phi \hat{s} - \Phi s_0\|_2 / \|\Phi s_0\|_2$ for 100 16 × 16 image patches selected at random, and we use 110 random projections. [sent-276, score-0.158]
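The score for the image-patch experiment compares reconstructions in pixel space rather than coefficient space; a one-line sketch of the metric:

```python
import numpy as np

def patch_recovery_performance(Phi, s_hat, s0):
    """Recovery score for the image-patch experiment: ||Phi s_hat - Phi s0||_2 / ||Phi s0||_2."""
    return np.linalg.norm(Phi @ (s_hat - s0)) / np.linalg.norm(Phi @ s0)
```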
87 We can see in Figure 7 that the model with non-factorial LSM prior outperforms the other models as it is able to capture the group sparsity structure in natural images. [sent-277, score-0.304]
88 Figure 6: Basis functions learned in a non-factorial LSM model with overlapping groups of size 3 × 3. Figure 7: Compressive sensing recovery. [sent-278, score-0.301]
89 On the x-axis is the recovery performance for the factorial LSM model (RWBP), and on the y-axis the recovery performance for the non-factorial LSM model with 3 × 3 overlapping groups (RW3×3 BP). [sent-279, score-0.435]
90 Conclusion We introduced a new class of probability densities that can be used as a prior for the coefficients in a generative sparse coding model of images. [sent-282, score-0.413]
91 By exploiting the conjugacy of the Gamma and Laplacian distributions, we were able to derive an efficient inference algorithm that consists of solving a sequence of reweighted ℓ1 least-squares problems, thus leveraging the multitude of algorithms already developed for BPDN. [sent-283, score-0.191]
92 Our framework also makes it possible to capture higher-order dependencies through group sparsity. [sent-284, score-0.168]
93 When applied to natural images, the learned basis functions of the model may be topographically organized according to the specified group structure. [sent-285, score-0.329]
94 We also showed that exploiting the group sparsity results in performance gains for compressive sensing recovery on natural images. [sent-286, score-0.52]
95 Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. [sent-352, score-0.308]
96 A multi-layer sparse coding network learns contour coding from natural images. [sent-391, score-0.456]
97 Learning horizontal connections in a sparse coding model of natural images. [sent-398, score-0.311]
98 Just relax: convex programming methods for identifying sparse signals in noise. [sent-447, score-0.165]
99 Emergence of simple-cell receptive field properties by learning a sparse code for natural images. [sent-454, score-0.166]
100 Natural image statistics and divisive normalization: Modeling nonlinearity and adaptation in cortical neurons. [sent-462, score-0.178]
wordName wordTfidf (topN-words)
[('lsm', 0.549), ('si', 0.274), ('coef', 0.242), ('laplacian', 0.226), ('cients', 0.209), ('bp', 0.206), ('bpdn', 0.161), ('factorial', 0.153), ('gamma', 0.151), ('coding', 0.145), ('rwbp', 0.142), ('compressive', 0.134), ('divisive', 0.131), ('basis', 0.122), ('nk', 0.121), ('sparse', 0.12), ('sensing', 0.105), ('dependencies', 0.091), ('conjugacy', 0.087), ('recovery', 0.085), ('multiplier', 0.085), ('prior', 0.083), ('group', 0.077), ('dictionary', 0.074), ('sparsity', 0.073), ('mixture', 0.072), ('signal', 0.069), ('generative', 0.065), ('neighborhoods', 0.064), ('groups', 0.064), ('scale', 0.063), ('gsm', 0.061), ('sj', 0.06), ('images', 0.058), ('topographic', 0.056), ('neighborhood', 0.055), ('inference', 0.053), ('reweighted', 0.051), ('em', 0.05), ('normalization', 0.049), ('overlapping', 0.048), ('inverse', 0.048), ('image', 0.047), ('natural', 0.046), ('signals', 0.045), ('subspaces', 0.043), ('update', 0.042), ('indexed', 0.042), ('ui', 0.042), ('saw', 0.042), ('indeterminacy', 0.041), ('rule', 0.04), ('vancouver', 0.039), ('occur', 0.037), ('representations', 0.036), ('garrigues', 0.036), ('nonfactorial', 0.036), ('topographically', 0.036), ('cand', 0.035), ('compressed', 0.035), ('inferred', 0.035), ('reconstruction', 0.035), ('olshausen', 0.033), ('cevher', 0.033), ('duarte', 0.033), ('blocking', 0.033), ('arg', 0.031), ('wavelet', 0.031), ('hoyer', 0.031), ('topography', 0.031), ('hyperprior', 0.031), ('phase', 0.03), ('heavier', 0.028), ('distribution', 0.027), ('baraniuk', 0.027), ('hyv', 0.027), ('donoho', 0.027), ('patches', 0.026), ('canada', 0.026), ('structure', 0.025), ('uk', 0.025), ('allowed', 0.025), ('sparser', 0.025), ('tails', 0.025), ('synthesis', 0.025), ('wainwright', 0.025), ('functions', 0.024), ('multipliers', 0.024), ('february', 0.024), ('proc', 0.024), ('learned', 0.024), ('visual', 0.024), ('jensen', 0.023), ('pursuit', 0.023), ('berkeley', 0.023), ('rn', 0.023), ('solution', 0.023), ('analytically', 0.023), ('plots', 0.022), ('arranged', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999982 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
Author: Pierre Garrigues, Bruno A. Olshausen
Abstract: We propose a class of sparse coding models that utilizes a Laplacian Scale Mixture (LSM) prior to model dependencies among coefficients. Each coefficient is modeled as a Laplacian distribution with a variable scale parameter, with a Gamma distribution prior over the scale parameter. We show that, due to the conjugacy of the Gamma prior, it is possible to derive efficient inference procedures for both the coefficients and the scale parameter. When the scale parameters of a group of coefficients are combined into a single variable, it is possible to describe the dependencies that occur due to common amplitude fluctuations among coefficients, which have been shown to constitute a large fraction of the redundancy in natural images [1]. We show that, as a consequence of this group sparse coding, the resulting inference of the coefficients follows a divisive normalization rule, and that this may be efficiently implemented in a network architecture similar to that which has been proposed to occur in primary visual cortex. We also demonstrate improvements in image coding and compressive sensing recovery using the LSM model. 1
2 0.16856603 59 nips-2010-Deep Coding Network
Author: Yuanqing Lin, Zhang Tong, Shenghuo Zhu, Kai Yu
Abstract: This paper proposes a principled extension of the traditional single-layer flat sparse coding scheme, where a two-layer coding scheme is derived based on theoretical analysis of nonlinear functional approximation that extends recent results for local coordinate coding. The two-layer approach can be easily generalized to deeper structures in a hierarchical multiple-layer manner. Empirically, it is shown that the deep coding approach yields improved performance in benchmark datasets.
3 0.1162459 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts
Author: Shaul Druckmann, Dmitri B. Chklovskii
Abstract: A striking aspect of cortical neural networks is the divergence of a relatively small number of input channels from the peripheral sensory apparatus into a large number of cortical neurons, an over-complete representation strategy. Cortical neurons are then connected by a sparse network of lateral synapses. Here we propose that such architecture may increase the persistence of the representation of an incoming stimulus, or a percept. We demonstrate that for a family of networks in which the receptive field of each neuron is re-expressed by its outgoing connections, a represented percept can remain constant despite changing activity. We term this choice of connectivity REceptive FIeld REcombination (REFIRE) networks. The sparse REFIRE network may serve as a high-dimensional integrator and a biologically plausible model of the local cortical circuit. 1
4 0.11388585 56 nips-2010-Deciphering subsampled data: adaptive compressive sampling as a principle of brain communication
Author: Guy Isely, Christopher Hillar, Fritz Sommer
Abstract: A new algorithm is proposed for a) unsupervised learning of sparse representations from subsampled measurements and b) estimating the parameters required for linearly reconstructing signals from the sparse codes. We verify that the new algorithm performs efficient data compression on par with the recent method of compressive sampling. Further, we demonstrate that the algorithm performs robustly when stacked in several stages or when applied in undercomplete or overcomplete situations. The new algorithm can explain how neural populations in the brain that receive subsampled input through fiber bottlenecks are able to form coherent response properties. 1
5 0.11186928 266 nips-2010-The Maximal Causes of Natural Scenes are Edge Filters
Author: Jose Puertas, Joerg Bornschein, Joerg Luecke
Abstract: We study the application of a strongly non-linear generative model to image patches. As in standard approaches such as Sparse Coding or Independent Component Analysis, the model assumes a sparse prior with independent hidden variables. However, in the place where standard approaches use the sum to combine basis functions we use the maximum. To derive tractable approximations for parameter estimation we apply a novel approach based on variational Expectation Maximization. The derived learning algorithm can be applied to large-scale problems with hundreds of observed and hidden variables. Furthermore, we can infer all model parameters including observation noise and the degree of sparseness. In applications to image patches we find that Gabor-like basis functions are obtained. Gabor-like functions are thus not a feature exclusive to approaches assuming linear superposition. Quantitatively, the inferred basis functions show a large diversity of shapes with many strongly elongated and many circular symmetric functions. The distribution of basis function shapes reflects properties of simple cell receptive fields that are not reproduced by standard linear approaches. In the study of natural image statistics, the implications of using different superposition assumptions have so far not been investigated systematically because models with strong non-linearities have been found analytically and computationally challenging. The presented algorithm represents the first large-scale application of such an approach. 1
6 0.10755602 65 nips-2010-Divisive Normalization: Justification and Effectiveness as Efficient Coding Transform
7 0.10490483 143 nips-2010-Learning Convolutional Feature Hierarchies for Visual Recognition
8 0.099895261 117 nips-2010-Identifying graph-structured activation patterns in networks
9 0.091490574 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
10 0.090493582 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
11 0.090423062 181 nips-2010-Network Flow Algorithms for Structured Sparsity
12 0.083874241 185 nips-2010-Nonparametric Density Estimation for Stochastic Optimization with an Observable State Variable
13 0.082391769 7 nips-2010-A Family of Penalty Functions for Structured Sparsity
14 0.081768513 217 nips-2010-Probabilistic Multi-Task Feature Selection
15 0.081640631 246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
16 0.080107011 101 nips-2010-Gaussian sampling by local perturbations
17 0.077534653 12 nips-2010-A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups
18 0.072661459 265 nips-2010-The LASSO risk: asymptotic results and real world examples
19 0.067423098 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks
20 0.066519454 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
topicId topicWeight
[(0, 0.196), (1, 0.07), (2, -0.084), (3, 0.087), (4, 0.025), (5, -0.091), (6, 0.047), (7, 0.154), (8, -0.113), (9, -0.04), (10, 0.04), (11, 0.034), (12, -0.066), (13, -0.068), (14, -0.092), (15, -0.113), (16, -0.009), (17, 0.101), (18, 0.003), (19, 0.012), (20, 0.084), (21, -0.026), (22, -0.025), (23, 0.019), (24, -0.065), (25, -0.039), (26, -0.055), (27, 0.001), (28, -0.043), (29, 0.085), (30, 0.023), (31, 0.146), (32, -0.095), (33, 0.012), (34, -0.07), (35, -0.042), (36, -0.04), (37, -0.026), (38, 0.016), (39, 0.016), (40, -0.053), (41, 0.025), (42, 0.135), (43, -0.005), (44, 0.018), (45, -0.025), (46, 0.048), (47, 0.077), (48, 0.029), (49, -0.063)]
simIndex simValue paperId paperTitle
same-paper 1 0.9608435 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
Author: Pierre Garrigues, Bruno A. Olshausen
Abstract: We propose a class of sparse coding models that utilizes a Laplacian Scale Mixture (LSM) prior to model dependencies among coefficients. Each coefficient is modeled as a Laplacian distribution with a variable scale parameter, with a Gamma distribution prior over the scale parameter. We show that, due to the conjugacy of the Gamma prior, it is possible to derive efficient inference procedures for both the coefficients and the scale parameter. When the scale parameters of a group of coefficients are combined into a single variable, it is possible to describe the dependencies that occur due to common amplitude fluctuations among coefficients, which have been shown to constitute a large fraction of the redundancy in natural images [1]. We show that, as a consequence of this group sparse coding, the resulting inference of the coefficients follows a divisive normalization rule, and that this may be efficiently implemented in a network architecture similar to that which has been proposed to occur in primary visual cortex. We also demonstrate improvements in image coding and compressive sensing recovery using the LSM model. 1
2 0.76355368 76 nips-2010-Energy Disaggregation via Discriminative Sparse Coding
Author: J. Z. Kolter, Siddharth Batra, Andrew Y. Ng
Abstract: Energy disaggregation is the task of taking a whole-home energy signal and separating it into its component appliances. Studies have shown that having devicelevel energy information can cause users to conserve significant amounts of energy, but current electricity meters only report whole-home data. Thus, developing algorithmic methods for disaggregation presents a key technical challenge in the effort to maximize energy conservation. In this paper, we examine a large scale energy disaggregation task, and apply a novel extension of sparse coding to this problem. In particular, we develop a method, based upon structured prediction, for discriminatively training sparse coding algorithms specifically to maximize disaggregation performance. We show that this significantly improves the performance of sparse coding algorithms on the energy task and illustrate how these disaggregation results can provide useful information about energy usage. 1
3 0.7341212 59 nips-2010-Deep Coding Network
Author: Yuanqing Lin, Zhang Tong, Shenghuo Zhu, Kai Yu
Abstract: This paper proposes a principled extension of the traditional single-layer flat sparse coding scheme, where a two-layer coding scheme is derived based on theoretical analysis of nonlinear functional approximation that extends recent results for local coordinate coding. The two-layer approach can be easily generalized to deeper structures in a hierarchical multiple-layer manner. Empirically, it is shown that the deep coding approach yields improved performance in benchmark datasets.
4 0.70678812 246 nips-2010-Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
Author: Taehwan Kim, Gregory Shakhnarovich, Raquel Urtasun
Abstract: Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulated the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretability as well as smoothness constraints that are inherent to human motion. We demonstrate the effectiveness of our approach to learn interpretable representations of human motion from motion capture data, and show that our approach outperforms recently developed matching pursuit and sparse coding algorithms. 1
5 0.69373381 266 nips-2010-The Maximal Causes of Natural Scenes are Edge Filters
Author: Jose Puertas, Joerg Bornschein, Joerg Luecke
Abstract: We study the application of a strongly non-linear generative model to image patches. As in standard approaches such as Sparse Coding or Independent Component Analysis, the model assumes a sparse prior with independent hidden variables. However, in the place where standard approaches use the sum to combine basis functions we use the maximum. To derive tractable approximations for parameter estimation we apply a novel approach based on variational Expectation Maximization. The derived learning algorithm can be applied to large-scale problems with hundreds of observed and hidden variables. Furthermore, we can infer all model parameters including observation noise and the degree of sparseness. In applications to image patches we find that Gabor-like basis functions are obtained. Gabor-like functions are thus not a feature exclusive to approaches assuming linear superposition. Quantitatively, the inferred basis functions show a large diversity of shapes with many strongly elongated and many circular symmetric functions. The distribution of basis function shapes reflects properties of simple cell receptive fields that are not reproduced by standard linear approaches. In the study of natural image statistics, the implications of using different superposition assumptions have so far not been investigated systematically because models with strong non-linearities have been found analytically and computationally challenging. The presented algorithm represents the first large-scale application of such an approach. 1
6 0.68756753 56 nips-2010-Deciphering subsampled data: adaptive compressive sampling as a principle of brain communication
7 0.61696297 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts
8 0.59301835 143 nips-2010-Learning Convolutional Feature Hierarchies for Visual Recognition
9 0.53175503 65 nips-2010-Divisive Normalization: Justification and Effectiveness as Efficient Coding Transform
10 0.50723118 45 nips-2010-CUR from a Sparse Optimization Viewpoint
11 0.50710487 89 nips-2010-Factorized Latent Spaces with Structured Sparsity
12 0.45954391 26 nips-2010-Adaptive Multi-Task Lasso: with Application to eQTL Detection
13 0.45807821 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
14 0.45746428 7 nips-2010-A Family of Penalty Functions for Structured Sparsity
15 0.45697153 37 nips-2010-Basis Construction from Power Series Expansions of Value Functions
16 0.4428761 204 nips-2010-Penalized Principal Component Regression on Graphs for Analysis of Subnetworks
17 0.43354085 172 nips-2010-Multi-Stage Dantzig Selector
18 0.41392612 256 nips-2010-Structural epitome: a way to summarize one’s visual experience
19 0.4095982 129 nips-2010-Inter-time segment information sharing for non-homogeneous dynamic Bayesian networks
20 0.40455458 181 nips-2010-Network Flow Algorithms for Structured Sparsity
topicId topicWeight
[(13, 0.043), (17, 0.038), (27, 0.102), (30, 0.057), (35, 0.082), (45, 0.193), (50, 0.068), (52, 0.08), (60, 0.019), (77, 0.067), (83, 0.149), (90, 0.017)]
simIndex simValue paperId paperTitle
same-paper 1 0.89055264 109 nips-2010-Group Sparse Coding with a Laplacian Scale Mixture Prior
Author: Pierre Garrigues, Bruno A. Olshausen
Abstract: We propose a class of sparse coding models that utilizes a Laplacian Scale Mixture (LSM) prior to model dependencies among coefficients. Each coefficient is modeled as a Laplacian distribution with a variable scale parameter, with a Gamma distribution prior over the scale parameter. We show that, due to the conjugacy of the Gamma prior, it is possible to derive efficient inference procedures for both the coefficients and the scale parameter. When the scale parameters of a group of coefficients are combined into a single variable, it is possible to describe the dependencies that occur due to common amplitude fluctuations among coefficients, which have been shown to constitute a large fraction of the redundancy in natural images [1]. We show that, as a consequence of this group sparse coding, the resulting inference of the coefficients follows a divisive normalization rule, and that this may be efficiently implemented in a network architecture similar to that which has been proposed to occur in primary visual cortex. We also demonstrate improvements in image coding and compressive sensing recovery using the LSM model. 1
2 0.82813525 200 nips-2010-Over-complete representations on recurrent neural networks can support persistent percepts
Author: Shaul Druckmann, Dmitri B. Chklovskii
Abstract: A striking aspect of cortical neural networks is the divergence of a relatively small number of input channels from the peripheral sensory apparatus into a large number of cortical neurons, an over-complete representation strategy. Cortical neurons are then connected by a sparse network of lateral synapses. Here we propose that such architecture may increase the persistence of the representation of an incoming stimulus, or a percept. We demonstrate that for a family of networks in which the receptive field of each neuron is re-expressed by its outgoing connections, a represented percept can remain constant despite changing activity. We term this choice of connectivity REceptive FIeld REcombination (REFIRE) networks. The sparse REFIRE network may serve as a high-dimensional integrator and a biologically plausible model of the local cortical circuit. 1
3 0.82730663 97 nips-2010-Functional Geometry Alignment and Localization of Brain Areas
Author: Georg Langs, Yanmei Tie, Laura Rigolo, Alexandra Golby, Polina Golland
Abstract: Matching functional brain regions across individuals is a challenging task, largely due to the variability in their location and extent. It is particularly difficult, but highly relevant, for patients with pathologies such as brain tumors, which can cause substantial reorganization of functional systems. In such cases spatial registration based on anatomical data is only of limited value if the goal is to establish correspondences of functional areas among different individuals, or to localize potentially displaced active regions. Rather than rely on spatial alignment, we propose to perform registration in an alternative space whose geometry is governed by the functional interaction patterns in the brain. We first embed each brain into a functional map that reflects connectivity patterns during a fMRI experiment. The resulting functional maps are then registered, and the obtained correspondences are propagated back to the two brains. In application to a language fMRI experiment, our preliminary results suggest that the proposed method yields improved functional correspondences across subjects. This advantage is pronounced for subjects with tumors that affect the language areas and thus cause spatial reorganization of the functional regions. 1
4 0.8253653 21 nips-2010-Accounting for network effects in neuronal responses using L1 regularized point process models
Author: Ryan Kelly, Matthew Smith, Robert Kass, Tai S. Lee
Abstract: Activity of a neuron, even in the early sensory areas, is not simply a function of its local receptive field or tuning properties, but depends on global context of the stimulus, as well as the neural context. This suggests the activity of the surrounding neurons and global brain states can exert considerable influence on the activity of a neuron. In this paper we implemented an L1 regularized point process model to assess the contribution of multiple factors to the firing rate of many individual units recorded simultaneously from V1 with a 96-electrode “Utah” array. We found that the spikes of surrounding neurons indeed provide strong predictions of a neuron’s response, in addition to the neuron’s receptive field transfer function. We also found that the same spikes could be accounted for with the local field potentials, a surrogate measure of global network states. This work shows that accounting for network fluctuations can improve estimates of single trial firing rate and stimulus-response transfer functions. 1
5 0.82037652 238 nips-2010-Short-term memory in neuronal networks through dynamical compressed sensing
Author: Surya Ganguli, Haim Sompolinsky
Abstract: Recent proposals suggest that large, generic neuronal networks could store memory traces of past input sequences in their instantaneous state. Such a proposal raises important theoretical questions about the duration of these memory traces and their dependence on network size, connectivity and signal statistics. Prior work, in the case of gaussian input sequences and linear neuronal networks, shows that the duration of memory traces in a network cannot exceed the number of neurons (in units of the neuronal time constant), and that no network can out-perform an equivalent feedforward network. However a more ethologically relevant scenario is that of sparse input sequences. In this scenario, we show how linear neural networks can essentially perform compressed sensing (CS) of past inputs, thereby attaining a memory capacity that exceeds the number of neurons. This enhanced capacity is achieved by a class of “orthogonal” recurrent networks and not by feedforward networks or generic recurrent networks. We exploit techniques from the statistical physics of disordered systems to analytically compute the decay of memory traces in such networks as a function of network size, signal sparsity and integration time. Alternately, viewed purely from the perspective of CS, this work introduces a new ensemble of measurement matrices derived from dynamical systems, and provides a theoretical analysis of their asymptotic performance. 1
6 0.81990206 96 nips-2010-Fractionally Predictive Spiking Neurons
7 0.81426829 56 nips-2010-Deciphering subsampled data: adaptive compressive sampling as a principle of brain communication
8 0.81319994 117 nips-2010-Identifying graph-structured activation patterns in networks
9 0.81239736 51 nips-2010-Construction of Dependent Dirichlet Processes based on Poisson Processes
10 0.80732089 161 nips-2010-Linear readout from a neural population with partial correlation data
11 0.80724686 44 nips-2010-Brain covariance selection: better individual functional connectivity models using population prior
12 0.80631846 260 nips-2010-Sufficient Conditions for Generating Group Level Sparsity in a Robust Minimax Framework
13 0.80619794 17 nips-2010-A biologically plausible network for the computation of orientation dominance
14 0.80443817 7 nips-2010-A Family of Penalty Functions for Structured Sparsity
15 0.80105466 55 nips-2010-Cross Species Expression Analysis using a Dirichlet Process Mixture Model with Latent Matchings
16 0.80065972 123 nips-2010-Individualized ROI Optimization via Maximization of Group-wise Consistency of Structural and Functional Profiles
17 0.80037093 268 nips-2010-The Neural Costs of Optimal Control
18 0.79913956 18 nips-2010-A novel family of non-parametric cumulative based divergences for point processes
19 0.79884768 26 nips-2010-Adaptive Multi-Task Lasso: with Application to eQTL Detection
20 0.79805487 98 nips-2010-Functional form of motion priors in human motion perception