nips nips2011 nips2011-44 knowledge-graph by maker-knowledge-mining

44 nips-2011-Bayesian Spike-Triggered Covariance Analysis


Source: pdf

Author: Jonathan W. Pillow, Il M. Park

Abstract: Neurons typically respond to a restricted number of stimulus features within the high-dimensional space of natural stimuli. Here we describe an explicit model-based interpretation of traditional estimators for a neuron’s multi-dimensional feature space, which allows for several important generalizations and extensions. First, we show that traditional estimators based on the spike-triggered average (STA) and spike-triggered covariance (STC) can be formalized in terms of the “expected log-likelihood” of a Linear-Nonlinear-Poisson (LNP) model with Gaussian stimuli. This model-based formulation allows us to define maximum-likelihood and Bayesian estimators that are statistically consistent and efficient in a wider variety of settings, such as with naturalistic (non-Gaussian) stimuli. It also allows us to employ Bayesian methods for regularization, smoothing, sparsification, and model comparison, and provides Bayesian confidence intervals on model parameters. We describe an empirical Bayes method for selecting the number of features, and extend the model to accommodate an arbitrary elliptical nonlinear response function, which results in a more powerful and more flexible model for feature space inference. We validate these methods using neural data recorded extracellularly from macaque primary visual cortex.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Neurons typically respond to a restricted number of stimulus features within the high-dimensional space of natural stimuli. [sent-6, score-0.202]

2 Here we describe an explicit model-based interpretation of traditional estimators for a neuron’s multi-dimensional feature space, which allows for several important generalizations and extensions. [sent-7, score-0.16]

3 First, we show that traditional estimators based on the spike-triggered average (STA) and spike-triggered covariance (STC) can be formalized in terms of the “expected log-likelihood” of a Linear-Nonlinear-Poisson (LNP) model with Gaussian stimuli. [sent-8, score-0.132]

4 We describe an empirical Bayes method for selecting the number of features, and extend the model to accommodate an arbitrary elliptical nonlinear response function, which results in a more powerful and more flexible model for feature space inference. [sent-11, score-0.167]

5 1 Introduction A central problem in systems neuroscience is to understand the probabilistic relationship between sensory stimuli and neural responses. [sent-13, score-0.27]

6 Most neurons in the early sensory pathway are only sensitive to a low-dimensional space of stimulus features, and ignore the other axes in the high-dimensional space of stimuli. [sent-14, score-0.263]

7 The most popular dimensionality-reduction method for neural data uses the first two moments of the spike-triggered stimulus distribution: the spike-triggered average (STA) and the eigenvectors of the spike-triggered covariance (STC) [1–5]. [sent-16, score-0.314]

8 In this model, stimuli are projected onto a bank of linear filters, whose outputs are combined via a nonlinear function, which drives spiking as an inhomogeneous Poisson process (see Fig. [sent-18, score-0.283]
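As a rough illustration of this cascade (not the paper's own code), the following NumPy sketch simulates an LNP neuron with an exponentiated-quadratic nonlinearity built from two illustrative filters; the filter scaling, the offset of -3, and the stimulus size are arbitrary choices made so the simulated rates stay moderate:

import numpy as np

rng = np.random.default_rng(0)
D, K, N = 32, 2, 5000                       # stimulus dim, number of filters, time bins
W = rng.standard_normal((D, K))
W = 0.7 * W / np.linalg.norm(W, axis=0)     # unit-norm filters, scaled down
X = rng.standard_normal((N, D))             # white Gaussian stimuli, one row per bin

# Linear stage: project each stimulus onto the filter bank.
proj = X @ W
# Nonlinear stage: exponentiated-quadratic combination of filter outputs.
rate = np.exp(0.5 * np.sum(proj ** 2, axis=1) - 3.0)
# Poisson stage: draw a spike count for each time bin.
y = rng.poisson(rate)

The resulting (X, y) pair is the kind of stimulus/response data that the estimators discussed below operate on.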

9 Prior work has established the conditions for statistical consistency and efficiency of the STA and STC as feature space estimators [1, 2, 8, 9]. [sent-20, score-0.165]

10 However, these moment-based estimators have not yet been interpreted in terms of an explicit probabilistic encoding model. [sent-21, score-0.145]

11 Figure 1: Schematic of linear-nonlinear-Poisson (LNP) neural encoding model [6] (stages: linear filters, nonlinearity, Poisson spiking). [sent-24, score-0.451]

12 Here we show, first of all, that STA and STC arise naturally from the expected log-likelihood of an LNP model with an “exponentiated-quadratic” nonlinearity, where expectation is taken with respect to a Gaussian stimulus distribution. [sent-25, score-0.187]

13 This insight allows us to formulate exact maximum-likelihood estimators that apply to arbitrary stimulus distributions. [sent-26, score-0.29]

14 We then introduce Bayesian methods for regularizing and smoothing receptive field estimates, and an approximate empirical Bayes method for selecting the feature space dimensionality, which obviates nested hypothesis tests, bootstrapping, or cross-validation based methods [5]. [sent-27, score-0.315]

15 Finally, we generalize these estimators to accommodate LNP models with arbitrary elliptically symmetric nonlinearities. [sent-28, score-0.174]

16 The resulting model class provides a richer and more flexible model of neural responses but can still recover a high-dimensional feature space (unlike more general information-theoretic estimators [8, 15], which do not scale easily to more than 2 filters). [sent-29, score-0.275]

17 We apply these methods to a variety of simulated datasets and to responses from neurons in macaque primary visual cortex stimulated with binary white noise stimuli [16]. [sent-30, score-0.337]

18 2 Model-based STA and STC. In a typical neural characterization experiment, the experimenter presents a train of rapidly varying sensory stimuli and records a spike train response. [sent-31, score-0.404]

19 Let x denote a D-dimensional vector containing the spatio-temporal stimulus affecting a neuron’s scalar spike response y in a single time bin. [sent-32, score-0.281]

20 A principal goal of neural characterization is to identify B, a low-dimensional projection matrix such that B^T x captures the neuron’s dependence on the stimulus x. [sent-33, score-0.24]

21 The columns of B can be regarded as linear receptive fields that provide a basis for the neural feature space. [sent-34, score-0.26]

22 The methods we consider here all assume that neural responses can be described by an LNP cascade model (Fig. [sent-35, score-0.148]

23 Under this model, the conditional probability of a response y|x is Poisson with rate f(B^T x), where f is a vector function mapping feature space to instantaneous spike rate. [sent-37, score-0.203]

24 2.1 STA and STC analysis. The STA and the STC matrix are the (empirical) first and second moments, respectively, of the spike-triggered stimulus ensemble {x_i | y_i}_{i=1}^N. [sent-39, score-0.153]

25 They are defined as: STA: μ = (1/n_sp) Σ_{i=1}^N y_i x_i, and STC: Λ = (1/n_sp) Σ_{i=1}^N y_i (x_i - μ)(x_i - μ)^T, (1) where n_sp = Σ_i y_i is the number of spikes and N is the total number of time bins. [sent-40, score-0.465]
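A direct NumPy transcription of eq. (1), as a minimal sketch (the array names are illustrative):

import numpy as np

def sta_stc(X, y):
    """Spike-triggered average and covariance as in eq. (1).
    X: (N, D) stimuli, one row per time bin; y: (N,) spike counts."""
    n_sp = y.sum()
    mu = X.T @ y / n_sp                       # STA: spike-weighted mean stimulus
    Xc = X - mu                               # center stimuli on the STA
    Lam = (Xc * y[:, None]).T @ Xc / n_sp     # STC: spike-weighted covariance
    return mu, Lam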

26 Traditional STA/STC analysis provides an estimate for the feature space basis consisting of: (1) μ, if it is significantly different from zero; and (2) the eigenvectors of Λ whose eigenvalues are significantly smaller or larger than those of the prior stimulus covariance Φ = E[xx^T]. [sent-41, score-0.448]

27 This estimate is provably consistent only in the case of stimuli drawn from a spherically symmetric (for STA) or independent Gaussian distribution (for STC) [17]. [sent-42, score-0.227]

28 For elliptically symmetric or colored Gaussian stimuli, a consistent estimate requires whitening the stimuli by Φ^{-1/2} and then multiplying the estimated features (STA and STC eigenvectors) again by Φ^{-1/2} (see [5]). [sent-46, score-0.329]

29 2.2 Equivalent model-based formulation. Motivated by [9], we consider an LNP model where the spike rate is defined by an exponentiated general quadratic function: f(x) = exp((1/2) x^T C x + b^T x + a), (2) where C is a symmetric matrix, b is a vector, and a is a scalar. [sent-48, score-0.177]

30 (eq. 4) yields a quantity we call the expected log-likelihood L̃, which can be expressed in terms of the STA, STC, Φ, and the model parameters: L̃ = ((1/2) Tr[CΛ] + (1/2) μ^T C μ + b^T μ + a) n_sp - N |I - ΦC|^{-1/2} exp((1/2) b^T (Φ^{-1} - C)^{-1} b + a). (6) [sent-54, score-0.186]

31 Maximizing this expression yields expected-ML estimates (see online supplement for derivation): C̃_ml = Φ^{-1} - Λ^{-1}, b̃_ml = Λ^{-1} μ, ã_ml = log((n_sp/N) |Φ Λ^{-1}|^{1/2}) - (1/2) μ^T Λ^{-1} μ. (7) [sent-55, score-0.203]

32 Thus, for an LNP model with exponentiated-quadratic nonlinearity stimulated with Gaussian noise, the (expected) maximum likelihood estimates can be obtained in closed form from the STA, STC, stimulus covariance, and mean spike rate n_sp/N. [sent-56, score-0.776]

33 Second, if the stimuli are white, meaning Φ = I, C̃_ml = I - Λ^{-1}, which has the same eigenvectors as the STC matrix. [sent-59, score-0.228]
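In code, the closed-form expected-ML estimates of eq. (7) might look like the sketch below; Phi denotes the raw stimulus covariance E[xx^T], and since eq. (7) is reconstructed here from garbled extraction, the exact form of a_ml should be treated as an assumption rather than a quotation of the paper:

import numpy as np

def expected_ml(mu, Lam, Phi, n_sp, N):
    """Closed-form expected-ML parameters (eq. 7), assuming Gaussian stimuli."""
    Lam_inv = np.linalg.inv(Lam)
    Phi_inv = np.linalg.inv(Phi)
    C_ml = Phi_inv - Lam_inv                       # quadratic-term estimate
    b_ml = Lam_inv @ mu                            # linear-term estimate
    _, logdet = np.linalg.slogdet(Phi @ Lam_inv)   # log |Phi Lam^{-1}|
    a_ml = np.log(n_sp / N) + 0.5 * logdet - 0.5 * mu @ Lam_inv @ mu
    return C_ml, b_ml, a_ml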

34 The iSTAC estimator finds the subspace that maximizes the “single-spike information” [18] under a Gaussian model of the raw and spike-triggered stimulus distributions (that coincides with (eq. [sent-61, score-0.153]

35 4) yields a consistent and asymptotically efficient estimator even when stimuli are not Gaussian. [sent-66, score-0.175]

36 Numerically optimizing this loss [...] (Footnote 3: If it is not, then this expectation does not exist, and simulations of the corresponding model will produce impossibly high spike counts, with STA and STC dominated by the response to a single stimulus.) [sent-67, score-0.128]

37 We refer to ML estimators for (b, W) as maximum-likelihood STA and STC (or exact ML, as opposed to expected-ML estimates from moment-based formulas (eq. [sent-73, score-0.209]

38 These estimates will closely match the standard STA and STC-based feature space when stimuli are Gaussian, but (as maximum-likelihood estimates) are also consistent and asymptotically efficient for arbitrary stimuli. [sent-76, score-0.322]
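The exact likelihood being maximized here is the standard Poisson log-likelihood of the LNP model; a minimal sketch for the exponentiated-quadratic rate of eq. (2), with the constant log(y_i!) terms dropped, is:

import numpy as np

def poisson_loglik(X, y, C, b, a):
    """Exact LNP log-likelihood sum_i [y_i log f(x_i) - f(x_i)], up to constants,
    for the rate f(x) = exp(x^T C x / 2 + b^T x + a)."""
    log_rate = 0.5 * np.einsum('nd,de,ne->n', X, C, X) + X @ b + a
    return np.sum(y * log_rate - np.exp(log_rate))

Maximizing this with a generic optimizer gives the exact-ML estimates; the expected log-likelihood instead replaces the sum over f(x_i) with its expectation under the Gaussian stimulus distribution, which is why it can be evaluated from the moments alone.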

39 By contrast, the standard STA and STC eigenvectors are usually taken as unit vectors, providing a basis for the neural feature space in which the nonlinearity (“N” stage) must still be estimated. [sent-81, score-0.445]

40 We are free to normalize the ML estimates (b̂, Ŵ) and estimate an arbitrary nonlinearity in a similar manner, but it is noteworthy that the parameters (a, b, W) specify a complete encoding model in and of themselves. [sent-82, score-0.38]

41 Here we consider two types of priors: (1) a smoothing prior, which holds the filters to be smooth in space/time; and (2) a sparsifying prior, which we employ to directly estimate the feature space dimensionality (i.e., the number of filters). [sent-88, score-0.302]

42 We apply these priors to b and the columns of W , in conjunction with either exact (for accuracy) or expected (for speed) log-likelihood functions defined above. [sent-91, score-0.143]

43 6) are also written in terms of STA/STC, optimization using the expected log-likelihood can be carried out more efficiently—it reduces the cost of each iteration by a factor of N compared to optimizing the exact likelihood (eq. [sent-95, score-0.133]

44 3.1 Smoothing prior. Neural receptive fields are generally smooth, so a prior that encourages this tendency will tend to improve performance. [sent-98, score-0.251]

45 Receptive field estimates under such a prior will be smooth unless the likelihood provides sufficient evidence for jaggedness. [sent-99, score-0.26]

46 To encourage smoothness, we placed a zero-mean Gaussian prior on the second-order differences of each filter [29]: Lw ∼ N(0, λ^{-1} I), (10)

[Figure 2: reconstruction error of the true filter for STA/STC, expected-ML, exact-ML, and Bayesian-smoothing estimates; panels A-B use Gaussian stimuli, panel C sparse binary stimuli.]

47 An LNP model with 4 orthogonal 32-element filters (see text) was simulated with two types of stimuli (A-B: white Gaussian; C: sparse binary). [sent-105, score-0.233]

48 This is equivalent to imposing a penalty (given by (λ/2) w_i^T L^T L w_i) on the squared second derivatives of b and W in the optimization function. [sent-113, score-0.136]
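A small sketch of the corresponding roughness penalty; the second-difference operator L and the hyperparameter lam are illustrative stand-ins for the λ used above:

import numpy as np

def second_diff_operator(d):
    """Matrix L whose rows compute second-order differences of a length-d filter."""
    L = np.zeros((d - 2, d))
    for i in range(d - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L

def smoothness_penalty(w, lam):
    """Penalty (lam/2) * ||L w||^2 implied by the Gaussian prior L w ~ N(0, lam^{-1} I)."""
    L = second_diff_operator(len(w))
    return 0.5 * lam * np.sum((L @ w) ** 2)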

49 To illustrate the effects of this prior, we simulated an example dataset from an LNP neuron with exponentiated-quadratic nonlinearity and four 32-element, 1-dimensional (temporal) filters. [sent-116, score-0.313]

50 We fixed the dimensionality of our feature space estimates to be the same as the true model, since our focus was the quality of each corresponding filter estimate. [sent-119, score-0.185]

51 However, for “sparse” binary stimuli (3 of the 32 pixels set randomly to ±1), for which STA/STC and expected-ML estimates are no longer consistent, we found significantly better performance from the exact-ML estimates (Fig. [sent-122, score-0.319]

52 Most importantly, for both Gaussian and sparse stimuli alike, the smoothing prior provided a large improvement in the quality of feature space estimates, achieving similar error with 2 orders of magnitude fewer stimuli. [sent-124, score-0.462]

53 3.2 Automatic selection of feature space dimensionality. While smoothing regularizes receptive field estimates by penalizing filter roughness, a perhaps more critical aspect of the STA/STC model is its vast number of possible parameters due to uncertainty in the number of filters. [sent-126, score-0.425]

54 Unlike PCA, we seek to preserve components of the STC matrix with both large and small eigenvalues, which correspond to excitatory and suppressive filters, respectively. [sent-129, score-0.128]

55 One solution to this problem, Bayesian Extreme Components Analysis [14], preserves large and small eigenvalues of the covariance matrix, but does not incorporate additional priors on filter shape, and has not yet been formulated for our (Poisson) likelihood function. [sent-130, score-0.164]

56 We put the ARD prior on each column of W: w_i ∼ N(0, α_i^{-1} I). (11)

58 Figure 3: Goodness-of-fit of estimated models and the estimated dimension as a function of number of samples. [Panels show goodness-of-fit (nats/spk) and the final number of dimensions against the number of samples, comparing cross-validation, expected ARD, expected smooth+ARD, exact ARD, and exact smooth+ARD, with bar groups ML, smooth, ARD, both.]

59 Models were estimated from 10^3, 10^4, and 5 × 10^4 stimuli respectively. [sent-145, score-0.199]

60 When both smoothing and ARD priors are used, the variability rapidly diminishes to near zero. [sent-147, score-0.18]

61 where α_i is a hyperparameter controlling the prior variance of w_i. [sent-148, score-0.169]

62 ˜ We initialize b to its ML estimate and the wi to the eigenvectors of Cml , scaled by the square root of their eigenvalues. [sent-150, score-0.145]

63 This update is valid when each element of the receptive field update w_i/||w_i||^2 is well defined (non-zero); otherwise it overestimates the corresponding α_i. [sent-153, score-0.187]

64 The algorithm begins with all α_i set to zero (infinite prior variance), giving ML estimates for the parameters. [sent-154, score-0.138]

65 Subsequent updates will cause some α_i to grow without bound, shrinking the prior variance of the corresponding feature vector w_i until it drops out of the model entirely as α_i → ∞. [sent-155, score-0.203]

66 Note that these updates are fast (especially with expected log-likelihood), providing a much less computationally intensive estimate of feature space dimensionality than bootstrap-based methods [5]. [sent-158, score-0.171]
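The precise α update used in the paper is not preserved in this extract; as a hedged illustration, the loop below uses a generic fixed-point update α_i ← D/||w_i||^2 as a stand-in, and refit_W stands for whichever (expected or exact) penalized-likelihood optimizer re-estimates the filters under the current precisions:

import numpy as np

def ard_select(W0, refit_W, n_iter=20, alpha_max=1e6):
    """Illustrative ARD loop: alternately refit the filter matrix W and update the
    per-column precisions alpha_i; columns whose precision diverges are pruned.
    W0: initial filters (e.g., scaled eigenvectors of the expected-ML C estimate)."""
    D, K = W0.shape
    alphas = np.zeros(K)                  # start at alpha_i = 0, i.e. ML estimates
    W = W0
    for _ in range(n_iter):
        W = refit_W(alphas)               # MAP filters under w_i ~ N(0, alpha_i^{-1} I)
        norms = np.maximum(np.sum(W ** 2, axis=0), 1e-12)
        alphas = np.minimum(D / norms, alpha_max)   # stand-in fixed-point update
    keep = alphas < alpha_max             # surviving feature-space dimensions
    return W[:, keep], alphas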

67 Figure 3 (left) shows that the ARD prior greatly increases the model goodness-of-fit (likelihood on test data), and is synergistic with the smoothing prior defined above. [sent-159, score-0.253]

68 The improvement (relative to ML estimates) is greatest when the number of samples is small, and it enhances both expected and exact likelihood estimates. [sent-160, score-0.133]

69 We first fit a full-rank model with exact likelihood, and built a sparse model by adding filters from this set greedily until the likelihood of test data began to decrease. [sent-162, score-0.124]

70 When both smoothing and ARD priors were used, the variability decreased markedly and always achieved the correct dimension even for moderate amounts of data. [sent-166, score-0.156]

71 We can replace the exponential function which operates on the quadratic form in the model nonlinearity (eq. 2) [sent-169, score-0.311]

72 Figure 4: 1-D nonlinear functions g mapping z, the output of the quadratic stage, to spike rate for a V1 complex cell [16]. [sent-170, score-0.186]

73 The exact-ML filter estimates for W and b were obtained using the smoothing BSTC with an exponential nonlinearity. [sent-171, score-0.166]

74 We fixed the fitted cubic spline nonlinearity and then refit the filters, resulting in an estimate of the elliptical-LNP model. [sent-175, score-0.426]

75 [Figure 4 legend: data, exp(x), log(1+exp(x)), spline fit; y-axis: rate (spk/bin).] with an arbitrary function g(·), resulting in a model class that includes any elliptically symmetric mapping of the stimulus to spike rate. [sent-176, score-0.437]

76 The elliptical-LNP model can be formalized by writing the nonlinearity f (x) (depicted in Fig. [sent-178, score-0.251]

77 1) as the composition of two nonlinear functions: a quadratic function that maps the high-dimensional stimulus to the real line, z(x) = (1/2) x^T C x + b^T x + a, and a 1-D nonlinearity g(z). [sent-179, score-0.468]
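A sketch of this two-stage composition; the softplus below corresponds to the log(1+exp(z)) curve in Fig. 4, while in practice a spline g would be fit to data rather than hard-coded:

import numpy as np

def quadratic_stage(X, C, b, a):
    """z(x) = x^T C x / 2 + b^T x + a, evaluated for each row of X."""
    return 0.5 * np.einsum('nd,de,ne->n', X, C, X) + X @ b + a

def elliptical_lnp_rate(X, C, b, a, g=np.exp):
    """Elliptical-LNP rate f(x) = g(z(x)); g = exp recovers the BSTC model,
    while a sub-exponential g (e.g. softplus) or a fitted spline is also allowed."""
    return g(quadratic_stage(X, C, b, a))

def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return np.logaddexp(0.0, z)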

78 Although LNP with exponential nonlinearity has been widely adopted in neuroscience for its simplicity, the actual nonlinearity of neural systems is often sub-exponential. [sent-181, score-0.589]

79 Moreover, the effect of nonlinearity is even more pronounced in the exponentiated-quadratic function, and hence it may be helpful to use a sub-exponential function g. [sent-182, score-0.251]

80 Figure 4 shows the nonlinearity of an example neuron from V1 (see next section) compared to g(z) = e^z (the assumption implicit in STA/STC), a more linear function g(z) = log(1 + e^z), and a cubic spline fit by maximum likelihood. [sent-183, score-0.534]

81 For fast optimization, we first used the exponentiated-quadratic nonlinearity as an initialization (expected then exact-ML), then we refined the model with a spline nonlinearity. [sent-187, score-0.362]

82 The stimulus consisted of oriented binary white noise (“flickering bars”) aligned with the cell’s preferred orientation. [sent-189, score-0.186]

83 The size of the receptive field was chosen to be 16 bars × 10 time bins, yielding a 160-dimensional stimulus space. [sent-192, score-0.272]

84 Three features of this data make BSTC appropriate: (1) the stimulus is non-Gaussian; (2) the nonlinearity is not exponential (Fig. [sent-193, score-0.447]

85 We estimated the nonlinearity using a cubic spline, and applied a smoothing BSTC to 10^4 samples presented at 100 Hz (Fig. [sent-196, score-0.436]

86 The ARD-prior BSTC estimate trained on 2 × 10^5 stimuli preserved 14 filters (Fig. [sent-198, score-0.199]

87 On a test set, [...] Figure 5: Estimating visual receptive fields from a complex cell (excitatory and suppressive filters from STA/STC, BSTC, and BSTC+ARD). [sent-206, score-0.164]

88 Bayesian STC (BSTC) with smoothing prior and fixed spline nonlinearity applied to a fixed number of filters. [sent-209, score-0.549]

89 BSTC with ARD, smoothing, and spline nonlinearity recovers 14 receptive fields out of 160. [sent-211, score-0.481]

90 Figure 6: Goodness-of-model fits from exact ML solution with exponential nonlinearity compared to BSTC with a fixed spline nonlinearity and smoothing prior (2 × 10^5 samples). [sent-218, score-0.868]

91 6 Conclusion. We have provided an explicit, probabilistic, model-based framework that formalizes the classical moment-based estimators (STA, STC) and a more recent information-theoretic estimator (iSTAC) for neural feature spaces. [sent-223, score-0.204]

92 The maximum of the “expected log-likelihood” under this model, where expectation is taken with respect to a Gaussian stimulus distribution, corresponds precisely to the moment-based estimators for uncorrelated stimuli. [sent-224, score-0.243]

93 Although the assumption of elliptical symmetry makes it less general than information-theoretic estimators such as maximally informative dimensions (MID) [8, 15], it has significant advantages in computational efficiency, number of local optima, and suitability for high-dimensional feature spaces. [sent-227, score-0.188]

94 We feel the synthesis of multi-dimensional nonlinear stimulus sensitivity (as described here) and non-Poisson, history-dependent spiking presents a promising tool for unlocking the statistical structure of the neural code. [sent-229, score-0.302]

95 Bayesian inference for spiking neuron models with a sparsity prior. [sent-284, score-0.12]

96 Maximum likelihood estimation of cascade point-process neural encoding models. [sent-356, score-0.189]

97 Maximum likelihood estimation of cascade point-process neural encoding models. [sent-372, score-0.189]

98 Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. [sent-381, score-0.19]

99 A generalized linear model for estimating spectrotemporal receptive fields from responses to natural sounds. [sent-430, score-0.163]

100 A point process framework for relating neural spiking activity to spiking history, neural ensemble and extrinsic covariate effects. [sent-471, score-0.248]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('stc', 0.525), ('sta', 0.363), ('bstc', 0.319), ('nonlinearity', 0.251), ('lters', 0.18), ('ard', 0.175), ('stimuli', 0.175), ('lnp', 0.156), ('stimulus', 0.153), ('ml', 0.137), ('nsp', 0.131), ('smoothing', 0.121), ('receptive', 0.119), ('spline', 0.111), ('estimators', 0.09), ('spike', 0.089), ('bayesian', 0.075), ('istac', 0.075), ('estimates', 0.072), ('wi', 0.068), ('prior', 0.066), ('neural', 0.066), ('lter', 0.065), ('neuron', 0.062), ('spiking', 0.058), ('elliptically', 0.056), ('suppressive', 0.056), ('pillow', 0.055), ('eigenvectors', 0.053), ('likelihood', 0.052), ('cml', 0.049), ('feature', 0.048), ('exact', 0.047), ('cx', 0.046), ('rust', 0.045), ('excitatory', 0.045), ('responses', 0.044), ('smooth', 0.044), ('elds', 0.043), ('filters', 0.043), ('covariance', 0.042), ('austin', 0.04), ('cubic', 0.04), ('quadratic', 0.039), ('poisson', 0.039), ('response', 0.039), ('schwartz', 0.039), ('dimensionality', 0.038), ('gaussian', 0.038), ('cascade', 0.038), ('memming', 0.038), ('glm', 0.037), ('priors', 0.035), ('ez', 0.035), ('eigenvalues', 0.035), ('hyperparameter', 0.035), ('expected', 0.034), ('encoding', 0.033), ('cell', 0.033), ('white', 0.033), ('pca', 0.031), ('macaque', 0.03), ('sw', 0.03), ('bins', 0.029), ('sensory', 0.029), ('stimulated', 0.028), ('elliptical', 0.028), ('symmetric', 0.028), ('network', 0.028), ('neurons', 0.027), ('components', 0.027), ('ruyter', 0.027), ('columns', 0.027), ('space', 0.027), ('evidence', 0.026), ('sparse', 0.025), ('nonlinear', 0.025), ('extreme', 0.025), ('inhomogeneous', 0.025), ('estimate', 0.024), ('yi', 0.024), ('estimated', 0.024), ('final', 0.024), ('gerwinn', 0.024), ('rapidly', 0.024), ('relevance', 0.024), ('automatic', 0.023), ('macke', 0.022), ('explicit', 0.022), ('features', 0.022), ('dimensions', 0.022), ('si', 0.022), ('paninski', 0.022), ('tx', 0.022), ('exponential', 0.021), ('characterization', 0.021), ('exp', 0.021), ('sparsi', 0.021), ('shrinking', 0.021), ('comput', 0.021)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 44 nips-2011-Bayesian Spike-Triggered Covariance Analysis

Author: Jonathan W. Pillow, Il M. Park

Abstract: Neurons typically respond to a restricted number of stimulus features within the high-dimensional space of natural stimuli. Here we describe an explicit modelbased interpretation of traditional estimators for a neuron’s multi-dimensional feature space, which allows for several important generalizations and extensions. First, we show that traditional estimators based on the spike-triggered average (STA) and spike-triggered covariance (STC) can be formalized in terms of the “expected log-likelihood” of a Linear-Nonlinear-Poisson (LNP) model with Gaussian stimuli. This model-based formulation allows us to define maximum-likelihood and Bayesian estimators that are statistically consistent and efficient in a wider variety of settings, such as with naturalistic (non-Gaussian) stimuli. It also allows us to employ Bayesian methods for regularization, smoothing, sparsification, and model comparison, and provides Bayesian confidence intervals on model parameters. We describe an empirical Bayes method for selecting the number of features, and extend the model to accommodate an arbitrary elliptical nonlinear response function, which results in a more powerful and more flexible model for feature space inference. We validate these methods using neural data recorded extracellularly from macaque primary visual cortex. 1

2 0.3450703 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

Author: Alyson K. Fletcher, Sundeep Rangan, Lav R. Varshney, Aniruddha Bhargava

Abstract: Many functional descriptions of spiking neurons assume a cascade structure where inputs are passed through an initial linear filtering stage that produces a lowdimensional signal that drives subsequent nonlinear stages. This paper presents a novel and systematic parameter estimation procedure for such models and applies the method to two neural estimation problems: (i) compressed-sensing based neural mapping from multi-neuron excitation, and (ii) estimation of neural receptive fields in sensory neurons. The proposed estimation algorithm models the neurons via a graphical model and then estimates the parameters in the model using a recently-developed generalized approximate message passing (GAMP) method. The GAMP method is based on Gaussian approximations of loopy belief propagation. In the neural connectivity problem, the GAMP-based method is shown to be computational efficient, provides a more exact modeling of the sparsity, can incorporate nonlinearities in the output and significantly outperforms previous compressed-sensing methods. For the receptive field estimation, the GAMP method can also exploit inherent structured sparsity in the linear weights. The method is validated on estimation of linear nonlinear Poisson (LNP) cascade models for receptive fields of salamander retinal ganglion cells. 1

3 0.22098498 24 nips-2011-Active learning of neural response functions with Gaussian processes

Author: Mijung Park, Greg Horwitz, Jonathan W. Pillow

Abstract: A sizeable literature has focused on the problem of estimating a low-dimensional feature space for a neuron’s stimulus sensitivity. However, comparatively little work has addressed the problem of estimating the nonlinear function from feature space to spike rate. Here, we use a Gaussian process (GP) prior over the infinitedimensional space of nonlinear functions to obtain Bayesian estimates of the “nonlinearity” in the linear-nonlinear-Poisson (LNP) encoding model. This approach offers increased flexibility, robustness, and computational tractability compared to traditional methods (e.g., parametric forms, histograms, cubic splines). We then develop a framework for optimal experimental design under the GP-Poisson model using uncertainty sampling. This involves adaptively selecting stimuli according to an information-theoretic criterion, with the goal of characterizing the nonlinearity with as little experimental data as possible. Our framework relies on a method for rapidly updating hyperparameters under a Gaussian approximation to the posterior. We apply these methods to neural data from a color-tuned simple cell in macaque V1, characterizing its nonlinear response function in the 3D space of cone contrasts. We find that it combines cone inputs in a highly nonlinear manner. With simulated experiments, we show that optimal design substantially reduces the amount of data required to estimate these nonlinear combination rules. 1

4 0.21472606 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

Author: Yan Karklin, Eero P. Simoncelli

Abstract: Efficient coding provides a powerful principle for explaining early sensory coding. Most attempts to test this principle have been limited to linear, noiseless models, and when applied to natural images, have yielded oriented filters consistent with responses in primary visual cortex. Here we show that an efficient coding model that incorporates biologically realistic ingredients – input and output noise, nonlinear response functions, and a metabolic cost on the firing rate – predicts receptive fields and response nonlinearities similar to those observed in the retina. Specifically, we develop numerical methods for simultaneously learning the linear filters and response nonlinearities of a population of model neurons, so as to maximize information transmission subject to metabolic costs. When applied to an ensemble of natural images, the method yields filters that are center-surround and nonlinearities that are rectifying. The filters are organized into two populations, with On- and Off-centers, which independently tile the visual space. As observed in the primate retina, the Off-center neurons are more numerous and have filters with smaller spatial extent. In the absence of noise, our method reduces to a generalized version of independent components analysis, with an adapted nonlinear “contrast” function; in this case, the optimal filters are localized and oriented.

5 0.14318728 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

Author: Kamiar R. Rad, Liam Paninski

Abstract: Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design.

6 0.13760294 244 nips-2011-Selecting Receptive Fields in Deep Networks

7 0.12250464 298 nips-2011-Unsupervised learning models of primary cortical receptive fields and receptive field plasticity

8 0.10941751 258 nips-2011-Sparse Bayesian Multi-Task Learning

9 0.099682271 37 nips-2011-Analytical Results for the Error in Filtering of Gaussian Processes

10 0.097167842 219 nips-2011-Predicting response time and error rates in visual search

11 0.096746445 302 nips-2011-Variational Learning for Recurrent Spiking Networks

12 0.09272594 200 nips-2011-On the Analysis of Multi-Channel Neural Spike Data

13 0.092139453 86 nips-2011-Empirical models of spiking in neural populations

14 0.091649987 224 nips-2011-Probabilistic Modeling of Dependencies Among Visual Short-Term Memory Representations

15 0.089049451 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data

16 0.072106913 261 nips-2011-Sparse Filtering

17 0.07151752 273 nips-2011-Structural equations and divisive normalization for energy-dependent component analysis

18 0.071256422 13 nips-2011-A blind sparse deconvolution method for neural spike identification

19 0.064354874 34 nips-2011-An Unsupervised Decontamination Procedure For Improving The Reliability Of Human Judgments

20 0.063216515 276 nips-2011-Structured sparse coding via lateral inhibition


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.191), (1, 0.133), (2, 0.237), (3, -0.045), (4, 0.057), (5, 0.076), (6, 0.034), (7, 0.123), (8, -0.011), (9, -0.0), (10, -0.067), (11, 0.004), (12, 0.024), (13, -0.016), (14, 0.052), (15, 0.099), (16, 0.083), (17, -0.043), (18, 0.055), (19, -0.084), (20, -0.172), (21, -0.118), (22, 0.124), (23, -0.006), (24, 0.014), (25, 0.078), (26, -0.049), (27, 0.14), (28, 0.026), (29, 0.012), (30, 0.029), (31, 0.083), (32, -0.044), (33, 0.048), (34, 0.114), (35, 0.026), (36, -0.124), (37, -0.101), (38, -0.012), (39, 0.004), (40, -0.079), (41, -0.156), (42, 0.04), (43, 0.174), (44, -0.122), (45, -0.129), (46, -0.02), (47, -0.078), (48, 0.106), (49, -0.009)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92054445 44 nips-2011-Bayesian Spike-Triggered Covariance Analysis

Author: Jonathan W. Pillow, Il M. Park

Abstract: Neurons typically respond to a restricted number of stimulus features within the high-dimensional space of natural stimuli. Here we describe an explicit modelbased interpretation of traditional estimators for a neuron’s multi-dimensional feature space, which allows for several important generalizations and extensions. First, we show that traditional estimators based on the spike-triggered average (STA) and spike-triggered covariance (STC) can be formalized in terms of the “expected log-likelihood” of a Linear-Nonlinear-Poisson (LNP) model with Gaussian stimuli. This model-based formulation allows us to define maximum-likelihood and Bayesian estimators that are statistically consistent and efficient in a wider variety of settings, such as with naturalistic (non-Gaussian) stimuli. It also allows us to employ Bayesian methods for regularization, smoothing, sparsification, and model comparison, and provides Bayesian confidence intervals on model parameters. We describe an empirical Bayes method for selecting the number of features, and extend the model to accommodate an arbitrary elliptical nonlinear response function, which results in a more powerful and more flexible model for feature space inference. We validate these methods using neural data recorded extracellularly from macaque primary visual cortex. 1

2 0.86667669 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

Author: Alyson K. Fletcher, Sundeep Rangan, Lav R. Varshney, Aniruddha Bhargava

Abstract: Many functional descriptions of spiking neurons assume a cascade structure where inputs are passed through an initial linear filtering stage that produces a lowdimensional signal that drives subsequent nonlinear stages. This paper presents a novel and systematic parameter estimation procedure for such models and applies the method to two neural estimation problems: (i) compressed-sensing based neural mapping from multi-neuron excitation, and (ii) estimation of neural receptive fields in sensory neurons. The proposed estimation algorithm models the neurons via a graphical model and then estimates the parameters in the model using a recently-developed generalized approximate message passing (GAMP) method. The GAMP method is based on Gaussian approximations of loopy belief propagation. In the neural connectivity problem, the GAMP-based method is shown to be computational efficient, provides a more exact modeling of the sparsity, can incorporate nonlinearities in the output and significantly outperforms previous compressed-sensing methods. For the receptive field estimation, the GAMP method can also exploit inherent structured sparsity in the linear weights. The method is validated on estimation of linear nonlinear Poisson (LNP) cascade models for receptive fields of salamander retinal ganglion cells. 1

3 0.73025978 24 nips-2011-Active learning of neural response functions with Gaussian processes

Author: Mijung Park, Greg Horwitz, Jonathan W. Pillow

Abstract: A sizeable literature has focused on the problem of estimating a low-dimensional feature space for a neuron’s stimulus sensitivity. However, comparatively little work has addressed the problem of estimating the nonlinear function from feature space to spike rate. Here, we use a Gaussian process (GP) prior over the infinitedimensional space of nonlinear functions to obtain Bayesian estimates of the “nonlinearity” in the linear-nonlinear-Poisson (LNP) encoding model. This approach offers increased flexibility, robustness, and computational tractability compared to traditional methods (e.g., parametric forms, histograms, cubic splines). We then develop a framework for optimal experimental design under the GP-Poisson model using uncertainty sampling. This involves adaptively selecting stimuli according to an information-theoretic criterion, with the goal of characterizing the nonlinearity with as little experimental data as possible. Our framework relies on a method for rapidly updating hyperparameters under a Gaussian approximation to the posterior. We apply these methods to neural data from a color-tuned simple cell in macaque V1, characterizing its nonlinear response function in the 3D space of cone contrasts. We find that it combines cone inputs in a highly nonlinear manner. With simulated experiments, we show that optimal design substantially reduces the amount of data required to estimate these nonlinear combination rules. 1

4 0.65194213 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

Author: Yan Karklin, Eero P. Simoncelli

Abstract: Efficient coding provides a powerful principle for explaining early sensory coding. Most attempts to test this principle have been limited to linear, noiseless models, and when applied to natural images, have yielded oriented filters consistent with responses in primary visual cortex. Here we show that an efficient coding model that incorporates biologically realistic ingredients – input and output noise, nonlinear response functions, and a metabolic cost on the firing rate – predicts receptive fields and response nonlinearities similar to those observed in the retina. Specifically, we develop numerical methods for simultaneously learning the linear filters and response nonlinearities of a population of model neurons, so as to maximize information transmission subject to metabolic costs. When applied to an ensemble of natural images, the method yields filters that are center-surround and nonlinearities that are rectifying. The filters are organized into two populations, with On- and Off-centers, which independently tile the visual space. As observed in the primate retina, the Off-center neurons are more numerous and have filters with smaller spatial extent. In the absence of noise, our method reduces to a generalized version of independent components analysis, with an adapted nonlinear “contrast” function; in this case, the optimal filters are localized and oriented.

5 0.6190691 298 nips-2011-Unsupervised learning models of primary cortical receptive fields and receptive field plasticity

Author: Maneesh Bhand, Ritvik Mudur, Bipin Suresh, Andrew Saxe, Andrew Y. Ng

Abstract: The efficient coding hypothesis holds that neural receptive fields are adapted to the statistics of the environment, but is agnostic to the timescale of this adaptation, which occurs on both evolutionary and developmental timescales. In this work we focus on that component of adaptation which occurs during an organism’s lifetime, and show that a number of unsupervised feature learning algorithms can account for features of normal receptive field properties across multiple primary sensory cortices. Furthermore, we show that the same algorithms account for altered receptive field properties in response to experimentally altered environmental statistics. Based on these modeling results we propose these models as phenomenological models of receptive field plasticity during an organism’s lifetime. Finally, due to the success of the same models in multiple sensory areas, we suggest that these algorithms may provide a constructive realization of the theory, first proposed by Mountcastle [1], that a qualitatively similar learning algorithm acts throughout primary sensory cortices. 1

6 0.46652064 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

7 0.44631061 34 nips-2011-An Unsupervised Decontamination Procedure For Improving The Reliability Of Human Judgments

8 0.41581643 224 nips-2011-Probabilistic Modeling of Dependencies Among Visual Short-Term Memory Representations

9 0.41248426 219 nips-2011-Predicting response time and error rates in visual search

10 0.40714785 244 nips-2011-Selecting Receptive Fields in Deep Networks

11 0.38368288 13 nips-2011-A blind sparse deconvolution method for neural spike identification

12 0.35741618 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data

13 0.3557826 37 nips-2011-Analytical Results for the Error in Filtering of Gaussian Processes

14 0.35516834 2 nips-2011-A Brain-Machine Interface Operating with a Real-Time Spiking Neural Network Control Algorithm

15 0.34841499 200 nips-2011-On the Analysis of Multi-Channel Neural Spike Data

16 0.34658474 243 nips-2011-Select and Sample - A Model of Efficient Neural Inference and Learning

17 0.32516125 258 nips-2011-Sparse Bayesian Multi-Task Learning

18 0.3241376 269 nips-2011-Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning

19 0.3237761 86 nips-2011-Empirical models of spiking in neural populations

20 0.31880277 104 nips-2011-Generalized Beta Mixtures of Gaussians


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.012), (4, 0.024), (20, 0.024), (26, 0.012), (31, 0.07), (33, 0.017), (43, 0.459), (45, 0.078), (57, 0.044), (65, 0.03), (74, 0.044), (83, 0.064), (84, 0.011), (99, 0.04)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.98509371 26 nips-2011-Additive Gaussian Processes

Author: David K. Duvenaud, Hannes Nickisch, Carl E. Rasmussen

Abstract: We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks. 1

2 0.95852953 146 nips-2011-Learning Higher-Order Graph Structure with Features by Structure Penalty

Author: Shilin Ding, Grace Wahba, Xiaojin Zhu

Abstract: In discrete undirected graphical models, the conditional independence of node labels Y is specified by the graph structure. We study the case where there is another input random vector X (e.g. observed features) such that the distribution P (Y | X) is determined by functions of X that characterize the (higher-order) interactions among the Y ’s. The main contribution of this paper is to learn the graph structure and the functions conditioned on X at the same time. We prove that discrete undirected graphical models with feature X are equivalent to multivariate discrete models. The reparameterization of the potential functions in graphical models by conditional log odds ratios of the latter offers advantages in representation of the conditional independence structure. The functional spaces can be flexibly determined by kernels. Additionally, we impose a Structure Lasso (SLasso) penalty on groups of functions to learn the graph structure. These groups with overlaps are designed to enforce hierarchical function selection. In this way, we are able to shrink higher order interactions to obtain a sparse graph structure. 1

same-paper 3 0.95514262 44 nips-2011-Bayesian Spike-Triggered Covariance Analysis

Author: Jonathan W. Pillow, Il M. Park

Abstract: Neurons typically respond to a restricted number of stimulus features within the high-dimensional space of natural stimuli. Here we describe an explicit modelbased interpretation of traditional estimators for a neuron’s multi-dimensional feature space, which allows for several important generalizations and extensions. First, we show that traditional estimators based on the spike-triggered average (STA) and spike-triggered covariance (STC) can be formalized in terms of the “expected log-likelihood” of a Linear-Nonlinear-Poisson (LNP) model with Gaussian stimuli. This model-based formulation allows us to define maximum-likelihood and Bayesian estimators that are statistically consistent and efficient in a wider variety of settings, such as with naturalistic (non-Gaussian) stimuli. It also allows us to employ Bayesian methods for regularization, smoothing, sparsification, and model comparison, and provides Bayesian confidence intervals on model parameters. We describe an empirical Bayes method for selecting the number of features, and extend the model to accommodate an arbitrary elliptical nonlinear response function, which results in a more powerful and more flexible model for feature space inference. We validate these methods using neural data recorded extracellularly from macaque primary visual cortex. 1

4 0.95207244 282 nips-2011-The Fast Convergence of Boosting

Author: Matus J. Telgarsky

Abstract: This manuscript considers the convergence rate of boosting under a large class of losses, including the exponential and logistic losses, where the best previous rate of convergence was O(exp(1/✏2 )). First, it is established that the setting of weak learnability aids the entire class, granting a rate O(ln(1/✏)). Next, the (disjoint) conditions under which the infimal empirical risk is attainable are characterized in terms of the sample and weak learning class, and a new proof is given for the known rate O(ln(1/✏)). Finally, it is established that any instance can be decomposed into two smaller instances resembling the two preceding special cases, yielding a rate O(1/✏), with a matching lower bound for the logistic loss. The principal technical hurdle throughout this work is the potential unattainability of the infimal empirical risk; the technique for overcoming this barrier may be of general interest. 1

5 0.93889433 264 nips-2011-Sparse Recovery with Brownian Sensing

Author: Alexandra Carpentier, Odalric-ambrym Maillard, Rémi Munos

Abstract: We consider the problem of recovering the parameter α ∈ RK of a sparse function f (i.e. the number of non-zero entries of α is small compared to the number K of features) given noisy evaluations of f at a set of well-chosen sampling points. We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals, which produces a Gaussian sensing matrix, for which good recovery properties are proven, independently on the number of sampling points N , even when the features are arbitrarily non-orthogonal. Under the assumption that f is H¨ lder continuous with exponent at least √ we proo 1/2, vide an estimate α of the parameter such that �α − α�2 = O(�η�2 / N ), where � � η is the observation noise. The method uses a set of sampling points uniformly distributed along a one-dimensional curve selected according to the features. We report numerical experiments illustrating our method. 1

6 0.93857849 288 nips-2011-Thinning Measurement Models and Questionnaire Design

7 0.77538562 117 nips-2011-High-Dimensional Graphical Model Selection: Tractable Graph Families and Necessary Conditions

8 0.75485128 123 nips-2011-How biased are maximum entropy models?

9 0.75253314 281 nips-2011-The Doubly Correlated Nonparametric Topic Model

10 0.75170118 203 nips-2011-On the accuracy of l1-filtering of signals with block-sparse structure

11 0.73508018 195 nips-2011-On Learning Discrete Graphical Models using Greedy Methods

12 0.73077744 132 nips-2011-Inferring Interaction Networks using the IBP applied to microRNA Target Prediction

13 0.72708154 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

14 0.72645241 24 nips-2011-Active learning of neural response functions with Gaussian processes

15 0.72504109 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

16 0.71340811 273 nips-2011-Structural equations and divisive normalization for energy-dependent component analysis

17 0.71315646 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

18 0.71128565 296 nips-2011-Uniqueness of Belief Propagation on Signed Graphs

19 0.70130205 83 nips-2011-Efficient inference in matrix-variate Gaussian models with \iid observation noise

20 0.70051485 294 nips-2011-Unifying Framework for Fast Learning Rate of Non-Sparse Multiple Kernel Learning