nips nips2013 nips2013-41 knowledge-graph by maker-knowledge-mining

41 nips-2013-Approximate inference in latent Gaussian-Markov models from continuous time observations


Source: pdf

Author: Botond Cseke, Manfred Opper, Guido Sanguinetti

Abstract: We propose an approximate inference algorithm for continuous time Gaussian Markov process models with both discrete and continuous time likelihoods. We show that the continuous time limit of the expectation propagation algorithm exists and results in a hybrid fixed point iteration consisting of (1) expectation propagation updates for discrete time terms and (2) variational updates for the continuous time term. We introduce post-inference correction methods that improve on the marginals of the approximation. This approach extends the classical Kalman-Bucy smoothing procedure to non-Gaussian observations, enabling continuous-time inference in a variety of models, including spiking neuronal models (state-space models with point process observations) and box likelihood models. Experimental results on real and simulated data demonstrate high distributional accuracy and significant computational savings compared to discrete-time approaches in a neural application.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Approximate inference in latent Gaussian-Markov models from continuous time observations Botond Cseke1 Manfred Opper2 School of Informatics University of Edinburgh, U. [sent-1, score-0.201]

2 Abstract We propose an approximate inference algorithm for continuous time Gaussian Markov process models with both discrete and continuous time likelihoods. [sent-8, score-0.359]

3 We show that the continuous time limit of the expectation propagation algorithm exists and results in a hybrid fixed point iteration consisting of (1) expectation propagation updates for discrete time terms and (2) variational updates for the continuous time term. [sent-9, score-0.427]

4 We introduce post-inference correction methods that improve on the marginals of the approximation. [sent-10, score-0.106]

5 This approach extends the classical Kalman-Bucy smoothing procedure to non-Gaussian observations, enabling continuous-time inference in a variety of models, including spiking neuronal models (state-space models with point process observations) and box likelihood models. [sent-11, score-0.162]

6 1 Introduction Continuous time stochastic processes provide a flexible and popular framework for data modelling in a broad spectrum of scientific and engineering disciplines. [sent-13, score-0.1]

7 Other scenarios give intrinsically continuous time observations: for example, sensors monitoring the transit of a particle through a barrier provide continuous time data on the particle’s position. [sent-18, score-0.232]

8 In this paper, we propose an expectation-propagation (EP)-type algorithm [Opper and Winther, 2000, Minka, 2001] for latent diffusion processes observed in either discrete or continuous time. [sent-20, score-0.24]

9 We derive fixed-point update equations by considering a continuous time limit of the parallel EP algorithm [e. [sent-21, score-0.188]

10 Opper and Winther, 2005, Cseke and Heskes, 2011b]: these fixed point updates naturally become differential equations in the continuous time limit. [sent-23, score-0.224]

11 Remarkably, we show that, in the presence of continuous time observations, the update equations for the EP algorithm reduce to updates for a variational Gaussian approximation [Archambeau et al. [sent-24, score-0.233]

12 We also generalise to the continuous-time limit the EP correction scheme of [Cseke and Heskes, 2011b], which enable us to capture some of the non-Gaussian behaviour of the time marginals. [sent-26, score-0.064]

13 The process can be observed (noisily) both at discrete time points, and for continuous time intervals; we will partition the observations into $y^d_{t_i}, t_i \in T_d$ and $y^c_t, t \in [0, 1]$ accordingly. [sent-29, score-0.406]

14 We assume that the likelihood function admits the general formulation $p(\{y^d_{t_i}\}_i, \{y^c_t\} \mid \{x_t\}) \propto \prod_{t_i \in T_d} p(y^d_{t_i} \mid x_{t_i}) \times \exp\big(-\int_0^1 dt\, V(t, y^c_t, x_t)\big)$ (2). [sent-30, score-0.263]

15 We refer to $p(y^d_{t_i} \mid x_{t_i})$ and $V(t, y^c_t, x_t)$ as the discrete time likelihood term and the continuous time loss function, respectively. [sent-31, score-0.384]

16 We notice that, using Girsanov’s theorem and Ito’s lemma, non-linear diffusion equations with constant (diagonal) diffusion matrix can be re-written in the form (1)-(2), provided the drift can be obtained as the gradient of a potential function [e. [sent-32, score-0.162]

17 Our aim is to propose approximate inference methods to compute the marginals $p(x_t \mid \{y^d_{t_i}\}_i, \{y^c_t\})$ of the posterior distribution $p(\{x_t\}_t \mid \{y^d_{t_i}\}_i, \{y^c_t\}) \propto p(\{y^d_{t_i}\}_i, \{y^c_t\} \mid \{x_t\}) \times p_0(\{x_t\})$. [sent-35, score-0.149]

18 1 Exact inference in Gaussian models We start from the exact case of Gaussian observations and quadratic loss function. [sent-37, score-0.077]

19 The linearity of equation (1) implies that the marginal distributions of the process at every time point are Gaussian (assuming Gaussian initial conditions). [sent-38, score-0.133]

20 The time evolution of the marginal mean $m_t$ and covariance $V_t$ is governed by the pair of differential equations [Gardiner, 2002] $\frac{d}{dt} m_t = A_t m_t + c_t$ and $\frac{d}{dt} V_t = A_t V_t + V_t A_t^T + B_t$ (3). [sent-39, score-0.32]
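
A minimal sketch of how the moment equations (3) could be integrated forward in time with a simple Euler scheme is given below; the function names and the Euler step are assumptions introduced for illustration (the experiments reported later use an RK4 integrator), not the authors' implementation.

```python
import numpy as np

def integrate_moments(A, c, B, m0, V0, dt, K):
    """Euler integration of the moment ODEs (3):
    dm_t/dt = A_t m_t + c_t,   dV_t/dt = A_t V_t + V_t A_t^T + B_t.
    A, c, B are callables returning the (possibly time-varying) parameters;
    this is an illustrative sketch only."""
    m, V = np.array(m0, dtype=float), np.array(V0, dtype=float)
    means, covs = [m.copy()], [V.copy()]
    for k in range(K):
        t = k * dt
        At, ct, Bt = A(t), c(t), B(t)
        m = m + dt * (At @ m + ct)
        V = V + dt * (At @ V + V @ At.T + Bt)
        means.append(m.copy())
        covs.append(V.copy())
    return np.array(means), np.array(covs)
```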

21 In the case of Gaussian observations and a quadratic loss function $V(t, y^c_t, x_t) = \mathrm{const.}$ [sent-40, score-0.241]

22 $-\, x_t^T h^c_t + \tfrac{1}{2} x_t^T Q^c_t x_t$, these equations, together with their backward analogues, enable an exact recursive inference algorithm, known as the Kalman-Bucy smoother [e. [sent-41, score-0.399]

23 This algorithm arises because we can recast the loss function as an auxiliary (observation) process $dy^c_t = x_t\, dt + R_t^{1/2} dW_t$ (4), where $R_t = (Q^c_t)^{-1}$ and $R_t^{-1}\, dy^c_t/dt = h^c_t$. [sent-44, score-0.349]

24 The Kalman-Bucy algorithm computes the posterior marginal means and covariances by solving the differential equations in a forward-backward fashion. [sent-46, score-0.193]

25 The exact form of the equations as well as the variational derivation of the Kalman-Bucy problem are given in Section B of the Supplementary Material. [sent-48, score-0.079]

26 2 Approximate inference In this section we use an Euler discretisation of the prior and the continuous time likelihood to turn our model into a multivariate latent Gaussian model. [sent-50, score-0.273]

27 We review the EP algorithm for such models and then we show that when taking the limit ∆t → 0 the updates of the EP algorithm exist. [sent-51, score-0.067]

28 The resulting approximate posterior process is again an OU process and we compute its parameters. [sent-52, score-0.143]

29 Finally, we show how corrections to the marginals proposed [Cseke and Heskes, 2011b] can be extended to the continuous time case. [sent-53, score-0.211]

30 , tK−1 , tK = 1} be a discretisation of the [0, 1] interval and let the matrix x = [xt1 , . [sent-59, score-0.076]

31 , xtK ] represent the process {xt }t using the discretisation given by T . [sent-62, score-0.117]

32 Consequently we approximate our model by the latent Gaussian model $p(\{y^d_{t_i}\}_i, y^c, x) = p_0(x) \times \prod_i p(y^d_{t_i} \mid x_{t_i}) \times \prod_k \exp\big(-\Delta t_k\, V(t_k, y^c_{t_k}, x_{t_k})\big)$, where we remark that the prior $p_0$ has a block-diagonal precision structure. [sent-68, score-0.706]

33 To simplify notation, in the following we use the aliases $\phi_d(x_{t_i}) = p(y^d_{t_i} \mid x_{t_i})$ and $\phi_c(x_{t_k}; \Delta t_k) = \exp\big(-\Delta t_k\, V(t_k, y^c_{t_k}, x_{t_k})\big)$. [sent-69, score-0.665]
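
As a concrete reading of the discretised factorisation above, the sketch below evaluates the two groups of log factors on a grid; the argument conventions and names are assumptions made purely for illustration.

```python
def log_likelihood_terms(x, times, dts, discrete_terms, V):
    """Illustrative evaluation of the Euler-discretised likelihood:
    sum_i log phi_d(x_{t_i}) + sum_k log phi_c(x_{t_k}; dt_k),
    with log phi_c(x_{t_k}; dt_k) = -dt_k * V(t_k, x_{t_k}).
    `discrete_terms` maps a grid index i to a log-likelihood function
    of x[i]; the prior term log p_0(x) is not included here."""
    log_d = sum(loglik(x[i]) for i, loglik in discrete_terms.items())
    log_c = -sum(dt * V(t, x[k]) for k, (t, dt) in enumerate(zip(times, dts)))
    return log_d + log_c
```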

34 2 Inference using expectation propagation Expectation propagation [Opper and Winther, 2000, Minka, 2001] is a well known algorithm that provides good approximations of the posterior marginals in latent Gaussian models. [sent-72, score-0.098]

35 Cseke and Heskes, 2011b]; similar continuous time limiting arguments can be made for the original (sequential) EP approach. [sent-75, score-0.105]

36 , 2005]; this alternative approach can also be shown to extend to the continuous time limit (see Section A. [sent-82, score-0.142]

37 Equation (5) does not depend on the time discretisation, and hence provides a valid update equation also working directly with the continuous time process. [sent-84, score-0.156]

38 On the other hand, the quantities in equation (6) depend explicitly on ∆tk , and it is necessary to ensure that they remain well defined (and computable) in the continuous time limit. [sent-85, score-0.129]

39 By using this notation we can rewrite (6) as $[\lambda^c_{t_k}]^{\mathrm{new}} = \lambda^c_{t_k} + \frac{1}{\Delta t_k}\big(\mathrm{Collapse}(q_c(x_{t_k}); f) - \mathrm{Collapse}(q_0(x_{t_k}); f)\big)$ (7), with $q_c(x_{t_k}) \propto \exp\big(-\Delta t_k [V(t_k, x_{t_k}) + \lambda^c_{t_k} \cdot f(x_{t_k})]\big)\, q_0(x_{t_k})$ (8). [sent-90, score-0.703]

40 By direct Taylor expansion of $\mathrm{Collapse}(q_c(x_{t_k}); f)$ one can show that the update equation (7) remains finite when we take the limit $\Delta t_k \to 0$. [sent-92, score-0.061]
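
For intuition, here is a one-dimensional sketch of the Collapse (moment-matching) operation on a tilted Gaussian, computed with Gauss-Hermite quadrature; the finite-$\Delta t_k$ update (7) then compares the collapsed moments of $q_c$ and $q_0$ scaled by $1/\Delta t_k$. The quadrature approach and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def collapse_1d(log_tilt, mean, var, n_quad=30):
    """Moment-match ('Collapse') the unnormalised tilted density
    q(x) ∝ exp(log_tilt(x)) N(x; mean, var) onto a Gaussian,
    using Gauss-Hermite quadrature (1-d illustrative sketch)."""
    z, w = hermegauss(n_quad)            # nodes/weights for weight exp(-z^2/2)
    x = mean + np.sqrt(var) * z
    lw = log_tilt(x)
    lw -= lw.max()                       # for numerical stability
    wt = w * np.exp(lw)
    Z = wt.sum()
    m = (wt * x).sum() / Z
    v = (wt * (x - m) ** 2).sum() / Z
    return m, v

# Example: tilt a standard Gaussian by a soft-box style loss over a small step
dt = 1e-3
m_c, v_c = collapse_1d(lambda x: -dt * (2 * x) ** 8, 0.0, 1.0)
```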

41 3 Continuous time limit of the update equations Let $\mu_{t_k} = \mathrm{Collapse}(q_0(x_{t_k}); f)$ and denote by $Z(\Delta t_k, \mu_{t_k})$ and $Z(\mu_{t_k})$ the normalisation constants of $q_c(x_{t_k})$ and $q_0(x_{t_k})$, respectively. [sent-96, score-0.223]

42 The notation emphasises that qc (xtk ) differs from q0 (xtk ) by a term dependent on the granularity of the discretisation ∆tk . [sent-97, score-0.189]

43 From the definition of $q_c(x_{t_k})$ in equation (8) we then have that its first two moments can be computed as $\partial_{\mu_{t_k}} \log Z(\Delta t_k, \mu_{t_k})$. [sent-99, score-0.173]

44 By substituting (12) into (7), we take the limit $\Delta t_k \to 0$ and obtain the update equations $[\lambda^c_t]^{\mathrm{new}} = -J_{\Psi}(\mu_t)\, \partial_{\mu_t} \langle V(t, x_t) \rangle_{q_0(x_t)}$ for all $t \in [0, 1]$. [sent-108, score-0.236]
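
The continuous-time update above needs Gaussian expectations of the loss and their gradients with respect to the moment parameters; a one-dimensional quadrature-plus-finite-difference sketch is shown below. The quadrature and numerical differentiation are assumptions for illustration, not the derivation used in the paper.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gauss_expect(V, m, v, n_quad=30):
    """<V(x)> under N(m, v), by Gauss-Hermite quadrature (1-d sketch)."""
    z, w = hermegauss(n_quad)
    return (w * V(m + np.sqrt(v) * z)).sum() / w.sum()

def dmean_gauss_expect(V, m, v, eps=1e-5):
    """Numerical derivative of <V(x)>_{N(m,v)} with respect to the mean m;
    in the update equations this kind of gradient (with respect to the full
    moment parameters) is mapped through the Jacobian J_Psi."""
    return (gauss_expect(V, m + eps, v) - gauss_expect(V, m - eps, v)) / (2 * eps)
```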

45 Algorithmically, computing the marginal moments and covariances of the discretised Gaussian q0 (x) in (9) can be done by solving a sparse linear system and doing partial matrix inversion using the Cholesky factorisation and the Takahashi equations as in Cseke and Heskes [2011b]. [sent-111, score-0.19]

46 This corresponds to a junction tree algorithm on a (block) chain graph [Davis, 2006] which, in the continuous time limit, can be reduced to a set of differential equations due to the chain structure of the graph. [sent-112, score-0.194]

47 Readers familiar with the fractional free energies and the power EP algorithm may notice that the time lag ∆tk plays a similar role as the fractional or power parameter α. [sent-117, score-0.069]

48 It is a well-known property that in the α → 0 limit the algorithm and the free energy collapse to the variational ones [e. [sent-118, score-0.07]

49 Wiegerinck and Heskes, 2003, Cseke and Heskes, 2011a] and thus, intuitively, the collapse and the existence of the limit is related to this property. [sent-120, score-0.189]

50 The algorithm performs well in the comfort zone of EP, that is, log-concave discrete likelihood terms and convex loss. [sent-125, score-0.063]

51 4 Parameters of the approximating OU process The fixed point iteration scheme computes only the marginal means and covariances of q0 ({xt }) and it does not provide a parametric OU process as an approximation. [sent-133, score-0.147]

52 That is, if $q^*(\{x_t\})$ minimises $D[q_0(\{x_t\}) \,\|\, q^*(\{x_t\})]$, then the parameters of $q^*$ are given by $A^*_t = A_t - B_t [V^{bw}_t]^{-1}$, $c^*_t = c_t + B_t [V^{bw}_t]^{-1} m^{bw}_t$ and $B^*_t = B_t$ (15), where $m^{bw}_t$ and $V^{bw}_t$ are computed by the backward Kalman-Bucy filtering equations. [sent-135, score-0.1]
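
Given the backward moments, the mapping (15) to the parameters of the approximating OU process is a direct matrix computation; a hedged sketch (names and calling convention are assumptions) is:

```python
import numpy as np

def ou_posterior_params(A_t, c_t, B_t, m_bw, V_bw):
    """Posterior OU parameters from (15):
    A*_t = A_t - B_t [V_bw]^{-1},  c*_t = c_t + B_t [V_bw]^{-1} m_bw,  B*_t = B_t,
    where (m_bw, V_bw) are the backward Kalman-Bucy moments at time t."""
    gain = B_t @ np.linalg.inv(V_bw)        # B_t [V_bw]^{-1}
    return A_t - gain, c_t + gain @ m_bw, B_t
```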

53 5 Corrections to the marginals In this section we extend the factorised correction method for multivariate latent Gaussian models introduced in Cseke and Heskes [2011b] to continuous time observations. [sent-140, score-0.183]

54 To begin with, we focus on the corrections from the continuous time observation process. [sent-146, score-0.171]

55 By removing the Gaussian terms (with canonical parameters $\lambda^c_{t_k}$) from the approximate posterior and replacing them with the exact likelihood, we can rewrite the exact discretised posterior as $p(x) \propto q_0(x) \times \exp\big(-\sum_k \Delta t_k [V(t_k, x_{t_k}) + \lambda^c_{t_k} \cdot f(x_{t_k})]\big)$. [sent-147, score-0.801]

56 The exact posterior marginal at time $t_j$ is thus given by $p(x_{t_j}) \propto q_0(x_{t_j}) \times \exp\big(-\Delta t_j [V(t_j, x_{t_j}) + \lambda^c_{t_j} \cdot f(x_{t_j})]\big) \times c_T(x_{t_j})$, with $c_T(x_{t_j}) = \int dx_{\setminus t_j}\, q_0(x_{\setminus t_j} \mid x_{t_j}) \times \exp\big(-\sum_{k \neq j} \Delta t_k [V(t_k, x_{t_k}) + \lambda^c_{t_k} \cdot f(x_{t_k})]\big)$. [sent-148, score-0.872]

57 By approximating the joint conditional $q_0(x_{\setminus t_j} \mid x_{t_j})$ with a product of its marginals and taking the $\Delta t_k \to 0$ limit, we obtain $c(x_t) \approx \exp\big(-\int_0^1 ds\, \langle V(s, x_s) + \lambda^c_s \cdot f(x_s) \rangle_{q_0(x_s \mid x_t)}\big)$. [sent-150, score-0.099]

58 The evaluations for a fixed xt are also linear in time. [sent-153, score-0.153]
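
A one-dimensional sketch of evaluating this correction factor on a time grid is given below; the conditional moments of $q_0(x_s \mid x_t)$ are assumed to be supplied by the smoother, and the Riemann sum and quadrature are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def log_correction(x_t, s_grid, ds, cond_moments, V, lam, f, n_quad=30):
    """Sketch of the factorised correction:
    log c(x_t) ≈ - sum_s ds * < V(s, x_s) + lam(s) * f(x_s) >_{q0(x_s | x_t)},
    where cond_moments(s, x_t) returns the mean and variance of the
    Gaussian conditional q0(x_s | x_t) (assumed available)."""
    z, w = hermegauss(n_quad)
    w = w / w.sum()
    total = 0.0
    for s in s_grid:
        m_s, v_s = cond_moments(s, x_t)
        xs = m_s + np.sqrt(v_s) * z
        total += ds * (w * (V(s, xs) + lam(s) * f(xs))).sum()
    return -total
```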

59 The continuous time potential is defined as $V(t, x_t) = (2x_t)^8\, I_{[1/2, 2/3]}(t)$ and we assume two hard box discrete likelihood terms I[−0. [sent-169, score-0.201]

60 The prior is defined by the parameters $a_t = -1$, $c_t = 4\pi \cos(4\pi t)$ and $b_t = 4$. [sent-174, score-0.099]

61 The left panel shows the prior’s and the posterior approximation’s marginal means and standard deviations. [sent-175, score-0.123]

62 The right panel shows the marginal approximations at t = 0. [sent-176, score-0.084]

63 3351, a region where we expect the corrections to be strongly influenced by both types of likelihoods. [sent-177, score-0.066]

64 Samples were generated using the lag $\Delta t = 10^{-3}$; the approximate inference was run using RK4 at $\Delta t = 10^{-4}$. [sent-178, score-0.07]
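
The caption mentions that the approximate inference was integrated with RK4; a generic classical fourth-order Runge-Kutta step of the kind that could drive the moment equations is sketched below, purely as an illustration and not as the authors' code.

```python
def rk4_step(f, t, y, dt):
    """One classical RK4 step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt / 2 * k1)
    k3 = f(t + dt / 2, y + dt / 2 * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
```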

65 1 Experiments Inference in a (soft) box The first example we consider is a mixed discrete-continuous time inference under box and soft box likelihood observations respectively. [sent-180, score-0.273]

66 We consider a diffusing particle on the line under an OU prior process of the form $dx_t = (-a x_t + c_t)\,dt + \sqrt{b}\, dW_t$ with $a = -1$, $c_t = 4\pi \cos(4\pi t)$ and $b = 4$. [sent-181, score-0.183]

67 The likelihood model is given by the loss function $V(t, x_t) = (2x_t)^8$ for all $t \in [1/2, 2/3]$ and 0 otherwise, effectively confining the process to a narrow strip near zero (soft box). [sent-182, score-0.219]

68 This likelihood is therefore an approximation to physically realistic situations where particles can perform diffusion in a confined environment. [sent-183, score-0.083]

69 The box has hard gates: two discrete time likelihoods given by the indicator functions I[−0. [sent-184, score-0.144]
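
To make the soft-box setup concrete, here is a minimal sketch of the continuous-time loss and a generic hard-gate indicator term; the gate bounds are left as arguments because the text truncates them, and the function names are assumptions introduced for illustration.

```python
import numpy as np

def V_soft_box(t, x):
    """Continuous-time loss of the first experiment:
    V(t, x) = (2x)^8 for t in [1/2, 2/3], and 0 otherwise."""
    return (2.0 * x) ** 8 if 0.5 <= t <= 2.0 / 3.0 else 0.0

def hard_gate_loglik(x, lo, hi):
    """Log of a box (indicator) discrete-time likelihood I_[lo, hi](x);
    the actual gate bounds used in the paper are not reproduced here."""
    return 0.0 if lo <= x <= hi else -np.inf
```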

70 The right panel in Figure 1 shows the marginal approximations at a time point shortly after the “gate” to the box; these are: (i) sampling (grey), (ii) the Gaussian EP approximation (blue line), and (iii) its corrected version (red line). [sent-190, score-0.136]

71 The time point was chosen as we expect the strongest non-Gaussian effects to be felt near the discrete likelihoods; the corrected distribution does indeed show strong skewness. [sent-191, score-0.09]

72 We emphasise that this is an approximation to the model, hence the benchmark is not a true gold standard; however, we are not aware of sampling schemes that would be able to perform inference under the exact continuous time likelihood. [sent-194, score-0.153]

73 2 Log Gaussian Cox processes Another family of models where one encounters continuous time likelihoods is point processes; these processes find wide application in a number of disciplines, from neuroscience Smith and Brown [2003] to conflict modelling Zammit-Mangion et al. [sent-199, score-0.275]

74 We assume that we have a multivariate log Gaussian Cox process model [Kingman, 1992]: this is defined by a d-variate Ornstein-Uhlenbeck process $\{x_t\}_t$ Figure 2: A toy example for the point process model in Section 3. [sent-201, score-0.123]

75 The prior means and standard deviations, the sampled process path, and the sampled events are shown on the left panel while the posterior approximations are shown on the right panel. [sent-205, score-0.123]

76 The likelihood of this point process model is formed by both discrete time (point probabilities) and continuous time (void probability) terms. [sent-211, score-0.236]
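
The structure of that likelihood — point terms at the observed events plus a void-probability (integral) term — is the standard log Gaussian Cox / inhomogeneous Poisson form; a minimal sketch under that assumption is given below (names and the Riemann-sum approximation are illustrative).

```python
import numpy as np

def lgcp_loglik(event_times, t_grid, dt, log_intensity):
    """Sketch of a log Gaussian Cox process log-likelihood:
    sum_i log lambda(t_i)  -  integral_0^T lambda(t) dt,
    with the integral (void-probability term) approximated by a Riemann sum.
    `log_intensity` would itself be a function of the latent OU path."""
    point_terms = np.sum(log_intensity(np.asarray(event_times)))
    void_term = np.sum(np.exp(log_intensity(t_grid))) * dt
    return point_terms - void_term
```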

77 The results are shown in Figure 2, with four colours distinguishing the four processes. [sent-217, score-0.153]

78 The left panel shows prior processes (mean ± standard deviation), sample paths and (bottom row) the sampled points (i. [sent-218, score-0.09]

79 The right panel shows the corresponding approximations of the posterior processes. [sent-221, score-0.129]

80 3 Point process modelling of neural spike trains In a third example we consider continuous time point process inference for spike time recordings from a population of neurons. [sent-224, score-0.345]

81 This type of data is frequently modelled using (discrete time) state-space models with point process observations (SSPP) [Smith and Brown, 2003, Zammit Mangion et al. [sent-225, score-0.089]

82 org, consisting of recordings of spiking patterns of taste response cells in Sprague-Dawley rats during presentation of different taste stimuli. [sent-230, score-0.106]

83 The recordings are 10s each at a resolution of $10^{-3}$ s, and four different taste stimuli: (i) NaCl, (ii) Quinine HCl, (iii) Quinine HCl, and (iv) Sucrose are presented to the subjects for the duration of the first 5s of the 10s recording window. [sent-231, score-0.068]

84 We modelled the spike train recordings by univariate log Gaussian Cox process models (see Section 3. [sent-232, score-0.098]

85 We use the variational EM algorithm (discrete time likelihoods are Gaussian) to learn the prior. [sent-234, score-0.091]

86 The top-left, bottom-left and centre panels show the intensity fit, event count and the Q-Q plot corresponding to one of the recordings, whereas the right panel shows the learned c and µ parameters for all spike trains in cell 9. [sent-240, score-0.07]

87 The right panel shows an emergent pattern of stimulus based clustering of µ and c as in Zammit Mangion et al. [sent-244, score-0.062]

88 , 2011] are usually forced to take very fine time discretisation by the requirement that at most one spike happens during one time step. [sent-247, score-0.157]

89 Our continuous time approach, on the other hand, handles uneven observations naturally. [sent-249, score-0.134]

90 4 Conclusion Inference methodologies for continuous time stochastic processes are a subject of intense research, in both fundamental and applied settings. [sent-250, score-0.152]

91 This paper contributes a novel approach which allows inference from both discrete time and continuous time observations. [sent-251, score-0.218]

92 Our results show that the method is effective in accurately reconstructing marginal posterior distributions, and can be deployed effectively on real world problems. [sent-252, score-0.08]

93 , 2012] that optimal control problems can be recast in inference terms: in many cases, the relevant inference problem is of the same type as the one considered here, hence this methodology could in principle also be used in control problems. [sent-254, score-0.133]

94 The method is based on the parallel EP formulation of Cseke and Heskes [2011b]: interestingly, we show that the EP updates from continuous time observations collapse to variational updates [Archambeau et al. [sent-255, score-0.398]

95 Furthermore, the EP perspective allows us to compute corrections to the Gaussian marginals; in our experiments, these turned out to be highly accurate. [sent-259, score-0.066]

96 Our modelling framework assumes a latent linear diffusion process; however, as mentioned before, some non-linear diffusion processes are equivalent to posterior processes for OU processes observed in continuous time [Øksendal, 2010]. [sent-260, score-0.446]

97 Our approach, hence, can also be viewed as a method for accurate marginal computations in (a class of) nonlinear diffusion processes observed with noise. [sent-261, score-0.146]

98 In general, all non-linear diffusion processes can be recast in a form similar to the one considered here; the important difference though is that the continuous time likelihood is in general an Ito integral, not a regular integral. [sent-262, score-0.272]

99 In the future, it would be interesting to explore the extension of this approach to general non-linear diffusion processes, as well as discrete and hybrid stochastic processes [Rao and Teh, 2012, Ocone et al. [sent-263, score-0.184]

100 Online variational inference for state-space models with point-process observations. [sent-442, score-0.081]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('xtk', 0.59), ('tk', 0.5), ('xti', 0.344), ('xt', 0.153), ('collapse', 0.152), ('heskes', 0.13), ('qc', 0.113), ('yti', 0.107), ('opper', 0.107), ('ou', 0.103), ('ep', 0.103), ('cseke', 0.094), ('xtj', 0.087), ('continuous', 0.078), ('discretisation', 0.076), ('corrections', 0.066), ('ck', 0.062), ('diffusion', 0.058), ('archambeau', 0.05), ('ct', 0.05), ('mangion', 0.049), ('winther', 0.049), ('ytk', 0.049), ('zammit', 0.049), ('bt', 0.049), ('inference', 0.048), ('box', 0.048), ('processes', 0.047), ('equations', 0.046), ('hc', 0.045), ('gaussian', 0.043), ('discretised', 0.043), ('differential', 0.043), ('panel', 0.043), ('canonical', 0.042), ('process', 0.041), ('marginal', 0.041), ('marginals', 0.04), ('posterior', 0.039), ('td', 0.038), ('discrete', 0.038), ('taste', 0.038), ('hcl', 0.037), ('quinine', 0.037), ('recast', 0.037), ('vtbw', 0.037), ('wiegerinck', 0.037), ('limit', 0.037), ('moments', 0.036), ('tj', 0.036), ('yt', 0.036), ('vt', 0.036), ('variational', 0.033), ('xs', 0.033), ('ito', 0.033), ('likelihoods', 0.031), ('updates', 0.03), ('mt', 0.03), ('recordings', 0.03), ('observations', 0.029), ('di', 0.028), ('time', 0.027), ('minka', 0.027), ('spike', 0.027), ('exp', 0.026), ('modelling', 0.026), ('cox', 0.026), ('corrected', 0.025), ('likelihood', 0.025), ('dyt', 0.025), ('gardiner', 0.025), ('kadirkamanathan', 0.025), ('kappen', 0.025), ('ksendal', 0.025), ('mbw', 0.025), ('nacl', 0.025), ('ocone', 0.025), ('qdi', 0.025), ('sucrose', 0.025), ('equation', 0.024), ('covariances', 0.024), ('dt', 0.023), ('smith', 0.023), ('ti', 0.023), ('approximate', 0.022), ('cos', 0.022), ('hybrid', 0.022), ('particle', 0.022), ('rkk', 0.022), ('dwt', 0.022), ('hdi', 0.022), ('lorenzo', 0.022), ('void', 0.022), ('fractional', 0.021), ('dxt', 0.02), ('rao', 0.02), ('oxford', 0.019), ('latent', 0.019), ('et', 0.019), ('factorised', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 41 nips-2013-Approximate inference in latent Gaussian-Markov models from continuous time observations

Author: Botond Cseke, Manfred Opper, Guido Sanguinetti

Abstract: We propose an approximate inference algorithm for continuous time Gaussian Markov process models with both discrete and continuous time likelihoods. We show that the continuous time limit of the expectation propagation algorithm exists and results in a hybrid fixed point iteration consisting of (1) expectation propagation updates for discrete time terms and (2) variational updates for the continuous time term. We introduce post-inference correction methods that improve on the marginals of the approximation. This approach extends the classical Kalman-Bucy smoothing procedure to non-Gaussian observations, enabling continuous-time inference in a variety of models, including spiking neuronal models (state-space models with point process observations) and box likelihood models. Experimental results on real and simulated data demonstrate high distributional accuracy and significant computational savings compared to discrete-time approaches in a neural application.

2 0.20575467 100 nips-2013-Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture

Author: Trevor Campbell, Miao Liu, Brian Kulis, Jonathan P. How, Lawrence Carin

Abstract: This paper presents a novel algorithm, based upon the dependent Dirichlet process mixture model (DDPMM), for clustering batch-sequential data containing an unknown number of evolving clusters. The algorithm is derived via a lowvariance asymptotic analysis of the Gibbs sampling algorithm for the DDPMM, and provides a hard clustering with convergence guarantees similar to those of the k-means algorithm. Empirical results from a synthetic test with moving Gaussian clusters and a test with real ADS-B aircraft trajectory data demonstrate that the algorithm requires orders of magnitude less computational time than contemporary probabilistic and hard clustering algorithms, while providing higher accuracy on the examined datasets. 1

3 0.17624664 39 nips-2013-Approximate Gaussian process inference for the drift function in stochastic differential equations

Author: Andreas Ruttor, Philipp Batz, Manfred Opper

Abstract: We introduce a nonparametric approach for estimating drift functions in systems of stochastic differential equations from sparse observations of the state vector. Using a Gaussian process prior over the drift as a function of the state vector, we develop an approximate EM algorithm to deal with the unobserved, latent dynamics between observations. The posterior over states is approximated by a piecewise linearized process of the Ornstein-Uhlenbeck type and the MAP estimation of the drift is facilitated by a sparse Gaussian process regression. 1

4 0.089218281 62 nips-2013-Causal Inference on Time Series using Restricted Structural Equation Models

Author: Jonas Peters, Dominik Janzing, Bernhard Schölkopf

Abstract: Causal inference uses observational data to infer the causal structure of the data generating system. We study a class of restricted Structural Equation Models for time series that we call Time Series Models with Independent Noise (TiMINo). These models require independent residual time series, whereas traditional methods like Granger causality exploit the variance of residuals. This work contains two main contributions: (1) Theoretical: By restricting the model class (e.g. to additive noise) we provide general identifiability results. They cover lagged and instantaneous effects that can be nonlinear and unfaithful, and non-instantaneous feedbacks between the time series. (2) Practical: If there are no feedback loops between time series, we propose an algorithm based on non-linear independence tests of time series. We show empirically that when the data are causally insufficient or the model is misspecified, the method avoids incorrect answers. We extend the theoretical and the algorithmic part to situations in which the time series have been measured with different time delays. TiMINo is applied to artificial and real data and code is provided. 1

5 0.08343982 266 nips-2013-Recurrent linear models of simultaneously-recorded neural populations

Author: Marius Pachitariu, Biljana Petreska, Maneesh Sahani

Abstract: Population neural recordings with long-range temporal structure are often best understood in terms of a common underlying low-dimensional dynamical process. Advances in recording technology provide access to an ever-larger fraction of the population, but the standard computational approaches available to identify the collective dynamics scale poorly with the size of the dataset. We describe a new, scalable approach to discovering low-dimensional dynamics that underlie simultaneously recorded spike trains from a neural population. We formulate the Recurrent Linear Model (RLM) by generalising the Kalman-filter-based likelihood calculation for latent linear dynamical systems to incorporate a generalised-linear observation process. We show that RLMs describe motor-cortical population data better than either directly-coupled generalised-linear models or latent linear dynamical system models with generalised-linear observations. We also introduce the cascaded generalised-linear model (CGLM) to capture low-dimensional instantaneous correlations in neural populations. The CGLM describes the cortical recordings better than either Ising or Gaussian models and, like the RLM, can be fit exactly and quickly. The CGLM can also be seen as a generalisation of a lowrank Gaussian model, in this case factor analysis. The computational tractability of the RLM and CGLM allow both to scale to very high-dimensional neural data. 1

6 0.079243816 48 nips-2013-Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

7 0.078229032 269 nips-2013-Regression-tree Tuning in a Streaming Setting

8 0.073795795 124 nips-2013-Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting

9 0.068082243 112 nips-2013-Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising

10 0.066364288 24 nips-2013-Actor-Critic Algorithms for Risk-Sensitive MDPs

11 0.062188063 1 nips-2013-(More) Efficient Reinforcement Learning via Posterior Sampling

12 0.059916951 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking

13 0.059625633 348 nips-2013-Variational Policy Search via Trajectory Optimization

14 0.059338871 89 nips-2013-Dimension-Free Exponentiated Gradient

15 0.05569353 17 nips-2013-A multi-agent control framework for co-adaptation in brain-computer interfaces

16 0.051990718 103 nips-2013-Efficient Exploration and Value Function Generalization in Deterministic Systems

17 0.051782783 127 nips-2013-Generalized Denoising Auto-Encoders as Generative Models

18 0.047949724 49 nips-2013-Bayesian Inference and Online Experimental Design for Mapping Neural Microcircuits

19 0.047882374 193 nips-2013-Mixed Optimization for Smooth Functions

20 0.047676004 221 nips-2013-On the Expressive Power of Restricted Boltzmann Machines


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.131), (1, -0.002), (2, 0.013), (3, -0.043), (4, -0.082), (5, 0.071), (6, 0.086), (7, 0.072), (8, 0.08), (9, -0.101), (10, -0.057), (11, -0.103), (12, -0.066), (13, 0.024), (14, -0.081), (15, 0.022), (16, 0.003), (17, 0.025), (18, 0.032), (19, 0.013), (20, 0.055), (21, 0.018), (22, 0.036), (23, -0.025), (24, -0.04), (25, -0.011), (26, -0.073), (27, -0.038), (28, -0.036), (29, 0.021), (30, 0.008), (31, -0.097), (32, 0.02), (33, 0.034), (34, -0.058), (35, -0.112), (36, -0.04), (37, -0.029), (38, -0.073), (39, 0.013), (40, -0.04), (41, 0.006), (42, -0.025), (43, 0.062), (44, -0.045), (45, 0.102), (46, -0.025), (47, 0.039), (48, 0.017), (49, 0.084)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92704648 41 nips-2013-Approximate inference in latent Gaussian-Markov models from continuous time observations

Author: Botond Cseke, Manfred Opper, Guido Sanguinetti

Abstract: We propose an approximate inference algorithm for continuous time Gaussian Markov process models with both discrete and continuous time likelihoods. We show that the continuous time limit of the expectation propagation algorithm exists and results in a hybrid fixed point iteration consisting of (1) expectation propagation updates for discrete time terms and (2) variational updates for the continuous time term. We introduce postinference corrections methods that improve on the marginals of the approximation. This approach extends the classical Kalman-Bucy smoothing procedure to non-Gaussian observations, enabling continuous-time inference in a variety of models, including spiking neuronal models (state-space models with point process observations) and box likelihood models. Experimental results on real and simulated data demonstrate high distributional accuracy and significant computational savings compared to discrete-time approaches in a neural application. 1

2 0.71903282 39 nips-2013-Approximate Gaussian process inference for the drift function in stochastic differential equations

Author: Andreas Ruttor, Philipp Batz, Manfred Opper

Abstract: We introduce a nonparametric approach for estimating drift functions in systems of stochastic differential equations from sparse observations of the state vector. Using a Gaussian process prior over the drift as a function of the state vector, we develop an approximate EM algorithm to deal with the unobserved, latent dynamics between observations. The posterior over states is approximated by a piecewise linearized process of the Ornstein-Uhlenbeck type and the MAP estimation of the drift is facilitated by a sparse Gaussian process regression. 1

3 0.65439343 17 nips-2013-A multi-agent control framework for co-adaptation in brain-computer interfaces

Author: Josh S. Merel, Roy Fox, Tony Jebara, Liam Paninski

Abstract: In a closed-loop brain-computer interface (BCI), adaptive decoders are used to learn parameters suited to decoding the user’s neural response. Feedback to the user provides information which permits the neural tuning to also adapt. We present an approach to model this process of co-adaptation between the encoding model of the neural signal and the decoding algorithm as a multi-agent formulation of the linear quadratic Gaussian (LQG) control problem. In simulation we characterize how decoding performance improves as the neural encoding and adaptive decoder optimize, qualitatively resembling experimentally demonstrated closed-loop improvement. We then propose a novel, modified decoder update rule which is aware of the fact that the encoder is also changing and show it can improve simulated co-adaptation dynamics. Our modeling approach offers promise for gaining insights into co-adaptation as well as improving user learning of BCI control in practical settings.

4 0.6440919 62 nips-2013-Causal Inference on Time Series using Restricted Structural Equation Models

Author: Jonas Peters, Dominik Janzing, Bernhard Schölkopf

Abstract: Causal inference uses observational data to infer the causal structure of the data generating system. We study a class of restricted Structural Equation Models for time series that we call Time Series Models with Independent Noise (TiMINo). These models require independent residual time series, whereas traditional methods like Granger causality exploit the variance of residuals. This work contains two main contributions: (1) Theoretical: By restricting the model class (e.g. to additive noise) we provide general identifiability results. They cover lagged and instantaneous effects that can be nonlinear and unfaithful, and non-instantaneous feedbacks between the time series. (2) Practical: If there are no feedback loops between time series, we propose an algorithm based on non-linear independence tests of time series. We show empirically that when the data are causally insufficient or the model is misspecified, the method avoids incorrect answers. We extend the theoretical and the algorithmic part to situations in which the time series have been measured with different time delays. TiMINo is applied to artificial and real data and code is provided. 1

5 0.62324786 48 nips-2013-Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

Author: Roger Frigola, Fredrik Lindsten, Thomas B. Schon, Carl Rasmussen

Abstract: State-space models are successfully used in many areas of science, engineering and economics to model time series and dynamical systems. We present a fully Bayesian approach to inference and learning (i.e. state estimation and system identification) in nonlinear nonparametric state-space models. We place a Gaussian process prior over the state transition dynamics, resulting in a flexible model able to capture complex dynamical phenomena. To enable efficient inference, we marginalize over the transition dynamics function and, instead, infer directly the joint smoothing distribution using specially tailored Particle Markov Chain Monte Carlo samplers. Once a sample from the smoothing distribution is computed, the state transition predictive distribution can be formulated analytically. Our approach preserves the full nonparametric expressivity of the model and can make use of sparse Gaussian processes to greatly reduce computational complexity. 1

6 0.62111902 100 nips-2013-Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture

7 0.60598773 266 nips-2013-Recurrent linear models of simultaneously-recorded neural populations

8 0.5275771 127 nips-2013-Generalized Denoising Auto-Encoders as Generative Models

9 0.51006436 298 nips-2013-Small-Variance Asymptotics for Hidden Markov Models

10 0.49838951 24 nips-2013-Actor-Critic Algorithms for Risk-Sensitive MDPs

11 0.48195201 53 nips-2013-Bayesian inference for low rank spatiotemporal neural receptive fields

12 0.47212029 280 nips-2013-Robust Data-Driven Dynamic Programming

13 0.43087548 198 nips-2013-More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server

14 0.4261117 269 nips-2013-Regression-tree Tuning in a Streaming Setting

15 0.40407178 348 nips-2013-Variational Policy Search via Trajectory Optimization

16 0.39705729 89 nips-2013-Dimension-Free Exponentiated Gradient

17 0.39695337 18 nips-2013-A simple example of Dirichlet process mixture inconsistency for the number of components

18 0.38402319 86 nips-2013-Demixing odors - fast inference in olfaction

19 0.36562911 45 nips-2013-BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables

20 0.36433628 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(16, 0.056), (33, 0.087), (34, 0.116), (41, 0.031), (49, 0.059), (56, 0.07), (65, 0.344), (70, 0.027), (85, 0.038), (89, 0.022), (93, 0.033), (95, 0.014)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.72261786 41 nips-2013-Approximate inference in latent Gaussian-Markov models from continuous time observations

Author: Botond Cseke, Manfred Opper, Guido Sanguinetti

Abstract: We propose an approximate inference algorithm for continuous time Gaussian Markov process models with both discrete and continuous time likelihoods. We show that the continuous time limit of the expectation propagation algorithm exists and results in a hybrid fixed point iteration consisting of (1) expectation propagation updates for discrete time terms and (2) variational updates for the continuous time term. We introduce post-inference correction methods that improve on the marginals of the approximation. This approach extends the classical Kalman-Bucy smoothing procedure to non-Gaussian observations, enabling continuous-time inference in a variety of models, including spiking neuronal models (state-space models with point process observations) and box likelihood models. Experimental results on real and simulated data demonstrate high distributional accuracy and significant computational savings compared to discrete-time approaches in a neural application.

2 0.63396281 195 nips-2013-Modeling Clutter Perception using Parametric Proto-object Partitioning

Author: Chen-Ping Yu, Wen-Yu Hua, Dimitris Samaras, Greg Zelinsky

Abstract: Visual clutter, the perception of an image as being crowded and disordered, affects aspects of our lives ranging from object detection to aesthetics, yet relatively little effort has been made to model this important and ubiquitous percept. Our approach models clutter as the number of proto-objects segmented from an image, with proto-objects defined as groupings of superpixels that are similar in intensity, color, and gradient orientation features. We introduce a novel parametric method of clustering superpixels by modeling mixture of Weibulls on Earth Mover’s Distance statistics, then taking the normalized number of proto-objects following partitioning as our estimate of clutter perception. We validated this model using a new 90-image dataset of real world scenes rank ordered by human raters for clutter, and showed that our method not only predicted clutter extremely well (Spearman’s ρ = 0.8038, p < 0.001), but also outperformed all existing clutter perception models and even a behavioral object segmentation ground truth. We conclude that the number of proto-objects in an image affects clutter perception more than the number of objects or features. 1

3 0.59774178 214 nips-2013-On Algorithms for Sparse Multi-factor NMF

Author: Siwei Lyu, Xin Wang

Abstract: Nonnegative matrix factorization (NMF) is a popular data analysis method, the objective of which is to approximate a matrix with all nonnegative components into the product of two nonnegative matrices. In this work, we describe a new simple and efficient algorithm for multi-factor nonnegative matrix factorization (mfNMF) problem that generalizes the original NMF problem to more than two factors. Furthermore, we extend the mfNMF algorithm to incorporate a regularizer based on the Dirichlet distribution to encourage the sparsity of the components of the obtained factors. Our sparse mfNMF algorithm affords a closed form and an intuitive interpretation, and is more efficient in comparison with previous works that use fix point iterations. We demonstrate the effectiveness and efficiency of our algorithms on both synthetic and real data sets. 1

4 0.50290501 93 nips-2013-Discriminative Transfer Learning with Tree-based Priors

Author: Nitish Srivastava, Ruslan Salakhutdinov

Abstract: High capacity classifiers, such as deep neural networks, often struggle on classes that have very few training examples. We propose a method for improving classification performance for such classes by discovering similar classes and transferring knowledge among them. Our method learns to organize the classes into a tree hierarchy. This tree structure imposes a prior over the classifier’s parameters. We show that the performance of deep neural networks can be improved by applying these priors to the weights in the last layer. Our method combines the strength of discriminatively trained deep neural networks, which typically require large amounts of training data, with tree-based priors, making deep neural networks work well on infrequent classes as well. We also propose an algorithm for learning the underlying tree structure. Starting from an initial pre-specified tree, this algorithm modifies the tree to make it more pertinent to the task being solved, for example, removing semantic relationships in favour of visual ones for an image classification task. Our method achieves state-of-the-art classification results on the CIFAR-100 image data set and the MIR Flickr image-text data set. 1

5 0.46378627 262 nips-2013-Real-Time Inference for a Gamma Process Model of Neural Spiking

Author: David Carlson, Vinayak Rao, Joshua T. Vogelstein, Lawrence Carin

Abstract: With simultaneous measurements from ever increasing populations of neurons, there is a growing need for sophisticated tools to recover signals from individual neurons. In electrophysiology experiments, this classically proceeds in a two-step process: (i) threshold the waveforms to detect putative spikes and (ii) cluster the waveforms into single units (neurons). We extend previous Bayesian nonparametric models of neural spiking to jointly detect and cluster neurons using a Gamma process model. Importantly, we develop an online approximate inference scheme enabling real-time analysis, with performance exceeding the previous state-of-theart. Via exploratory data analysis—using data with partial ground truth as well as two novel data sets—we find several features of our model collectively contribute to our improved performance including: (i) accounting for colored noise, (ii) detecting overlapping spikes, (iii) tracking waveform dynamics, and (iv) using multiple channels. We hope to enable novel experiments simultaneously measuring many thousands of neurons and possibly adapting stimuli dynamically to probe ever deeper into the mysteries of the brain. 1

6 0.45970571 238 nips-2013-Optimistic Concurrency Control for Distributed Unsupervised Learning

7 0.4581067 77 nips-2013-Correlations strike back (again): the case of associative memory retrieval

8 0.45790303 121 nips-2013-Firing rate predictions in optimal balanced networks

9 0.4564321 86 nips-2013-Demixing odors - fast inference in olfaction

10 0.45333433 148 nips-2013-Latent Maximum Margin Clustering

11 0.45301282 303 nips-2013-Sparse Overlapping Sets Lasso for Multitask Learning and its Application to fMRI Analysis

12 0.45192009 287 nips-2013-Scalable Inference for Logistic-Normal Topic Models

13 0.45177284 104 nips-2013-Efficient Online Inference for Bayesian Nonparametric Relational Models

14 0.45056519 5 nips-2013-A Deep Architecture for Matching Short Texts

15 0.45021784 278 nips-2013-Reward Mapping for Transfer in Long-Lived Agents

16 0.44990197 347 nips-2013-Variational Planning for Graph-based MDPs

17 0.44981402 58 nips-2013-Binary to Bushy: Bayesian Hierarchical Clustering with the Beta Coalescent

18 0.44952372 141 nips-2013-Inferring neural population dynamics from multiple partial recordings of the same neural circuit

19 0.44886923 173 nips-2013-Least Informative Dimensions

20 0.4486765 229 nips-2013-Online Learning of Nonparametric Mixture Models via Sequential Variational Approximation