nips nips2005 nips2005-64 knowledge-graph by maker-knowledge-mining

64 nips-2005-Efficient estimation of hidden state dynamics from spike trains


Source: pdf

Author: Marton G. Danoczy, Richard H. R. Hahnloser

Abstract: Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. To estimate the time course of such state dynamics from single- or multiple neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. We find that the two state-conditional ISI functions are highly similar to the ones measured during waking and singing, respectively. 1

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Efficient estimation of hidden state dynamics from spike trains. Márton G. Danoczy [sent-1, score-0.862]

2 Abstract: Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. [sent-13, score-0.563]

3 To estimate the time course of such state dynamics from single- or multiple neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. [sent-14, score-1.101]

4 Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. [sent-15, score-0.225]

5 As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. [sent-16, score-0.262]

6 1 Introduction: It is well known that neurons can suddenly change firing statistics to reflect a macroscopic change of a nervous system. [sent-18, score-0.13]

7 Often, firing changes are not accompanied by an immediate behavioural change, as is the case, for example, in paralysed patients, during sleep [1], during covert discriminative processing [2], and for all in-vitro studies [3]. [sent-19, score-0.162]

8 In all of these cases, changes in some hidden macroscopic state can only be detected by close inspection of single or multiple spike trains. [sent-20, score-0.724]

9 Our goal is to develop a powerful, but computationally simple tool for point processes such as spike trains. [sent-21, score-0.284]

10 From spike train data, we want to extract the continuously evolving hidden variables, assuming a discrete set of possible states. [sent-22, score-0.557]

11 Our model for classifying spikes into discrete hidden states is based on three assumptions: [sent-23, score-0.32]

12 1. Hidden states form a continuous-time Markov process and thus have exponentially distributed lifetimes. [sent-24, score-0.142]

13 2. State switching can occur only at the time of a spike (where there is observable evidence for a new state). [sent-25, score-0.431]

14 3. In each of the hidden states, spike trains are generated by mutually independent renewal processes. [sent-27, score-0.695]

15 For a continuous-time Markov process, the probability of staying in state $S = i$ for a time interval $T > t$ is given by $P_i(t) = \exp(-r_i t)$, where $r_i$ is the escape rate (or hazard rate) of state i. [sent-29, score-0.629]

16 The mean lifetime $\tau_i$ is defined as the inverse of the escape rate, $\tau_i = 1/r_i$. [sent-30, score-0.236]

17 As a corollary, it follows that the probability of staying in state i for a particular duration equals the probability of surviving for a fraction of that duration times the probability of surviving for the remaining time; i.e., [sent-31, score-0.421]

18 the state survival probability $P_i(t)$ satisfies the product identity $P_i(t_1 + t_2) = P_i(t_1)\,P_i(t_2)$. [sent-33, score-0.223]
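This memoryless product identity is easy to check numerically; the sketch below uses made-up escape rates, not values from the paper:

```python
import numpy as np

# Invented escape rates for two hidden states (1/s); mean lifetimes are their inverses.
r = np.array([0.72, 0.40])
tau = 1.0 / r

def survival(i, t):
    """P_i(t) = exp(-r_i * t): probability of staying in state i for at least t seconds."""
    return np.exp(-r[i] * t)

t1, t2 = 0.3, 1.1
for i in range(len(r)):
    lhs = survival(i, t1 + t2)
    rhs = survival(i, t1) * survival(i, t2)
    print(f"state {i}: tau = {tau[i]:.2f} s, P(t1+t2) = {lhs:.4f}, P(t1)*P(t2) = {rhs:.4f}")
```

The two printed products agree for each state, which is exactly the identity above.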

19 According to the second assumption, state switching can occur at any spike, irrespective of which neuron fired the spike. [sent-35, score-0.444]

20 In the following, we shall refer to a spike fired by any of the neurons as an event (where state switching might occur). [sent-36, score-0.73]

21 Note that if two (or more) neurons happen to fire a spike at exactly the same time, the respective spikes are regarded as two (or more) distinct events. [sent-37, score-0.458]

22 Combining the first two assumptions, we formulate the hidden state sequence at the events (i.e., at the event times $t_e$). [sent-39, score-0.442]

23 Accordingly, the probability of remaining in state i for the duration of the inter-event interval (IEI) $\Delta t_e = t_e - t_{e-1}$ is given by the state survival probability $P_i(\Delta t_e)$. [sent-42, score-0.81]

24 The probability of changing state is then $1 - P_i(\Delta t_e)$. [sent-43, score-0.192]

25 In each state i, the spike trains are assumed to be generated by a renewal process that randomly draws interspike-intervals (ISIs) t from a probability density function (pdf) $h_i(t)$. [sent-45, score-0.67]

26 The conditional intensity function (CIF) $\lambda_i(\varphi)$ is defined as the probability density of spiking in the time interval $[\varphi, \varphi + dt]$, given that no spike has occurred in the interval $[0, \varphi)$ since the last spike. [sent-48, score-0.374]

27 Its argument $\varphi$, i.e. the time that has elapsed since the last spike, shall be referred to as phase [5]. [sent-51, score-0.122]

28 Using the CIF, the ISI pdf can be expressed by the fundamental equation of renewal theory, $h_i(t) = \exp\!\left(-\int_0^t \lambda_i(\varphi)\, d\varphi\right) \lambda_i(t)$ (1). [sent-52, score-0.168]
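Equation 1 can be evaluated directly once the CIF is tabulated on a phase grid; the sketch below uses an arbitrary, made-up CIF purely for illustration, not a fit from the paper:

```python
import numpy as np

# Phase grid (s) and an arbitrary illustrative CIF lambda_i(phi), in spikes/s.
phi = np.linspace(0.0, 0.5, 5001)
lam = 5.0 + 80.0 * np.exp(-0.5 * ((phi - 0.05) / 0.01) ** 2)

# h_i(t) = exp(-integral_0^t lambda_i(phi) dphi) * lambda_i(t)   (equation 1)
cum = np.concatenate(([0.0], np.cumsum(0.5 * (lam[1:] + lam[:-1]) * np.diff(phi))))
isi_pdf = np.exp(-cum) * lam

# A pdf should integrate to roughly one over the covered support.
print("integral of h_i(t):", np.trapz(isi_pdf, phi))
```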

29 At each event e, we observe the phase trajectory of every neuron traced out since the last event. [sent-53, score-0.438]

30 It is clear that in multiple electrode recordings the phase trajectories between events are not independent, since they have to start where the previous trajectory ended. [sent-54, score-0.166]

31 Mixed-state Markov models are generalizations of HMMs in that the observable outputs may depend not only on the current hidden state, but also on past observations (formally, the mixed state is formed by combining the hidden and observable states). [sent-57, score-0.778]

32 In our model, hidden state transition probabilities are characterized by the escape rates $r_i$, and observable state transition probabilities by the CIFs $\lambda_i^n$ for neuron n in hidden state i. [sent-58, score-1.612]

33 2 Transition probabilities: The mixed state at event e shall be composed of the hidden state $S_e$ and the observable outputs $O_e^n$ (for neurons $n \in \{1, \ldots, N\}$). [sent-62, score-0.893]

34 Hidden state transitions: In classical mixed-state Markov models, the hidden state transition probabilities are constant. [sent-66, score-0.756]

35 In our model, however, we describe time as a continuous quantity and observe the system whenever a spike occurs, thus at non-equidistant intervals. [sent-67, score-0.284]

36 Consequently, hidden state transitions depend explicitly on the elapsed time since the last observation, i.e., on the inter-event interval $\Delta t_e$. [sent-68, score-0.489]

37 The transition probability $a_{ij}(\Delta t_e)$ from hidden state i to hidden state j is then given by $a_{ij}(\Delta t_e) = \exp(-r_j \Delta t_e)$ if $i = j$, and $a_{ij}(\Delta t_e) = \left[1 - \exp(-r_j \Delta t_e)\right] g_{ij}$ otherwise (2), where $g_{ij}$ is the conditional probability of making a transition from state i into a new state j, given that $j \neq i$. [sent-71, score-1.488]

38 Thus, $g_{ij}$ has to satisfy the constraint $\sum_j g_{ij} = 1$, with $g_{ii} = 0$. [sent-72, score-0.14]
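For concreteness, here is a minimal sketch of assembling the hidden-state transition matrix of equation 2 under a row-stochastic reading (the survival factor uses the current state's escape rate, so each row sums to one); the escape rates and the matrix g below are invented values:

```python
import numpy as np

def hidden_transition_matrix(dt, r, g):
    """Row-stochastic reading of equation 2: from state i, stay with probability
    exp(-r_i * dt); otherwise leave and land in state j with probability g_ij."""
    stay = np.exp(-r * dt)               # survival probability of each state over dt
    a = (1.0 - stay)[:, None] * g        # off-diagonal: leave state i, pick j via g_ij
    np.fill_diagonal(a, stay)            # diagonal: remain in the same state
    return a

r = np.array([0.7, 0.4, 1.2])            # invented escape rates (1/s)
g = np.array([[0.0, 0.6, 0.4],           # invented g_ij: rows sum to 1, g_ii = 0
              [0.5, 0.0, 0.5],
              [0.3, 0.7, 0.0]])
A = hidden_transition_matrix(dt=0.15, r=r, g=g)
print(A)
print(A.sum(axis=1))                     # each row sums to one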

39 Observable state transitions: The observation at event e is defined as $O_e = \{\Phi_e^n, \nu_e\}$, where $\nu_e$ contains the index of the neuron that has triggered event e by emitting a spike, and $\Phi_e^n = (\inf \Phi_e^n, \sup \Phi_e^n]$ is the phase interval traced out by neuron n since its last spike. [sent-73, score-1.054]

40 After a spike, the phase of the respective neuron is immediately reset to zero. [sent-75, score-0.303]

41 The interval's bounds are thus defined by $\sup \Phi_e^n = \inf \Phi_e^n + \Delta t_e$, and $\inf \Phi_e^n = 0$ if $\nu_{e-1} = n$ and $\inf \Phi_e^n = \sup \Phi_{e-1}^n$ otherwise. [sent-76, score-0.13]
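This bookkeeping amounts to carrying, for every neuron, the phase traced out between consecutive events; a small sketch (the function and variable names are mine, not the paper's):

```python
import numpy as np

def phase_intervals(spike_times):
    """Merge per-neuron spike trains into one event train and return, for each event,
    the spiking neuron plus the interval (inf_phase, sup_phase] traced by every neuron."""
    events = sorted((t, n) for n, train in enumerate(spike_times) for t in train)
    sup_phase = np.zeros(len(spike_times))   # phases at the previous event
    prev_t = events[0][0]                    # the first event has a zero-length interval
    out = []
    for t, nu in events:
        inf_phase = sup_phase.copy()         # start where the previous trajectory ended
        sup_phase = inf_phase + (t - prev_t) # sup = inf + inter-event interval
        out.append((nu, inf_phase, sup_phase.copy()))
        sup_phase[nu] = 0.0                  # the spiking neuron's phase resets to zero
        prev_t = t
    return out

trains = [[0.10, 0.35, 0.50], [0.22, 0.48]]  # two toy spike trains (seconds)
for nu, lo, hi in phase_intervals(trains):
    print(f"event by neuron {nu}: inf = {lo.round(2)}, sup = {hi.round(2)}")
```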

42 The observable transition probability $p_i(O_e) = \Pr\{O_e \mid O_{e-1}, S_e = i\}$ is the probability of observing output $O_e$, given the previous output $O_{e-1}$ and the current hidden state $S_e$. [sent-77, score-0.66]

43 Note that in the case of a single-neuron recording, this expression (equation 3) reduces to the ISI pdf. [sent-80, score-0.181]

44 To give a closed form of the observable transition pdf, several approaches are conceivable. [sent-81, score-0.149]

45 Here, for the sake of flexibility and computational simplicity, we approximate the CIF $\lambda_i^n$ for neuron n in state i by a step function, assuming that its value is constant inside small, arbitrarily spaced bins $B^n(b)$, $b \in \{1, \ldots, N_\mathrm{bins}^n\}$. [sent-82, score-0.427]

46 In order to use the discretized CIFs $\lambda_i^n(b)$, we also discretize $\Phi_e^n$: the fractions $f_e^n(b) \in [0, 1]$ represent how much of neuron n's phase bin $B^n(b)$ has been traced out since the last event. [sent-87, score-0.27]

47 For example, if event $e-1$ happened in the middle of neuron n's phase bin 2 and event e happened ten percent into its phase bin 4, then $f_e^n(2) = 0.5$, $f_e^n(3) = 1$ and $f_e^n(4) = 0.1$, whereas $f_e^n(b) = 0$ for all other bins. [sent-88, score-1.248]

48 Making use of these discretizations, the integral in equation 3 is approximated by a sum: $p_i(O_e) \approx \prod_n \exp\!\left(-\sum_{b=1}^{N_\mathrm{bins}^n} f_e^n(b)\, \lambda_i^n(b)\, |B^n(b)|\right) \lambda_i^{\nu_e}\!\left(\sup \Phi_e^{\nu_e}\right)$ (4), with $|B^n(b)|$ denoting the width of neuron n's phase bin b. [sent-91, score-0.723]
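A hedged sketch of evaluating the per-event observable probability of equation 4; the bin widths, CIF values, and traced fractions below are toy numbers (only the fraction pattern for neuron 0 follows the worked example above):

```python
import numpy as np

def observable_prob(frac, lam, widths, nu, spike_bin):
    """Approximate p_i(O_e) as in equation 4.
    frac[n, b]  -- fraction f_e^n(b) of neuron n's phase bin b traced since the last event
    lam[n, b]   -- discretized CIF lambda_i^n(b) of the current hidden state i (spikes/s)
    widths[b]   -- widths |B^n(b)| of the phase bins (shared across neurons here, in s)
    nu          -- index of the neuron that emitted the spike at event e
    spike_bin   -- bin containing sup(Phi_e^nu), where the CIF is evaluated"""
    survival = np.exp(-np.sum(frac * lam * widths[None, :], axis=1))  # one factor per neuron
    return np.prod(survival) * lam[nu, spike_bin]

widths = np.array([0.005, 0.01, 0.02, 0.04])        # toy, roughly log-spaced bin widths (s)
lam    = np.array([[40.0, 25.0, 10.0,  5.0],        # toy CIF for neuron 0 in state i
                   [60.0, 30.0, 12.0,  6.0]])       # toy CIF for neuron 1 in state i
frac   = np.array([[0.0, 0.5, 1.0, 0.1],            # fraction pattern from the text example
                   [1.0, 1.0, 0.3, 0.0]])
print(observable_prob(frac, lam, widths, nu=0, spike_bin=3))
```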

49 Next, we apply the EM algorithm to find optimal values of the escape rates $r_i$, the conditional hidden state transition probabilities $g_{ij}$, and the discretized CIFs $\lambda_i^n(b)$, given a set of spike trains. [sent-93, score-1.039]

50 Figure 1: Two spike trains are combined to form the event train shown in the bottom row. [sent-97, score-0.92]

51 The phase bins are shown below the spike trains; they are labelled with the corresponding bin number. [sent-98, score-0.52]

52 As an example, for the second neuron, the fractions $f_e^2(b)$ of its phase bins that have been traced out since event $e-1$ are indicated by the horizontal arrow. [sent-99, score-0.462]

53 According to the EM algorithm, we can find such values by iterating over models, $\Psi^\mathrm{new} = \arg\max_{\psi} \sum_{S \in \mathcal{S}} \Pr\{S \mid O, \Psi^\mathrm{old}\}\, \ln \Pr\{S, O \mid \psi\}$ (5), where $\mathcal{S}$ is the set of all possible hidden state sequences. [sent-102, score-0.409]

54 Because of the logarithm in equation 5, the maximization over escape rates can be separated from the maximization over conditional intensity functions. [sent-104, score-0.198]

55 Finally, to obtain the reestimation formula for the conditional hidden state transition probabilities $g_{ij}$, we solve equation 8 using Lagrange multipliers, resulting in $g_{ij}^\mathrm{new} = \sum_e \xi_{ij}(e) \,\big/ \sum_{e,\, k \neq i} \xi_{ik}(e)$ for $i \neq j$. [sent-112, score-0.747]
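Assuming the pairwise posteriors $\xi_{ij}(e)$ have already been obtained from a forward-backward pass (they are random stand-ins below), the reestimation of $g_{ij}$ reduces to a ratio of sums:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_events = 2, 500
xi = rng.random((n_events, n_states, n_states))     # stand-in for xi_ij(e)
xi /= xi.sum(axis=(1, 2), keepdims=True)            # normalize per event

num = xi.sum(axis=0)                                 # sum over events e of xi_ij(e)
np.fill_diagonal(num, 0.0)                           # only i != j transitions count
den = num.sum(axis=1, keepdims=True)                 # sum over e and k != i of xi_ik(e)
g_new = num / den                                    # rows sum to 1, diagonal stays 0
print(g_new)
```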

56 4 Application to spike trains from the sleeping songbird: We have applied our model to spike train data from sleeping songbirds [9]. [sent-113, score-0.89]

57 It has been found that during sleep, neurons in vocal premotor area RA exhibit spontaneous activity that at times resembles premotor activity during singing [10, 9]. [sent-114, score-0.403]

58 We train our model on the spike train of a single RA neuron in the sleeping bird with $N_\mathrm{bins} = 100$, where the first bin extends from the sample time to 1 ms and the consecutive 99 steps are logarithmically spaced up to the largest ISI. [sent-115, score-0.801]
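One way such a binning could be constructed; the 20 kHz sample rate below is an assumption, not a value given in the text:

```python
import numpy as np

def make_phase_bins(sample_dt, max_isi, n_bins=100):
    """First bin spans [sample_dt, 1 ms); the remaining n_bins - 1 edges are
    logarithmically spaced between 1 ms and the largest observed ISI."""
    edges = np.concatenate(([sample_dt],
                            np.logspace(np.log10(1e-3), np.log10(max_isi), n_bins)))
    return edges                         # n_bins + 1 edges define n_bins bins

edges = make_phase_bins(sample_dt=1.0 / 20000.0, max_isi=2.5)   # assumed 20 kHz sampling
widths = np.diff(edges)
print(len(widths), widths[:3], widths[-1])
```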

59 After convergence, we find that the ISI pdfs associated with the two hidden states qualitatively agree with the pdfs recorded in the awake non-singing bird and the awake singing bird, respectively, Figure 2. [sent-116, score-0.71]

60 ISI pdfs were derived from the CIFs by using equation 1. [sent-117, score-0.121]

61 For the state-conditional ISI histograms we first ran the Viterbi algorithm to find the most likely hidden-state sequence and then sorted spikes into two groups, for which the ISI histograms were computed. [sent-118, score-0.228]

62 We find that sleep-related activity in the RA neuron of Figure 2 is best described by random switching between a singing-like state of lifetime $\tau_1 = 1.38$ s [sent-119, score-0.619]

63 and an awake, non-singing-like state with a lifetime $\tau_2$ of just over 2 s. [sent-121, score-0.336]

64 Standard deviations of lifetime estimates were computed by dividing the spike train into 30 data windows of 10s duration each and computing the Jackknife variance [11] on the truncated spike trains. [sent-124, score-0.826]
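A sketch of the delete-one jackknife used for those error bars; `estimate_lifetime` here is only a placeholder statistic standing in for a full model refit on the truncated data:

```python
import numpy as np

def jackknife_variance(windows, estimator):
    """Delete-one jackknife variance of `estimator` over a list of data windows."""
    n = len(windows)
    leave_one_out = np.array([
        estimator([w for j, w in enumerate(windows) if j != k]) for k in range(n)
    ])
    return (n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2)

# Placeholder estimator: mean inter-spike interval pooled over windows.
def estimate_lifetime(windows):
    isis = np.concatenate([np.diff(w) for w in windows if len(w) > 1])
    return isis.mean()

rng = np.random.default_rng(1)
windows = [np.sort(rng.uniform(0, 10, size=60)) for _ in range(30)]  # 30 toy 10-s windows
print("jackknife std:", np.sqrt(jackknife_variance(windows, estimate_lifetime)))
```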

65 The difference between the singing-like state in our model and the true singing ISI pdf shown in Figure 2 is more likely due to generally reduced burst rates during sleep, rather than to a particularity of the examined neuron. [sent-125, score-0.332]

66 By fitting two separate models (with identical phase binning) to the two spike trains, and after running the Viterbi algorithm to find the most likely hidden state sequences, we find good agreement between the two sequences, Figure 3 (top row) and 4c. [sent-127, score-0.816]

67 The correspondence of hidden state sequences suggests a common network mechanism for the generation of the singing-like states in both neurons. [sent-128, score-0.505]

68 We thus applied a single model to both spike trains and found again good agreement with hidden-state sequences determined for the separate models, Figure 3 (bottom row) and 4f. [sent-129, score-0.475]

69 The lifetime histograms for both states look approximately exponential, justifying our assumption for the state dynamics, Figure 4g and h. [sent-130, score-0.487]

70 For the model trained on neuron one we find lifetimes $\tau_1 = 0.45$ s, [sent-131, score-0.264]

71 and for the model trained on neuron two we find $\tau_1 = 0.$ [sent-135, score-0.181]

72 Figure 2: (a): The two state-conditional ISI histograms of an RA neuron during sleep are shown by the red and green curves, respectively. [sent-145, score-0.45]

73 (b): After waking up the bird by pinching his tail, the new ISI histogram shown by the gray area becomes almost indistinguishable from the ISI histogram of state 1 (green line). [sent-147, score-0.4]

74 (c): In comparison to the average ISI histogram of many RA neurons during singing (shown by the gray area, reproduced from [12]), the ISI histogram corresponding to state 2 (red line) is shifted to the right, but looks otherwise qualitatively similar. [sent-148, score-0.484]

75 The reason for this increase might be that evidence for the song-like state appears more frequently with two neurons, as a single neuron might not be able to indicate song-like firing statistics with high temporal fidelity. [sent-153, score-0.373]

76 We have also analysed the correlations between state dynamics in the different models. [sent-154, score-0.239]

77 The hidden state function S(t) is a binary function that equals one when in hidden state 1 and zero when in state 2. [sent-155, score-1.01]

78 For the case where we modelled the two spike trains separately, we have two such hidden state functions, $S_1(t)$ for neuron one and $S_2(t)$ for neuron two. [sent-156, score-1.177]
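To correlate two such binary state functions, one can sample them on a common time grid; the switch times below are made up:

```python
import numpy as np

def state_function(switch_times, t_grid, start_state=1):
    """Binary S(t): starts in `start_state` and flips at each switch time."""
    flips = np.searchsorted(np.asarray(switch_times), t_grid)
    return (start_state + flips) % 2

t = np.arange(0.0, 60.0, 0.01)                     # common 10 ms grid over 60 s
s1 = state_function([5.0, 9.2, 20.1, 33.0, 41.5], t)
s2 = state_function([5.3, 9.0, 21.0, 32.4, 41.9], t)
corr = np.corrcoef(s1, s2)[0, 1]
print(f"correlation between the two state functions: {corr:.3f}")
```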

79 5 Discussion: We have presented a mixed-state Markov model for point processes, assuming generation by random switching between renewal processes. [sent-161, score-0.143]

80 Our algorithm is suited for systems in which neurons make discrete state transitions simultaneously. [sent-162, score-0.34]

81 Previous attempts at fitting spike train data with Markov models exhibited weaknesses due to time binning. [sent-163, score-0.34]

82 With large time bins and the number of spikes per bin treated as observables [13, 14], state transitions can only be detected when they are accompanied by firing rate changes. [sent-164, score-0.471]

83 We were able to model the hidden states in continuous time, but had to bin the ISIs in order to deal with limited data. [sent-166, score-0.367]

84 The green areas show the times when in the first (awake-like) hidden state, and the red areas when in the song-like hidden state. [sent-171, score-0.521]

85 Figure 4: the panel axes show ISI [ms], state duration [ms], number of states, and correlation. [sent-173, score-0.157]

86 (d) and (e): ISI histograms (blue and yellow) for neurons 1 and 2, respectively, as well as state-conditional ISI histograms (red and green), computed as in Figure 2a. [sent-188, score-0.283]

87 (g) and (h): State lifetime histograms for the song-like state (red) and for the awake-like state (green). [sent-189, score-0.62]

88 Theoretical (exponential) histograms with escape rates r1 and r2 (fine black lines) show good agreement with the measured histograms, especially in F. [sent-190, score-0.184]

89 (c): Correlation between state functions of the two separate models. [sent-191, score-0.224]

90 (f): Correlation between the state functions of the combined model with separate model 1 (blue) and separate model 2 (yellow). [sent-192, score-0.293]

91 By estimating this lifetime, we hope it might be possible to form a link between the hidden states and the underlying physical process that governs the dynamics of switching. [sent-196, score-0.323]

92 Despite the apparent limitation of Poisson statistics, it is a simple matter to generalize our model to hidden state distributions with long tails (e.g. [sent-197, score-0.409]

93 power-law lifetime distributions): By cascading many hidden states into a chain (with fixed CIFs), a power-law distribution can be approximated by the combination of multiple exponentials with different lifetimes. [sent-199, score-0.42]

94 Replay and time compression of recurring spike sequences in the hippocampus. [sent-217, score-0.321]

95 Perceptual and motor processing stages identified in the activity of macaque frontal eye field neurons during visual search. [sent-228, score-0.13]

96 The time-rescaling theorem and its application to neural spike train data analysis. [sent-245, score-0.34]

97 Forecasting probability densities by using hidden Markov models with mixed states. [sent-257, score-0.217]

98 A tutorial on hidden Markov models and selected applications in speech recognition. [sent-267, score-0.217]

99 Song replay during sleep and computational rules for sensorimotor vocal learning. [sent-283, score-0.193]

100 Analysis, classification, and coding of multielectrode spike trains with hidden Markov models. [sent-305, score-0.623]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('isi', 0.444), ('te', 0.337), ('spike', 0.284), ('hidden', 0.217), ('state', 0.192), ('neuron', 0.181), ('oe', 0.162), ('cifs', 0.145), ('lifetime', 0.144), ('trains', 0.122), ('ra', 0.116), ('fe', 0.11), ('isis', 0.104), ('pi', 0.102), ('pr', 0.1), ('neurons', 0.099), ('bn', 0.094), ('escape', 0.092), ('histograms', 0.092), ('bin', 0.091), ('phase', 0.091), ('sleep', 0.09), ('se', 0.088), ('event', 0.084), ('cif', 0.083), ('gnew', 0.083), ('ifr', 0.083), ('lifetimes', 0.083), ('nbins', 0.083), ('pdfs', 0.083), ('singing', 0.082), ('traced', 0.082), ('observable', 0.076), ('ms', 0.075), ('transition', 0.073), ('renewal', 0.072), ('sleeping', 0.072), ('switching', 0.071), ('gi', 0.07), ('markov', 0.068), ('sup', 0.065), ('ring', 0.063), ('jackknife', 0.062), ('vocal', 0.062), ('bird', 0.061), ('states', 0.059), ('duration', 0.058), ('pdf', 0.058), ('train', 0.056), ('bins', 0.054), ('premotor', 0.049), ('transitions', 0.049), ('red', 0.048), ('dynamics', 0.047), ('awake', 0.046), ('interval', 0.045), ('spikes', 0.044), ('poisson', 0.042), ('recordings', 0.042), ('accompanied', 0.041), ('fen', 0.041), ('fractions', 0.041), ('iei', 0.041), ('reestimation', 0.041), ('replay', 0.041), ('rnew', 0.041), ('surviving', 0.041), ('ri', 0.041), ('correlation', 0.04), ('gray', 0.039), ('green', 0.039), ('equation', 0.038), ('combined', 0.037), ('sequences', 0.037), ('discretized', 0.037), ('sec', 0.037), ('histogram', 0.036), ('hazard', 0.036), ('abbreviations', 0.036), ('binning', 0.036), ('viterbi', 0.036), ('waking', 0.036), ('maximization', 0.034), ('probabilities', 0.033), ('events', 0.033), ('old', 0.033), ('feb', 0.033), ('hahnloser', 0.033), ('recorded', 0.033), ('separate', 0.032), ('activity', 0.031), ('ln', 0.031), ('respective', 0.031), ('behavioural', 0.031), ('macroscopic', 0.031), ('happened', 0.031), ('survival', 0.031), ('cybern', 0.031), ('staying', 0.031), ('elapsed', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9999997 64 nips-2005-Efficient estimation of hidden state dynamics from spike trains

Author: Marton G. Danoczy, Richard H. R. Hahnloser

Abstract: Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. To estimate the time course of such state dynamics from single- or multiple neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. We find that the two state-conditional ISI functions are highly similar to the ones measured during waking and singing, respectively. 1

2 0.27341223 181 nips-2005-Spiking Inputs to a Winner-take-all Network

Author: Matthias Oster, Shih-Chii Liu

Abstract: Recurrent networks that perform a winner-take-all computation have been studied extensively. Although some of these studies include spiking networks, they consider only analog input rates. We present results of this winner-take-all computation on a network of integrate-and-fire neurons which receives spike trains as inputs. We show how we can configure the connectivity in the network so that the winner is selected after a pre-determined number of input spikes. We discuss spiking inputs with both regular frequencies and Poisson-distributed rates. The robustness of the computation was tested by implementing the winner-take-all network on an analog VLSI array of 64 integrate-and-fire neurons which have an innate variance in their operating parameters. 1

3 0.25792933 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity

Author: Robert A. Legenstein, Wolfgang Maass

Abstract: We investigate under what conditions a neuron can learn by experimentally supported rules for spike timing dependent plasticity (STDP) to predict the arrival times of strong “teacher inputs” to the same neuron. It turns out that in contrast to the famous Perceptron Convergence Theorem, which predicts convergence of the perceptron learning rule for a simplified neuron model whenever a stable solution exists, no equally strong convergence guarantee can be given for spiking neurons with STDP. But we derive a criterion on the statistical dependency structure of input spike trains which characterizes exactly when learning with STDP will converge on average for a simple model of a spiking neuron. This criterion is reminiscent of the linear separability criterion of the Perceptron Convergence Theorem, but it applies here to the rows of a correlation matrix related to the spike inputs. In addition we show through computer simulations for more realistic neuron models that the resulting analytically predicted positive learning results not only hold for the common interpretation of STDP where STDP changes the weights of synapses, but also for a more realistic interpretation suggested by experimental data where STDP modulates the initial release probability of dynamic synapses. 1

4 0.22096947 99 nips-2005-Integrate-and-Fire models with adaptation are good enough

Author: Renaud Jolivet, Alexander Rauch, Hans-rudolf Lüscher, Wulfram Gerstner

Abstract: Integrate-and-Fire-type models are usually criticized because of their simplicity. On the other hand, the Integrate-and-Fire model is the basis of most of the theoretical studies on spiking neuron models. Here, we develop a sequential procedure to quantitatively evaluate an equivalent Integrate-and-Fire-type model based on intracellular recordings of cortical pyramidal neurons. We find that the resulting effective model is sufficient to predict the spike train of the real pyramidal neuron with high accuracy. In in vivo-like regimes, predicted and recorded traces are almost indistinguishable and a significant part of the spikes can be predicted at the correct timing. Slow processes like spike-frequency adaptation are shown to be a key feature in this context since they are necessary for the model to connect between different driving regimes. 1

5 0.1766357 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

Author: Afsheen Afshar, Gopal Santhanam, Stephen I. Ryu, Maneesh Sahani, Byron M. Yu, Krishna V. Shenoy

Abstract: Spiking activity from neurophysiological experiments often exhibits dynamics beyond that driven by external stimulation, presumably reflecting the extensive recurrence of neural circuitry. Characterizing these dynamics may reveal important features of neural computation, particularly during internally-driven cognitive operations. For example, the activity of premotor cortex (PMd) neurons during an instructed delay period separating movement-target specification and a movementinitiation cue is believed to be involved in motor planning. We show that the dynamics underlying this activity can be captured by a lowdimensional non-linear dynamical systems model, with underlying recurrent structure and stochastic point-process output. We present and validate latent variable methods that simultaneously estimate the system parameters and the trial-by-trial dynamical trajectories. These methods are applied to characterize the dynamics in PMd data recorded from a chronically-implanted 96-electrode array while monkeys perform delayed-reach tasks. 1

6 0.14893976 118 nips-2005-Learning in Silicon: Timing is Everything

7 0.14074455 39 nips-2005-Beyond Pair-Based STDP: a Phenomenological Rule for Spike Triplet and Frequency Effects

8 0.13349302 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks

9 0.13060188 134 nips-2005-Neural mechanisms of contrast dependent receptive field size in V1

10 0.10819059 188 nips-2005-Temporally changing synaptic plasticity

11 0.10083017 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models

12 0.091869533 129 nips-2005-Modeling Neural Population Spiking Activity with Gibbs Distributions

13 0.089631803 50 nips-2005-Convex Neural Networks

14 0.087557547 49 nips-2005-Convergence and Consistency of Regularized Boosting Algorithms with Stationary B-Mixing Observations

15 0.08652515 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks

16 0.075865701 28 nips-2005-Analyzing Auditory Neurons by Learning Distance Functions

17 0.074006386 153 nips-2005-Policy-Gradient Methods for Planning

18 0.068680026 147 nips-2005-On the Convergence of Eigenspaces in Kernel Principal Component Analysis

19 0.065733701 130 nips-2005-Modeling Neuronal Interactivity using Dynamic Bayesian Networks

20 0.064854644 113 nips-2005-Learning Multiple Related Tasks using Latent Independent Component Analysis


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.209), (1, -0.342), (2, -0.027), (3, -0.171), (4, 0.029), (5, -0.054), (6, 0.031), (7, 0.059), (8, -0.041), (9, -0.057), (10, 0.111), (11, 0.02), (12, 0.001), (13, -0.041), (14, 0.071), (15, 0.084), (16, 0.054), (17, -0.05), (18, 0.022), (19, 0.082), (20, -0.005), (21, -0.011), (22, 0.056), (23, 0.056), (24, -0.008), (25, -0.055), (26, 0.025), (27, 0.046), (28, -0.073), (29, -0.051), (30, 0.152), (31, -0.072), (32, 0.003), (33, 0.193), (34, -0.013), (35, -0.102), (36, 0.023), (37, -0.0), (38, -0.0), (39, -0.107), (40, 0.086), (41, 0.038), (42, -0.001), (43, -0.03), (44, 0.049), (45, -0.012), (46, -0.009), (47, -0.039), (48, -0.105), (49, -0.003)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96410072 64 nips-2005-Efficient estimation of hidden state dynamics from spike trains

Author: Marton G. Danoczy, Richard H. R. Hahnloser

Abstract: Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. To estimate the time course of such state dynamics from single- or multiple neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. We find that the two state-conditional ISI functions are highly similar to the ones measured during waking and singing, respectively. 1

2 0.79658556 181 nips-2005-Spiking Inputs to a Winner-take-all Network

Author: Matthias Oster, Shih-Chii Liu

Abstract: Recurrent networks that perform a winner-take-all computation have been studied extensively. Although some of these studies include spiking networks, they consider only analog input rates. We present results of this winner-take-all computation on a network of integrate-and-fire neurons which receives spike trains as inputs. We show how we can configure the connectivity in the network so that the winner is selected after a pre-determined number of input spikes. We discuss spiking inputs with both regular frequencies and Poisson-distributed rates. The robustness of the computation was tested by implementing the winner-take-all network on an analog VLSI array of 64 integrate-and-fire neurons which have an innate variance in their operating parameters. 1

3 0.76681679 99 nips-2005-Integrate-and-Fire models with adaptation are good enough

Author: Renaud Jolivet, Alexander Rauch, Hans-rudolf Lüscher, Wulfram Gerstner

Abstract: Integrate-and-Fire-type models are usually criticized because of their simplicity. On the other hand, the Integrate-and-Fire model is the basis of most of the theoretical studies on spiking neuron models. Here, we develop a sequential procedure to quantitatively evaluate an equivalent Integrate-and-Fire-type model based on intracellular recordings of cortical pyramidal neurons. We find that the resulting effective model is sufficient to predict the spike train of the real pyramidal neuron with high accuracy. In in vivo-like regimes, predicted and recorded traces are almost indistinguishable and a significant part of the spikes can be predicted at the correct timing. Slow processes like spike-frequency adaptation are shown to be a key feature in this context since they are necessary for the model to connect between different driving regimes. 1

4 0.69379693 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity

Author: Robert A. Legenstein, Wolfgang Maass

Abstract: We investigate under what conditions a neuron can learn by experimentally supported rules for spike timing dependent plasticity (STDP) to predict the arrival times of strong “teacher inputs” to the same neuron. It turns out that in contrast to the famous Perceptron Convergence Theorem, which predicts convergence of the perceptron learning rule for a simplified neuron model whenever a stable solution exists, no equally strong convergence guarantee can be given for spiking neurons with STDP. But we derive a criterion on the statistical dependency structure of input spike trains which characterizes exactly when learning with STDP will converge on average for a simple model of a spiking neuron. This criterion is reminiscent of the linear separability criterion of the Perceptron Convergence Theorem, but it applies here to the rows of a correlation matrix related to the spike inputs. In addition we show through computer simulations for more realistic neuron models that the resulting analytically predicted positive learning results not only hold for the common interpretation of STDP where STDP changes the weights of synapses, but also for a more realistic interpretation suggested by experimental data where STDP modulates the initial release probability of dynamic synapses. 1

5 0.60219711 118 nips-2005-Learning in Silicon: Timing is Everything

Author: John V. Arthur, Kwabena Boahen

Abstract: We describe a neuromorphic chip that uses binary synapses with spike timing-dependent plasticity (STDP) to learn stimulated patterns of activity and to compensate for variability in excitability. Specifically, STDP preferentially potentiates (turns on) synapses that project from excitable neurons, which spike early, to lethargic neurons, which spike late. The additional excitatory synaptic current makes lethargic neurons spike earlier, thereby causing neurons that belong to the same pattern to spike in synchrony. Once learned, an entire pattern can be recalled by stimulating a subset. 1

6 0.59366047 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks

7 0.58439636 39 nips-2005-Beyond Pair-Based STDP: a Phenomenological Rule for Spike Triplet and Frequency Effects

8 0.55191761 6 nips-2005-A Connectionist Model for Constructive Modal Reasoning

9 0.546157 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

10 0.52584285 134 nips-2005-Neural mechanisms of contrast dependent receptive field size in V1

11 0.40849787 164 nips-2005-Representing Part-Whole Relationships in Recurrent Neural Networks

12 0.39165601 61 nips-2005-Dynamical Synapses Give Rise to a Power-Law Distribution of Neuronal Avalanches

13 0.3908419 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models

14 0.34280512 68 nips-2005-Factorial Switching Kalman Filters for Condition Monitoring in Neonatal Intensive Care

15 0.32690284 49 nips-2005-Convergence and Consistency of Regularized Boosting Algorithms with Stationary B-Mixing Observations

16 0.3232106 197 nips-2005-Unbiased Estimator of Shape Parameter for Spiking Irregularities under Changing Environments

17 0.31950676 129 nips-2005-Modeling Neural Population Spiking Activity with Gibbs Distributions

18 0.2914716 50 nips-2005-Convex Neural Networks

19 0.28438506 141 nips-2005-Norepinephrine and Neural Interrupts

20 0.28289703 188 nips-2005-Temporally changing synaptic plasticity


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.033), (10, 0.04), (27, 0.025), (31, 0.046), (34, 0.054), (39, 0.052), (44, 0.403), (55, 0.019), (57, 0.017), (60, 0.019), (69, 0.061), (73, 0.012), (77, 0.01), (88, 0.064), (91, 0.056)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.83339322 64 nips-2005-Efficient estimation of hidden state dynamics from spike trains

Author: Marton G. Danoczy, Richard H. R. Hahnloser

Abstract: Neurons can have rapidly changing spike train statistics dictated by the underlying network excitability or behavioural state of an animal. To estimate the time course of such state dynamics from single- or multiple neuron recordings, we have developed an algorithm that maximizes the likelihood of observed spike trains by optimizing the state lifetimes and the state-conditional interspike-interval (ISI) distributions. Our nonparametric algorithm is free of time-binning and spike-counting problems and has the computational complexity of a Mixed-state Markov Model operating on a state sequence of length equal to the total number of recorded spikes. As an example, we fit a two-state model to paired recordings of premotor neurons in the sleeping songbird. We find that the two state-conditional ISI functions are highly similar to the ones measured during waking and singing, respectively. 1
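As a rough illustration of the kind of computation this abstract describes, the sketch below evaluates the likelihood of an interspike-interval sequence under a two-state model in which each state has its own ISI distribution and an exponentially distributed lifetime, with state switches allowed only at spike times. The gamma form of the ISI densities, the parameter names, and the forward-pass bookkeeping are assumptions made for exposition, not the authors' implementation.

import numpy as np
from scipy.stats import gamma

def two_state_isi_loglik(isis, isi_params, lifetimes, p0=(0.5, 0.5)):
    """Forward pass over interspike intervals for an assumed two-state model.
    isis: 1-D array of interspike intervals (s).
    isi_params: [(shape, scale), (shape, scale)] gamma parameters, one pair per state.
    lifetimes: (tau0, tau1) mean state lifetimes (s); switching is evaluated at spikes.
    Returns the log-likelihood of the ISI sequence."""
    alpha = np.array(p0, dtype=float)          # state probabilities before the first interval
    loglik = 0.0
    for isi in isis:
        # State-conditional ISI densities for this interval
        emit = np.array([gamma.pdf(isi, a=isi_params[s][0], scale=isi_params[s][1])
                         for s in (0, 1)])
        # Probability of remaining in each state for the duration of the interval
        stay = np.exp(-isi / np.array(lifetimes))
        trans = np.array([[stay[0], 1.0 - stay[0]],
                          [1.0 - stay[1], stay[1]]])
        alpha = (alpha * emit) @ trans          # emit over the interval, then switch at the spike
        norm = alpha.sum()
        loglik += np.log(norm)
        alpha /= norm
    return loglik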

2 0.67165709 124 nips-2005-Measuring Shared Information and Coordinated Activity in Neuronal Networks

Author: Kristina Klinkner, Cosma Shalizi, Marcelo Camperi

Abstract: Most nervous systems encode information about stimuli in the responding activity of large neuronal networks. This activity often manifests itself as dynamically coordinated sequences of action potentials. Since multiple electrode recordings are now a standard tool in neuroscience research, it is important to have a measure of such network-wide behavioral coordination and information sharing, applicable to multiple neural spike train data. We propose a new statistic, informational coherence, which measures how much better one unit can be predicted by knowing the dynamical state of another. We argue that informational coherence is a measure of association and shared information which is superior to traditional pairwise measures of synchronization and correlation. To find the dynamical states, we use a recently introduced algorithm which reconstructs effective state spaces from stochastic time series. We then extend the pairwise measure to a multivariate analysis of the network by estimating the network multi-information. We illustrate our method by testing it on a detailed model of the transition from gamma to beta rhythms.

Much of the most important information in neural systems is shared over multiple neurons or cortical areas, in such forms as population codes and distributed representations [1]. On behavioral time scales, neural information is stored in temporal patterns of activity as opposed to static markers; therefore, as information is shared between neurons or brain regions, it is physically instantiated as coordination between entire sequences of neural spikes. Furthermore, neural systems and regions of the brain often require coordinated neural activity to perform important functions; acting in concert requires multiple neurons or cortical areas to share information [2]. Thus, if we want to measure the dynamic network-wide behavior of neurons and test hypotheses about them, we need reliable, practical methods to detect and quantify behavioral coordination and the associated information sharing across multiple neural units. These would be especially useful in testing ideas about how particular forms of coordination relate to distributed coding (e.g., that of [3]). Current techniques to analyze relations among spike trains handle only pairs of neurons, so we further need a method which is extendible to analyze the coordination in the network, system, or region as a whole. Here we propose a new measure of behavioral coordination and information sharing, informational coherence, based on the notion of dynamical state. Section 1 argues that coordinated behavior in neural systems is often not captured by existing measures of synchronization or correlation, and that something sensitive to nonlinear, stochastic, predictive relationships is needed. Section 2 defines informational coherence as the (normalized) mutual information between the dynamical states of two systems and explains how looking at the states, rather than just observables, fulfills the needs laid out in Section 1. Since we rarely know the right states a priori, Section 2.1 briefly describes how we reconstruct effective state spaces from data. Section 2.2 gives some details about how we calculate the informational coherence and approximate the global information stored in the network. Section 3 applies our method to a model system (a biophysically detailed conductance-based model), comparing our results to those of more familiar second-order statistics.
In the interest of space, we omit proofs and a full discussion of the existing literature, giving only minimal references here; proofs and references will appear in a longer paper now in preparation.

1 Synchrony or Coherence?

Most hypotheses which involve the idea that information sharing is reflected in coordinated activity across neural units invoke a very specific notion of coordinated activity, namely strict synchrony: the units should be doing exactly the same thing (e.g., spiking) at exactly the same time. Investigators then measure coordination by measuring how close the units come to being strictly synchronized (e.g., variance in spike times). From an informational point of view, there is no reason to favor strict synchrony over other kinds of coordination. One neuron consistently spiking 50 ms after another is just as informative a relationship as two spiking simultaneously, but such stable phase relations are missed by strict-synchrony approaches. Indeed, whatever the exact nature of the neural code, it uses temporally extended patterns of activity, and so information sharing should be reflected in coordination of those patterns, rather than just the instantaneous activity.

There are three common ways of going beyond strict synchrony: cross-correlation and related second-order statistics, mutual information, and topological generalized synchrony. The cross-correlation function (the normalized covariance function; this includes, for present purposes, the joint peristimulus time histogram [2]) is one of the most widespread measures of synchronization. It can be efficiently calculated from observable series; it handles statistical as well as deterministic relationships between processes; and, by incorporating variable lags, it reduces the problem of phase locking. Fourier transformation of the covariance function γXY(h) yields the cross-spectrum FXY(ν), which in turn gives the spectral coherence cXY(ν) = |FXY(ν)|² / (FX(ν) FY(ν)), a normalized correlation between the Fourier components of X and Y. Integrated over frequencies, the spectral coherence measures, essentially, the degree of linear cross-predictability of the two series. ([4] applies spectral coherence to coordinated neural activity.) However, such second-order statistics only handle linear relationships. Since neural processes are known to be strongly nonlinear, there is little reason to think these statistics adequately measure coordination and synchrony in neural systems.
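For concreteness, these second-order quantities can be computed on binned spike trains roughly as follows (surrogate data, an assumed 1 ms bin size, and standard NumPy/SciPy calls; this only illustrates the measures being critiqued, it is not code from the paper):

import numpy as np
from scipy.signal import coherence

fs = 1000.0                                      # assumed 1 ms bins
rng = np.random.default_rng(0)
x = rng.poisson(0.02, 20000).astype(float)       # surrogate binned spike counts
y = np.roll(x, 50) + rng.poisson(0.005, 20000)   # y echoes x with a 50 ms lag, plus noise

lags = np.arange(-200, 201)                      # cross-correlogram over +/- 200 ms
xc = [np.corrcoef(x[200:-200], y[200 + l:20000 - 200 + l])[0, 1] for l in lags]
f, cxy = coherence(x, y, fs=fs, nperseg=1024)    # magnitude-squared coherence c_XY(nu)
print("peak correlation at lag (ms):", lags[int(np.argmax(xc))])
print("mean coherence below 100 Hz:", float(cxy[f < 100].mean()))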
Mutual information is attractive because it handles both nonlinear and stochastic relationships and has a very natural and appealing interpretation. Unfortunately, it often seems to fail in practice, being disappointingly small even between signals which are known to be tightly coupled [5]. The major reason is that the neural codes use distinct patterns of activity over time, rather than many different instantaneous actions, and the usual approach misses these extended patterns. Consider two neurons, one of which drives the other to spike 50 ms after it does, the driving neuron spiking once every 500 ms. These are very tightly coordinated, but knowing whether the first neuron spiked at time t conveys little information about what the second neuron is doing at t — it’s not spiking, but it’s not spiking most of the time anyway. Mutual information calculated from the direct observations conflates the “no spike” of the second neuron preparing to fire with its just-sitting-around “no spike”. Here, mutual information could find the coordination if we used a 50 ms lag, but that won’t work in general. Take two rate-coding neurons with baseline firing rates of 1 Hz, and suppose that a stimulus excites one to 10 Hz and suppresses the other to 0.1 Hz. The spiking rates thus share a lot of information, but knowing whether one neuron spiked at t is uninformative about what the other neuron did then, and lagging won’t help.

Generalized synchrony is based on the idea of establishing relationships between the states of the various units. “State” here is taken in the sense of physics, dynamics and control theory: the state at time t is a variable which fixes the distribution of observables at all times ≥ t, rendering the past of the system irrelevant [6]. Knowing the state allows us to predict, as well as possible, how the system will evolve, and how it will respond to external forces [7]. Two coupled systems are said to exhibit generalized synchrony if the state of one system is given by a mapping from the state of the other. Applications to data employ state-space reconstruction [8]: if the state x ∈ X evolves according to smooth, d-dimensional deterministic dynamics, and we observe a generic function y = f(x), then the space Y of time-delay vectors [y(t), y(t − τ), ..., y(t − (k − 1)τ)] is diffeomorphic to X if k > 2d, for generic choices of lag τ. The various versions of generalized synchrony differ on how, precisely, to quantify the mappings between reconstructed state spaces, but they all appear to be empirically equivalent to one another and to notions of phase synchronization based on Hilbert transforms [5]. Thus all of these measures accommodate nonlinear relationships, and are potentially very flexible. Unfortunately, there is essentially no reason to believe that neural systems have deterministic dynamics at experimentally accessible levels of detail, much less that there are deterministic relationships among such states for different units. What we want, then, but none of these alternatives provides, is a quantity which measures predictive relationships among states, but allows those relationships to be nonlinear and stochastic. The next section introduces just such a measure, which we call “informational coherence”.

2 States and Informational Coherence

There are alternatives to calculating the “surface” mutual information between the sequences of observations themselves (which, as described, fails to capture coordination). If we know that the units are phase oscillators, or rate coders, we can estimate their instantaneous phase or rate and, by calculating the mutual information between those variables, see how coordinated the units’ patterns of activity are. However, phases and rates do not exhaust the repertoire of neural patterns, and a more general, common scheme is desirable. The most general notion of “pattern of activity” is simply that of the dynamical state of the system, in the sense mentioned above. We now formalize this. Assuming the usual notation for Shannon information [9], the information content of a state variable X is H[X] and the mutual information between X and Y is I[X; Y]. As is well known, I[X; Y] ≤ min{H[X], H[Y]}. We use this to normalize the mutual state information to a 0–1 scale, and this is the informational coherence (IC):

ψ(X, Y) = I[X; Y] / min{H[X], H[Y]}, with 0/0 = 0.    (1)
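A plug-in estimate of Eq. (1) from two discrete state sequences might look like the sketch below; it assumes the sequences have already been filtered to dynamical states (e.g., by CSSR, described next) and uses naive histogram estimators with no bias correction, and the function and variable names are illustrative.

import numpy as np

def informational_coherence(s1, s2):
    """Eq. (1): normalized mutual information between two discrete state sequences,
    with the convention 0/0 = 0. Plug-in (histogram) estimate, no bias correction."""
    s1, s2 = np.asarray(s1), np.asarray(s2)
    n = len(s1)
    joint = {}
    for a, b in zip(s1, s2):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    p_joint = {k: v / n for k, v in joint.items()}
    p1, p2 = {}, {}
    for (a, b), p in p_joint.items():
        p1[a] = p1.get(a, 0.0) + p
        p2[b] = p2.get(b, 0.0) + p
    mi = sum(p * np.log2(p / (p1[a] * p2[b])) for (a, b), p in p_joint.items())
    h1 = -sum(p * np.log2(p) for p in p1.values())
    h2 = -sum(p * np.log2(p) for p in p2.values())
    denom = min(h1, h2)
    return mi / denom if denom > 0 else 0.0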
ψ can be interpreted as follows. I[X; Y] is the Kullback-Leibler divergence between the joint distribution of X and Y and the product of their marginal distributions [9], indicating the error involved in ignoring the dependence between X and Y. The mutual information between predictive, dynamical states thus gauges the error involved in assuming the two systems are independent, i.e., how much predictions could improve by taking into account the dependence. Hence it measures the amount of dynamically relevant information shared between the two systems. ψ simply normalizes this value, and indicates the degree to which two systems have coordinated patterns of behavior (cf. [10], although this only uses directly observable quantities).

2.1 Reconstruction and Estimation of Effective State Spaces

As mentioned, the state space of a deterministic dynamical system can be reconstructed from a sequence of observations. This is the main tool of experimental nonlinear dynamics [8]; but the assumption of determinism is crucial, and false for almost any interesting neural system. While classical state-space reconstruction won’t work on stochastic processes, such processes do have state-space representations [11], and, in the special case of discrete-valued, discrete-time series, there are ways to reconstruct the state space. Here we use the CSSR algorithm, introduced in [12] (code available at http://bactra.org/CSSR). This produces causal state models, which are stochastic automata capable of statistically optimal nonlinear prediction; the state of the machine is a minimal sufficient statistic for the future of the observable process [13]. (Causal state models have the same expressive power as observable operator models [14] or predictive state representations [7], and greater power than variable-length Markov models [15, 16].) The basic idea is to form a set of states which should be (1) Markovian, (2) sufficient statistics for the next observable, and (3) have deterministic transitions (in the automata-theory sense). The algorithm begins with a minimal, one-state, IID model and checks whether these properties hold, by means of hypothesis tests. If they fail, the model is modified, generally but not always by adding more states, and the new model is checked again. Each state of the model corresponds to a distinct distribution over future events, i.e., to a statistical pattern of behavior. Under mild conditions, which do not involve prior knowledge of the state space, CSSR converges in probability to the unique causal state model of the data-generating process [12]. In practice, CSSR is quite fast (linear in the data size), and generalizes at least as well as training hidden Markov models with the EM algorithm and using cross-validation for selection, the standard heuristic [12].

One advantage of the causal state approach (which it shares with classical state-space reconstruction) is that state estimation is greatly simplified. In the general case of nonlinear state estimation, it is necessary to know not just the form of the stochastic dynamics in the state space and the observation function, but also their precise parametric values and the distribution of observation and driving noises. Estimating the state from the observable time series then becomes a computationally intensive application of Bayes’s rule [17]. Due to the way causal states are built as statistics of the data, with probability 1 there is a finite time, t, at which the causal state at time t is certain. This is not just with some degree of belief or confidence: because of the way the states are constructed, it is impossible for the process to be in any other state at that time.
Once the causal state has been established, it can be updated recursively, i.e., the causal state at time t + 1 is an explicit function of the causal state at time t and the observation at t + 1. The causal state model can be automatically converted, therefore, into a finite-state transducer which reads in an observation time series and outputs the corresponding series of states [18, 13]. (Our implementation of CSSR filters its training data automatically.) The result is a new time series of states, from which all non-predictive components have been filtered out.

2.2 Estimating the Coherence

Our algorithm for estimating the matrix of informational coherences is as follows. For each unit, we reconstruct the causal state model, and filter the observable time series to produce a series of causal states. Then, for each pair of neurons, we construct a joint histogram of the state distribution, estimate the mutual information between the states, and normalize by the single-unit state informations. This gives a symmetric matrix of ψ values. Even if two systems are independent, their estimated IC will, on average, be positive, because, while they should have zero mutual information, the empirical estimate of mutual information is non-negative. Thus, the significance of IC values must be assessed against the null hypothesis of system independence. The easiest way to do so is to take the reconstructed state models for the two systems and run them forward, independently of one another, to generate a large number of simulated state sequences; from these calculate values of the IC. This procedure will approximate the sampling distribution of the IC under a null model which preserves the dynamics of each system, but not their interaction. We can then find p-values as usual. We omit them here to save space.

Figure 1: Rastergrams of neuronal spike-times in the network. Excitatory, pyramidal neurons (numbers 1 to 1000) are shown in green, inhibitory interneurons (numbers 1001 to 1300) in red. During the first 10 seconds (a), the current connections among the pyramidal cells are suppressed and a gamma rhythm emerges (left). At t = 10 s, those connections become active, leading to a beta rhythm (b, right).

2.3 Approximating the Network Multi-Information

There is broad agreement [2] that analyses of networks should not just be an analysis of pairs of neurons, averaged over pairs. Ideally, an analysis of information sharing in a network would look at the over-all structure of statistical dependence between the various units, reflected in the complete joint probability distribution P of the states. This would then allow us, for instance, to calculate the n-fold multi-information, I[X1, X2, ..., Xn] ≡ D(P||Q), the Kullback-Leibler divergence between the joint distribution P and the product of marginal distributions Q, analogous to the pairwise mutual information [19]. Calculated over the predictive states, the multi-information would give the total amount of shared dynamical information in the system. Just as we normalized the mutual information I[X1, X2] by its maximum possible value, min{H[X1], H[X2]}, we normalize the multi-information by its maximum, which is the smallest sum of n − 1 marginal entropies:

I[X1; X2; ...; Xn] ≤ min_k Σ_{i≠k} H[Xi]
Unfortunately, P is a distribution over a very high-dimensional space and so is hard to estimate well without strong parametric constraints. We thus consider approximations. The lowest-order approximation treats all the units as independent; this is the distribution Q. One step up are tree distributions, where the global distribution is a function of the joint distributions of pairs of units. Not every pair of units needs to enter into such a distribution, though every unit must be part of some pair. Graphically, a tree distribution corresponds to a spanning tree, with edges linking units whose interactions enter into the global probability, and conversely spanning trees determine tree distributions. Writing ET for the set of pairs (i, j), and abbreviating X1 = x1, X2 = x2, ..., Xn = xn by X = x, one has

T(X = x) = Π_{i=1..n} T(Xi = xi) × Π_{(i,j)∈ET} [ T(Xi = xi, Xj = xj) / (T(Xi = xi) T(Xj = xj)) ]    (2)

where the marginal distributions T(Xi) and the pair distributions T(Xi, Xj) are estimated by the empirical marginal and pair distributions. We must now pick edges ET so that T best approximates the true global distribution P. A natural approach is to minimize D(P||T), the divergence between P and its tree approximation. Chow and Liu [20] showed that the maximum-weight spanning tree gives the divergence-minimizing distribution, taking an edge’s weight to be the mutual information between the variables it links.

There are three advantages to using the Chow-Liu approximation. (1) Estimating T from empirical probabilities gives a consistent maximum likelihood estimator of the ideal Chow-Liu tree [20], with reasonable rates of convergence, so T can be reliably known even if P cannot. (2) There are efficient algorithms for constructing maximum-weight spanning trees, such as Prim’s algorithm [21, sec. 23.2], which runs in time O(n² + n log n). Thus, the approximation is computationally tractable. (3) The KL divergence of the Chow-Liu distribution from Q gives a lower bound on the network multi-information; that bound is just the sum of the mutual informations along the edges in the tree:

I[X1; X2; ...; Xn] ≥ D(T||Q) = Σ_{(i,j)∈ET} I[Xi; Xj]    (3)

Even if we knew P exactly, Eq. 3 would be useful as an alternative to calculating D(P||Q) directly, evaluating log P(x)/Q(x) for all the exponentially many configurations x. It is natural to seek higher-order approximations to P, e.g., using three-way interactions not decomposable into pairwise interactions [22, 19]. But it is hard to do so effectively, because finding the optimal approximation to P when such interactions are allowed is NP-hard [23], and analytical formulas like Eq. 3 generally do not exist [19]. We therefore confine ourselves to the Chow-Liu approximation here.
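As a rough sketch of how the bound of Eq. (3) can be computed (not the authors' code), one can grow a maximum-weight spanning tree over the matrix of pairwise state mutual informations and sum the edge weights; the naive Prim-style search below costs roughly n² per added node, which is adequate at this scale.

import numpy as np

def tree_multi_information(mi_matrix):
    """Lower bound of Eq. (3): sum of pairwise mutual informations over the
    maximum-weight spanning tree, grown Prim-style from a dense MI matrix.
    mi_matrix: symmetric (n, n) array of pairwise mutual informations I[X_i; X_j] in bits."""
    n = mi_matrix.shape[0]
    in_tree = [0]
    bound, edges = 0.0, []
    while len(in_tree) < n:
        best = (-1.0, None, None)
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and mi_matrix[i, j] > best[0]:
                    best = (mi_matrix[i, j], i, j)
        w, i, j = best
        in_tree.append(j)
        edges.append((i, j))
        bound += w                      # D(T||Q) = sum of MI along tree edges
    return bound, edges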
3 Example: A Model of Gamma and Beta Rhythms

We use simulated data as a test case, instead of empirical multiple electrode recordings, which allows us to try the method on a system of over 1000 neurons and compare the measure against expected results. The model, taken from [24], was originally designed to study episodes of gamma (30–80 Hz) and beta (12–30 Hz) oscillations in the mammalian nervous system, which often occur successively with a spontaneous transition between them. More concretely, the rhythms studied were those displayed by in vitro hippocampal (CA1) slice preparations and by in vivo neocortical EEGs.

The model contains two neuron populations: excitatory (AMPA) pyramidal neurons and inhibitory (GABA-A) interneurons, defined by conductance-based Hodgkin-Huxley-style equations. Simulations were carried out in a network of 1000 pyramidal cells and 300 interneurons. Each cell was modeled as a one-compartment neuron with all-to-all coupling, endowed with the basic sodium and potassium spiking currents, an external applied current, and some Gaussian input noise.

The first 10 seconds of the simulation correspond to the gamma rhythm, in which only a group of neurons is made to spike via a linearly increasing applied current. The beta rhythm (subsequent 10 seconds) is obtained by activating pyramidal-pyramidal recurrent connections (potentiated by Hebbian preprocessing as a result of synchrony during the gamma rhythm) and a slow outward after-hyperpolarization (AHP) current (the M-current), suppressed during gamma due to the metabotropic activation used in the generation of the rhythm. During the beta rhythm, pyramidal cells, silent during the gamma rhythm, fire on a subset of interneuron cycles (Fig. 1).

Figure 2: Heat-maps of coordination for the network, as measured by zero-lag cross-correlation (top row) and informational coherence (bottom), contrasting the gamma rhythm (left column) with the beta (right). Colors run from red (no coordination) through yellow to pale cream (maximum).

Fig. 2 compares zero-lag cross-correlation, a second-order method of quantifying coordination, with the informational coherence calculated from the reconstructed states. (In this simulation, we could have calculated the actual states of the model neurons directly, rather than reconstructing them, but for purposes of testing our method we did not.) Cross-correlation finds some of the relationships visible in Fig. 1, but is confused by, for instance, the phase shifts between pyramidal cells. (Surface mutual information, not shown, gives similar results.) Informational coherence, however, has no trouble recognizing the two populations as effectively coordinated blocks. The presence of dynamical noise, problematic for ordinary state reconstruction, is not an issue. The average IC is 0.411 (or 0.797 if the inactive, low-numbered neurons are excluded). The tree estimate of the global multi-information is 3243.7 bits, with a global coherence of 0.777. The right half of Fig. 2 repeats this analysis for the beta rhythm; in this stage, the average IC is 0.614, and the tree estimate of the global multi-information is 7377.7 bits, though the estimated global coherence falls very slightly to 0.742. This is because low-numbered neurons which were quiescent before are now active, contributing to the global information, but the overall pattern is somewhat weaker and more noisy (as can be seen from Fig. 1b). So, as expected, the total information content is higher, but the overall coordination across the network is lower.

4 Conclusion

Informational coherence provides a measure of neural information sharing and coordinated activity which accommodates nonlinear, stochastic relationships between extended patterns of spiking. It is robust to dynamical noise and leads to a genuinely multivariate measure of global coordination across networks or regions. Applied to data from multi-electrode recordings, it should be a valuable tool in evaluating hypotheses about distributed neural representation and function.
Acknowledgments

Thanks to R. Haslinger, E. Ionides and S. Page; and for support to the Santa Fe Institute (under grants from Intel, the NSF and the MacArthur Foundation, and DARPA agreement F30602-00-2-0583), the Clare Booth Luce Foundation (KLK) and the James S. McDonnell Foundation (CRS).

References

[1] L. F. Abbott and T. J. Sejnowski, eds. Neural Codes and Distributed Representations. MIT Press, 1998.
[2] E. N. Brown, R. E. Kass, and P. P. Mitra. Nature Neuroscience, 7:456–461, 2004.
[3] D. H. Ballard, Z. Zhang, and R. P. N. Rao. In R. P. N. Rao, B. A. Olshausen, and M. S. Lewicki, eds., Probabilistic Models of the Brain, pp. 273–284. MIT Press, 2002.
[4] D. R. Brillinger and A. E. P. Villa. In D. R. Brillinger, L. T. Fernholz, and S. Morgenthaler, eds., The Practice of Data Analysis, pp. 77–92. Princeton U.P., 1997.
[5] R. Quian Quiroga et al. Physical Review E, 65:041903, 2002.
[6] R. F. Streater. Statistical Dynamics. Imperial College Press, London.
[7] M. L. Littman, R. S. Sutton, and S. Singh. In T. G. Dietterich, S. Becker, and Z. Ghahramani, eds., Advances in Neural Information Processing Systems 14, pp. 1555–1561. MIT Press, 2002.
[8] H. Kantz and T. Schreiber. Nonlinear Time Series Analysis. Cambridge U.P., 1997.
[9] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley, 1991.
[10] M. Palus et al. Physical Review E, 63:046211, 2001.
[11] F. B. Knight. Annals of Probability, 3:573–596, 1975.
[12] C. R. Shalizi and K. L. Shalizi. In M. Chickering and J. Halpern, eds., Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference, pp. 504–511. AUAI Press, 2004.
[13] C. R. Shalizi and J. P. Crutchfield. Journal of Statistical Physics, 104:817–819, 2001.
[14] H. Jaeger. Neural Computation, 12:1371–1398, 2000.
[15] D. Ron, Y. Singer, and N. Tishby. Machine Learning, 25:117–149, 1996.
[16] P. Bühlmann and A. J. Wyner. Annals of Statistics, 27:480–513, 1999.
[17] N. U. Ahmed. Linear and Nonlinear Filtering for Scientists and Engineers. World Scientific, 1998.
[18] D. R. Upper. PhD thesis, University of California, Berkeley, 1997.
[19] E. Schneidman, S. Still, M. J. Berry, and W. Bialek. Physical Review Letters, 91:238701, 2003.
[20] C. K. Chow and C. N. Liu. IEEE Transactions on Information Theory, IT-14:462–467, 1968.
[21] T. H. Cormen et al. Introduction to Algorithms. 2nd ed. MIT Press, 2001.
[22] S. Amari. IEEE Transactions on Information Theory, 47:1701–1711, 2001.
[23] S. Kirshner, P. Smyth, and A. Robertson. Tech. Rep. 04-04, UC Irvine, Information and Computer Science, 2004.
[24] M. S. Olufsen et al. Journal of Computational Neuroscience, 14:33–54, 2003.

3 0.58362097 13 nips-2005-A Probabilistic Approach for Optimizing Spectral Clustering

Author: Rong Jin, Feng Kang, Chris H. Ding

Abstract: Spectral clustering has enjoyed success in both data clustering and semi-supervised learning, but most spectral clustering algorithms cannot handle multi-class clustering problems directly; additional strategies are needed to extend them to the multi-class setting. Furthermore, most spectral clustering algorithms employ hard cluster membership, which is prone to getting trapped in local optima. In this paper, we present a new spectral clustering algorithm, named “Soft Cut”. It improves the normalized cut algorithm by introducing soft membership, and can be efficiently computed using a bound optimization algorithm. Our experiments with a variety of datasets have shown the promising performance of the proposed clustering algorithm.
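For orientation, a minimal sketch of the standard multi-class spectral clustering pipeline that such methods build on (normalized-Laplacian embedding followed by hard k-means assignment) is given below; this is the kind of hard-membership baseline the abstract contrasts with, not the Soft Cut algorithm itself, and the helper name and library calls are illustrative.

import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering_baseline(W, k):
    """Standard multi-class spectral clustering with hard assignments:
    embed with the k smallest eigenvectors of the normalized Laplacian, then k-means.
    W: (n, n) symmetric affinity matrix; k: number of clusters."""
    d = W.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L_sym = np.eye(len(W)) - d_inv_sqrt @ W @ d_inv_sqrt     # normalized Laplacian
    _, vecs = eigh(L_sym)                                    # eigenvalues ascending
    U = vecs[:, :k]
    U /= np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)    # hard cluster membership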

4 0.51683402 49 nips-2005-Convergence and Consistency of Regularized Boosting Algorithms with Stationary B-Mixing Observations

Author: Aurelie C. Lozano, Sanjeev R. Kulkarni, Robert E. Schapire

Abstract: We study the statistical convergence and consistency of regularized Boosting methods, where the samples are not independent and identically distributed (i.i.d.) but come from empirical processes of stationary β-mixing sequences. Utilizing a technique that constructs a sequence of independent blocks close in distribution to the original samples, we prove the consistency of the composite classifiers resulting from a regularization achieved by restricting the 1-norm of the base classifiers’ weights. When compared to the i.i.d. case, the nature of sampling manifests in the consistency result only through generalization of the original condition on the growth of the regularization parameter.

5 0.34217212 153 nips-2005-Policy-Gradient Methods for Planning

Author: Douglas Aberdeen

Abstract: Probabilistic temporal planning attempts to find good policies for acting in domains with concurrent durative tasks, multiple uncertain outcomes, and limited resources. These domains are typically modelled as Markov decision problems and solved using dynamic programming methods. This paper demonstrates the application of reinforcement learning, in the form of a policy-gradient method, to these domains. Our emphasis is on large domains that are infeasible for dynamic programming. Our approach is to construct simple policies, or agents, for each planning task. The result is a general probabilistic temporal planner, named the Factored Policy-Gradient Planner (FPG-Planner), which can handle hundreds of tasks, optimising for probability of success, duration, and resource use.
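To illustrate the general idea of a factored policy gradient (one tiny agent per task, updated from episode returns), here is a generic REINFORCE-style sketch; it is not the FPG-Planner itself, and the logistic start/skip agents, data layout, and learning rate are assumptions made for exposition.

import numpy as np

def reinforce_update(thetas, episodes, lr=0.01):
    """One REINFORCE-style gradient step for a factored policy: each task has its own
    logistic agent deciding start/skip from a NumPy observation vector.
    episodes: list of (obs_seq, act_seq, R) where act_seq[t][k] is task k's 0/1 decision
    at step t and R is the episode's return. Returns the updated parameter vectors."""
    grads = [np.zeros_like(th) for th in thetas]
    for obs_seq, act_seq, R in episodes:
        for obs, acts in zip(obs_seq, act_seq):
            for k, theta in enumerate(thetas):               # one agent per task
                p_start = 1.0 / (1.0 + np.exp(-obs @ theta))
                # d log pi / d theta for a Bernoulli(start) decision, scaled by the return
                grads[k] += (acts[k] - p_start) * obs * R
    return [th + lr * g / max(len(episodes), 1) for th, g in zip(thetas, grads)]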

6 0.33032304 8 nips-2005-A Criterion for the Convergence of Learning with Spike Timing Dependent Plasticity

7 0.32358533 181 nips-2005-Spiking Inputs to a Winner-take-all Network

8 0.31673282 96 nips-2005-Inference with Minimal Communication: a Decision-Theoretic Variational Approach

9 0.31623623 30 nips-2005-Assessing Approximations for Gaussian Process Classification

10 0.31576106 67 nips-2005-Extracting Dynamical Structure Embedded in Neural Activity

11 0.31233063 157 nips-2005-Principles of real-time computing with feedback applied to cortical microcircuit models

12 0.31044427 20 nips-2005-Affine Structure From Sound

13 0.30927813 90 nips-2005-Hot Coupling: A Particle Approach to Inference and Normalization on Pairwise Undirected Graphs

14 0.30909926 111 nips-2005-Learning Influence among Interacting Markov Chains

15 0.30640572 193 nips-2005-The Role of Top-down and Bottom-up Processes in Guiding Eye Movements during Visual Search

16 0.3044183 200 nips-2005-Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery

17 0.30415106 136 nips-2005-Noise and the two-thirds power Law

18 0.30319417 32 nips-2005-Augmented Rescorla-Wagner and Maximum Likelihood Estimation

19 0.30045205 14 nips-2005-A Probabilistic Interpretation of SVMs with an Application to Unbalanced Classification

20 0.30011925 169 nips-2005-Saliency Based on Information Maximization