238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks


Source: pdf

Author: Jaldert Rombouts, Pieter Roelfsema, Sander M. Bohte

Abstract: A key function of brains is undoubtedly the abstraction and maintenance of information from the environment for later use. Neurons in association cortex play an important role in this process: by learning these neurons become tuned to relevant features and represent the information that is required later as a persistent elevation of their activity [1]. It is however not well known how such neurons acquire these task-relevant working memories. Here we introduce a biologically plausible learning scheme grounded in Reinforcement Learning (RL) theory [2] that explains how neurons become selective for relevant information by trial and error learning. The model has memory units which learn useful internal state representations to solve working memory tasks by transforming partially observable Markov decision problems (POMDP) into MDPs. We propose that synaptic plasticity is guided by a combination of attentional feedback signals from the action selection stage to earlier processing levels and a globally released neuromodulatory signal. Feedback signals interact with feedforward signals to form synaptic tags at those connections that are responsible for the stimulus-response mapping. The neuromodulatory signal interacts with tagged synapses to determine the sign and strength of plasticity. The learning scheme is generic because it can train networks in different tasks, simply by varying inputs and rewards. It explains how neurons in association cortex learn to 1) temporarily store task-relevant information in non-linear stimulus-response mapping tasks [1, 3, 4] and 2) learn to optimally integrate probabilistic evidence for perceptual decision making [5, 6].

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Neurons in association cortex play an important role in this process: by learning these neurons become tuned to relevant features and represent the information that is required later as a persistent elevation of their activity [1]. [sent-14, score-0.546]

2 Here we introduce a biologically plausible learning scheme grounded in Reinforcement Learning (RL) theory [2] that explains how neurons become selective for relevant information by trial and error learning. [sent-16, score-0.407]

3 The model has memory units which learn useful internal state representations to solve working memory tasks by transforming partially observable Markov decision problems (POMDP) into MDPs. [sent-17, score-1.172]

4 We propose that synaptic plasticity is guided by a combination of attentional feedback signals from the action selection stage to earlier processing levels and a globally released neuromodulatory signal. [sent-18, score-0.86]

5 Feedback signals interact with feedforward signals to form synaptic tags at those connections that are responsible for the stimulus-response mapping. [sent-19, score-0.657]

6 The neuromodulatory signal interacts with tagged synapses to determine the sign and strength of plasticity. [sent-20, score-0.366]

7 It explains how neurons in association cortex learn to 1) temporarily store task-relevant information in non-linear stimulus-response mapping tasks [1, 3, 4] and 2) learn to optimally integrate probabilistic evidence for perceptual decision making [5, 6]. [sent-22, score-0.619]

8 1 Introduction By giving reward at the right times, animals like monkeys can be trained to perform complex tasks that require the mapping of sensory stimuli onto responses, the storage of information in working memory and the integration of uncertain sensory evidence. [sent-23, score-0.939]

9 We propose a simple biologically plausible neural network model that can solve a variety of working memory tasks. [sent-25, score-0.554]

10 The model has memory units inspired by neurons in lateral intraparietal (LIP) cortex and prefrontal cortex. [sent-27, score-0.941]

11 Such neurons exhibit persistent activations for task related cues in visual working memory tasks [1, 11, 4]. [sent-28, score-0.84]

12 Memory units learn to represent an internal state that allows the network to solve working memory tasks by transforming POMDPs into MDPs [25]. [sent-29, score-0.91]

13 The first is a synaptic tag [12] that arises from an interaction between feedforward and feedback activations. [sent-31, score-0.532]

14 Tags form on those synapses that are responsible for the chosen actions by an attentional feedback process [13]. [sent-32, score-0.411]

15 global neuromodulatory signal δ that reflects the TD error, and this signal interacts with the tags to yield synaptic plasticity. [sent-35, score-0.711]

16 The persistence of tags permits learning if time passes between synaptic activity and the animal’s choice, for example if information is stored in working memory or evidence accumulates before a decision is made. [sent-37, score-0.973]

17 The learning rules are biologically plausible because the information required for computing the synaptic updates is available at the synapse. [sent-38, score-0.46]

18 Instantaneous units $x_i$ encode sensory inputs $s_i(t)$, and + and - units encode positive and negative changes in sensory inputs with respect to the previous time step $t - 1$: $x_i^-(t) = [s_i(t-1) - s_i(t)]^+$, $x_i^+(t) = [s_i(t) - s_i(t-1)]^+$; (1) where [. [sent-46, score-1.23]

19 Each sensory variable $s_i$ is thus represented by three units $x_i$, $x_i^+$, $x_i^-$ (we only explicitly write the time dependence if it is ambiguous). [sent-48, score-0.578]
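
The three-unit encoding of eq. (1) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that $[\cdot]^+$ denotes rectification (its definition is cut off in the extracted text), and the function names `rectify` and `sensory_units` are hypothetical.

```python
def rectify(value):
    """Assumed meaning of [.]^+: keep positive values, map the rest to 0."""
    return value if value > 0.0 else 0.0


def sensory_units(s_now, s_prev):
    """Return (instantaneous, plus, minus) activations for each input s_i."""
    instantaneous = list(s_now)                                    # x_i
    plus = [rectify(a - b) for a, b in zip(s_now, s_prev)]         # x_i^+
    minus = [rectify(b - a) for a, b in zip(s_now, s_prev)]        # x_i^-
    return instantaneous, plus, minus


# Input 0 switches on, input 1 switches off, input 2 stays constant.
x, x_plus, x_minus = sensory_units([1.0, 0.0, 0.5], [0.0, 1.0, 0.5])
print(x, x_plus, x_minus)
```

Only the inputs that changed since the previous time step drive the + and - units, which is what lets downstream units respond to stimulus onsets and offsets rather than to sustained input.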

20 We denote the set of differentiating units as x . [sent-49, score-0.373]

21 The hidden layer models the association cortex and it contains regular units and memory units. [sent-50, score-1.064]

22 1, circles) are fully connected to the instantaneous units $i$ in the sensory layer by connections $v_{ij}^R$; $v_{0j}^R$ is a bias weight. [sent-52, score-0.709]

23 1, diamonds) are fully connected to the +/- units in the sensory layer by connections $v_{lm}^M$ and they derive their activations $y_m^M(t)$ by integrating their inputs: $y_m^M = \sigma(a_m^M)$ with $a_m^M = a_m^M(t-1) + \sum_l v_{lm}^M x_l$, (3) with $\sigma$ as defined in eqn. [sent-55, score-1.119]
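
The integration in eq. (3) can be sketched as a small stateful unit. This is an illustrative sketch under assumptions: $\sigma$ is taken to be the plain logistic function (the source only says "sigmoidal" and its defining equation is not in the extracted text), and the class name `MemoryUnit` is hypothetical.

```python
import math


def sigmoid(a):
    """Logistic function, assumed here as the sigmoidal activation."""
    return 1.0 / (1.0 + math.exp(-a))


class MemoryUnit:
    """Accumulates weighted +/- inputs over time, as in eq. (3)."""

    def __init__(self, weights):
        self.weights = list(weights)  # v_lm^M, one weight per +/- input unit
        self.a = 0.0                  # integrated input a_m^M

    def step(self, x):
        # a_m(t) = a_m(t-1) + sum_l v_lm x_l ; output y_m = sigmoid(a_m)
        self.a += sum(v * xl for v, xl in zip(self.weights, x))
        return sigmoid(self.a)

    def reset(self):
        # Clear the integrated input, e.g. at the end of a trial.
        self.a = 0.0


unit = MemoryUnit([2.0, -2.0])
# A single onset event followed by two time steps without input changes.
trace = [unit.step(x) for x in ([1.0, 0.0], [0.0, 0.0], [0.0, 0.0])]
print(trace)
```

Because the +/- inputs are zero whenever the stimulus does not change, the unit's activity stays elevated after a brief cue, which is the persistent-activity behaviour the text describes.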

24 Output layer units $k$ are fully connected to the hidden layer by connections $w_{jk}^R$ (for regular hiddens, $w_{0k}^R$ is a bias weight) and $w_{mk}^M$ (for memory hiddens). [sent-57, score-1.153]

25 Activations are computed as: $q_k = \sum_j y_j^R w_{jk}^R + \sum_m y_m^M w_{mk}^M$. [sent-58, score-0.387]

26 (5) k exp qk The WTA mechanism then sets the activation of the winning unit to 1 and the activation of all other units to 0: $z_k = \delta_{kK}$, where $\delta_{kK}$ is the Kronecker delta function. [sent-61, score-0.81]
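
The output-layer computation and the winner-take-all step can be sketched as below. This is a minimal sketch with made-up numbers: bias weights are omitted, and the winner $K$ is picked greedily purely for illustration, because the stochastic selection rule of eq. (5) is not recoverable from the extracted text. The names `q_values` and `winner_take_all` are hypothetical.

```python
def q_values(y_reg, w_reg, y_mem, w_mem):
    """q_k = sum_j y_j^R w_jk^R + sum_m y_m^M w_mk^M for each output unit k."""
    n_out = len(w_reg[0])
    q = []
    for k in range(n_out):
        regular = sum(y * row[k] for y, row in zip(y_reg, w_reg))
        memory = sum(y * row[k] for y, row in zip(y_mem, w_mem))
        q.append(regular + memory)
    return q


def winner_take_all(winner, n_out):
    """z_k = 1 for the winning unit K and 0 for all other units."""
    return [1 if k == winner else 0 for k in range(n_out)]


y_reg = [1.0, 0.5]                           # regular hidden activations
w_reg = [[0.2, 0.0, 0.1], [0.4, 0.2, 0.0]]   # w_jk^R, rows index j
y_mem = [0.5]                                # memory hidden activations
w_mem = [[0.0, 1.0, 0.2]]                    # w_mk^M, rows index m

q = q_values(y_reg, w_reg, y_mem, w_mem)
K = max(range(len(q)), key=lambda k: q[k])   # greedy choice (assumption)
z = winner_take_all(K, len(q))
print(q, K, z)
```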

27 The winning unit sends feedback signals to the earlier processing layers, informing the rest of the network about the action that was taken. [sent-62, score-0.403]

28 This feedback signal interacts with the feedforward activations to give rise to synaptic tags on those synapses that were involved in taking the decision. [sent-63, score-0.968]

29 The tags then interact with a neuromodulatory signal δ, which codes a TD error, to modify synaptic strengths. [sent-64, score-0.618]

30 The second is a global neuromodulatory signal δ which interacts with these tags to yield synaptic plasticity. [sent-71, score-0.668]

31 If λγ > 0, tags decay exponentially so that synapses that were responsible for previous actions are also assigned credit for the currently observed error. [sent-75, score-0.363]

32 Equivalently, updates for synapses between memory units and motor units are: $\Delta w_{mk}^M = \beta \delta(t) \mathit{Tag}_{mk}^M$, $\Delta \mathit{Tag}_{mk}^M = (\lambda\gamma - 1) \mathit{Tag}_{mk}^M + y_m^M z_k$. (10) [sent-76, score-1.507]
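
The tag and weight updates of eq. (10) can be sketched as follows. The ordering (tags first, then weights within the same call) and all numeric values are assumptions made for illustration; the function name `update_memory_output_synapses` is hypothetical.

```python
def update_memory_output_synapses(w, tags, y_mem, z, delta, beta, lam, gamma):
    """Apply eq. (10) in place to weights w[m][k] and tags[m][k]."""
    for m in range(len(w)):
        for k in range(len(w[m])):
            # Delta Tag = (lambda*gamma - 1) * Tag + y_m * z_k
            tags[m][k] += (lam * gamma - 1.0) * tags[m][k] + y_mem[m] * z[k]
            # Delta w = beta * delta * Tag
            w[m][k] += beta * delta * tags[m][k]


w = [[0.0, 0.0]]        # one memory unit, two output units
tags = [[0.5, 0.0]]     # an old tag on the synapse to output unit 0
update_memory_output_synapses(
    w, tags, y_mem=[0.8], z=[0, 1], delta=0.5, beta=0.1, lam=0.5, gamma=0.8
)
print(tags, w)
```

In this example the old tag on the non-selected synapse shrinks by the factor $\lambda\gamma = 0.4$, while the synapse onto the winning unit acquires a fresh tag equal to the presynaptic activity, so both earlier and current choices share credit for the TD error.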

33 The intuition for the last equation is that the winning output unit K provides feedback to the units in the association layer that were responsible for its activation. [sent-78, score-0.931]

34 Association units with a strong feedforward connection also have a strong feedback connection. [sent-79, score-0.616]

35 As a result, synapses onto association units that provided strong input to the winning unit will have the strongest plasticity. [sent-80, score-0.716]

36 For convenience, we have assumed that feedforward and feedback weights are symmetrical, but they can also be trained as in [13]. [sent-82, score-0.356]

37 For the updates for the synapses between +/- sensory units and memory units we first approximate the activation $a_m^M$ (see eqn. [sent-83, score-1.391]

38 (3)) as: $a_m^M = a_m^M(t-1) + \sum_l v_{lm}^M x_l \approx \sum_l v_{lm}^M \sum_{t'=0}^{t} x_l(t')$, (15) which is a good approximation if the synapses $v_{lm}^M$ change slowly. [sent-84, score-0.591]

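39 We can then write the updates as: $\Delta v_{lm}^M = -\beta \frac{\partial E}{\partial q_K} \mathit{Tag}_{lm}^M = \beta \delta \mathit{Tag}_{lm}^M$, (16) $\Delta \mathit{Tag}_{lm}^M = -\mathit{Tag}_{lm}^M + \frac{\partial q_K}{\partial y_m^M} \frac{\partial y_m^M}{\partial a_m^M} \frac{\partial a_m^M}{\partial v_{lm}^M}$ (17) $= -\mathit{Tag}_{lm}^M + w_{Km}^M y_m^M(t) (1 - y_m^M(t)) \sum_{t'=0}^{t} x_l(t')$. [sent-85, score-0.64]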

40 (18) Note that one can interpret a memory unit as a regular one that receives all sensory input in a trial simultaneously. [sent-86, score-0.593]

41 For synapses onto memory units, we set λ = 0 to arrive at the last equation. [sent-87, score-0.404]

42 The intuition behind the last equation is that because the activity of a memory unit does not decay, the influence of its inputs xl on the activity in the motor layer does not decay either (λγ = 0). [sent-88, score-0.82]
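
The tag on a synapse from a +/- sensory unit onto a memory unit can be sketched as below. Since the tag change is "minus the old tag plus a new term", the tag simply takes the value of that term: feedback weight from the winning unit, times the sigmoid derivative $y(1-y)$, times the summed input history. This is a sketch under assumptions: a logistic activation, feedback weights equal to the feedforward ones, and hypothetical names (`memory_input_tags`, `x_sum`).

```python
def memory_input_tags(w_feedback, y_mem, x_sum):
    """Return tags[l][m] = w_Km * y_m * (1 - y_m) * sum over time of x_l.

    w_feedback[m]: weight between memory unit m and the winning unit K.
    y_mem[m]:      current activation of memory unit m.
    x_sum[l]:      input of +/- unit l summed since the start of the trial.
    """
    return [
        [w_feedback[m] * y_mem[m] * (1.0 - y_mem[m]) * x_sum[l]
         for m in range(len(y_mem))]
        for l in range(len(x_sum))
    ]


# One memory unit; input 0 was active three times, input 1 never.
tags = memory_input_tags(w_feedback=[2.0], y_mem=[0.5], x_sum=[3.0, 0.0])
print(tags)
```

Only inputs that actually contributed during the trial receive a tag, and the tag does not fade with time because the summed input history, like the memory unit's activity, does not decay.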

43 (6) is set to 0 (see [2]) and after the synaptic updates we reset the memory units and synaptic tags, so that there is no confounding between different trials. [sent-91, score-1.269]

44 AuGMEnT is biologically plausible because the information required for the synaptic updates is locally available by the interaction of feedforward and feedback signals and a globally released neuromodulator coding TD errors. [sent-92, score-0.793]

45 3 Experiments We tested AuGMEnT on a set of memory tasks that have been used to investigate the effects of training on neuronal activity in area LIP. [sent-94, score-0.444]

46 Across all of our simulations, we fixed the configuration of the association layer (three regular units, four memory units) and Q-layer (three output units, for directing gaze to the left, center or right of a virtual screen). [sent-95, score-0.61]

47 The full task reward $r_{fin}$ was given if this saccade was accurate, while we aborted trials and gave no reward if the model made the wrong eye-movement or broke fixation before the go signal. [sent-100, score-0.433]

48 5, which shifts the sigmoidal activation function for association units so that units with little input have almost zero output. [sent-106, score-0.915]

49 This task requires a non-linear transformation and cannot be solved by a direct mapping from sensory units to Q-value units. [sent-116, score-0.539]

50 After fixating for two timesteps, a cue was presented on the left or right and a small shaping reward $r_{fix}$ was given. [sent-120, score-0.443]

51 In the association layer, a regular unit and two memory units are color coded gray, green and orange, respectively. [sent-131, score-0.916]

52 Output units L,F ,R are colored green, blue and red, respectively. [sent-132, score-0.373]

53 D Selectivity indices of memory units in saccade/antisaccade task (black) and in pro-saccade only task (red). [sent-137, score-0.711]

54 We trained 10,000 randomly initialized networks with and without a shaping reward ($r_{fix} = 0$). [sent-150, score-0.34]

55 The Q-unit for fixating at the center had strongest activity at fixation onset and throughout the fixation and memory delays, whereas the Q-unit for the appropriate eye movement became more active after the go-signal. [sent-158, score-0.417]

56 Interestingly, the activity of the Q-cells also depended on cue-location during the memory delay, as is observed, for example, in the frontal eye fields [18]. [sent-159, score-0.458]

57 This activity derives from memory units in the association layer that maintain a trace of the cue as persistent elevation of their activity and are also tuned to the difference between pro- and antisaccade trials. [sent-160, score-1.417]

58 To illustrate this, we defined selectivity indices (SIs) to characterize the tuning of memory units to the difference between pro- or antisaccade trials and to the difference in cue location. [sent-161, score-0.94]

59 The sensitivity of units to differences in trial types, SItype was |0. [sent-162, score-0.449]

60 We trained 100 networks and found that units tuned to cue-location also tended to be selective for trial-type (black data points in Fig. [sent-167, score-0.504]

61 To show that the association layer only learns to represent relevant features, we trained the same 100 networks using the same stimuli, but now only required pro- [sent-170, score-0.395]

62 B Model network C Population averages, conditional on LogLR-quintile (inset) for LIP neurons (redrawn from [6]) (top) and model memory units over 100,000 trials after learning had converged (bottom). [sent-191, score-0.941]

63 D Subjective weights inferred for a trained monkey (redrawn from [6]) (left) and average synaptic weights to an example memory unit (right) versus true symbol weights (A, right). [sent-192, score-0.794]

64 E Histogram of weight correlations for 400 memory units from 100 trained networks. [sent-193, score-0.712]

65 Memory units in the 97 converged networks now became tuned to cue-location but not to fixation point color (Fig. [sent-195, score-0.433]

66 We hypothesized that memory units could learn to integrate probabilistic evidence for a decision. [sent-202, score-0.713]

67 Yang and Shadlen [6] investigated how monkeys learn to combine information about four briefly presented symbols, which provided probabilistic cues whether a red or green eye movement target was baited with reward (Fig. [sent-203, score-0.418]

68 A previous model with only one layer of modifiable synapses could learn a simplified, linear version of this task [19]. [sent-205, score-0.362]

69 3B) had four retinotopic fields with binary units for all possible symbols, a binary unit for the fixation mark and four binary units coding the locations of the colored targets on the virtual screen. [sent-215, score-0.835]

70 Due to the +/- units, this made 45 × 3 units in total. [sent-216, score-0.373]

71 Memory units integrated information for one of the choices over the symbol sequence and maintained information about the value of this choice as persistent activity during the memory delay. [sent-230, score-0.824]

72 The graphs show average activations of populations of real and model neurons in the four cue presentation epochs. [sent-233, score-0.383]

73 [Figure residue: plot panels with x-axis "Association layer units (reg." and y-axes "Convergence Rate" and "Median Convergence Speed"; tick labels omitted.] [sent-234, score-0.528]

77 For model units we rearranged the quintiles so that they were aligned in the last epoch to compute the population average. [sent-251, score-0.373]

78 Synaptic weights from input neurons to memory cells became strongly correlated to the true weights of the symbols (Fig. [sent-252, score-0.635]

79 Thus, the training of synaptic weights to memory neurons in parietal cortex can explain how the monkeys valuate the symbols [19]. [sent-254, score-1.054]

80 We trained 100 networks on the same task and computed Spearman correlations for the memory unit weights with the true weights and found that in general they learn to represent the symbols (Fig. [sent-255, score-0.704]
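
The rank-correlation analysis mentioned above can be sketched with a small standard-library implementation of Spearman's coefficient (Pearson correlation of the ranks, with tied values sharing their average rank). The weight values below are made up for illustration and are not taken from the paper; the function names are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


true_weights = [-0.9, -0.5, 0.0, 0.5, 0.9]       # hypothetical symbol weights
learned_weights = [-1.2, -0.1, -0.3, 0.4, 2.0]   # hypothetical synaptic weights
rho = spearman(true_weights, learned_weights)
print(round(rho, 3))
```

A rank-based measure fits this comparison because the learned synaptic weights need only preserve the ordering of the symbol weights, not their scale.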

81 We scaled the number of association units by powers of two, from $2^1 = 2$ (yielding 6 regular units and 8 memory units) to $2^7 = 128$ (yielding 384 regular and 512 memory units). [sent-260, score-1.547]
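
The network sizes in this scaling experiment follow directly from the base configuration of three regular and four memory units stated earlier. A short check of the arithmetic (variable names are hypothetical):

```python
BASE_REGULAR = 3  # regular association units in the base network
BASE_MEMORY = 4   # memory association units in the base network

# Map each scale factor 2^1 .. 2^7 to (regular units, memory units).
sizes = {
    2 ** exponent: (BASE_REGULAR * 2 ** exponent, BASE_MEMORY * 2 ** exponent)
    for exponent in range(1, 8)
}

for factor, (regular, memory) in sizes.items():
    print(f"x{factor}: {regular} regular, {memory} memory")
```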

82 We first evaluated these scaled networks with the standard set of learning parameters and found that these yielded stable results within a wide range but that performance deteriorated for the largest networks (from $2^6 = 64$; 192 regular units and 256 memory units) (Fig. [sent-263, score-0.839]

83 4 Discussion We have shown that AuGMEnT can train networks to solve working memory tasks that require nonlinear stimulus-response mappings and the integration of sensory evidence in a biologically plausible way. [sent-270, score-0.764]

84 The network is trained by a form of SARSA(λ) [10, 2], and synaptic updates minimize TD errors by stochastic gradient descent. [sent-272, score-0.459]
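
For orientation, the TD error used in textbook SARSA can be sketched as below. This is the standard form, given here as an assumption: the paper's own definition of $\delta$ (eq. 6) is not part of the extracted text, apart from the remark that a term is set to 0 at the end of a trial. The function name and arguments are hypothetical.

```python
def sarsa_td_error(reward, q_prev, q_next, gamma, terminal=False):
    """Textbook SARSA error: delta = r + gamma * Q(s', a') - Q(s, a).

    At the end of a trial there is no next action, so the bootstrap
    term is dropped (treated as 0).
    """
    bootstrap = 0.0 if terminal else gamma * q_next
    return reward + bootstrap - q_prev


# Mid-trial step: small reward, next action valued higher than expected.
delta_mid = sarsa_td_error(reward=0.2, q_prev=0.5, q_next=1.0, gamma=0.9)
# Final step of a trial: only the received reward is compared to the estimate.
delta_end = sarsa_td_error(reward=1.5, q_prev=1.0, q_next=0.0, gamma=0.9,
                           terminal=True)
print(delta_mid, delta_end)
```

A positive $\delta$ means the outcome was better than predicted, so tagged synapses are strengthened; a negative $\delta$ weakens them.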

85 Technically, working memory tasks are Partially Observable Markov Decision Processes (POMDPs), because current observations do not contain the information to make optimal decisions [25]. [sent-275, score-0.452]

86 Although AuGMEnT is not a solution for all POMDPs, as these are in general intractable [25], its simple learning mechanism is well able to learn challenging working memory tasks. [sent-276, score-0.454]

87 The problem of learning new working memory representations by reinforcement learning is not well-studied. [sent-277, score-0.445]

88 Some early work used the biologically implausible backpropagation-through-time algorithm to learn memory representations [26, 27]. [sent-278, score-0.376]

89 Most other work pre-wires some aspects of working memory and only has a single layer of plastic weights (e. [sent-279, score-0.581]

90 This model is able to learn a variety of working memory tasks, but it requires a teaching signal that provides the correct actions on each time-step and the architecture and learning rules are elaborate. [sent-283, score-0.463]

91 AuGMEnT explains how neurons become tuned to relevant sensory stimuli in sequential decision tasks that animals learn by trial and error. [sent-285, score-0.564]

92 The persistent activity of these memory cells could derive from intracellular processes, local circuit reverberations or recurrent activity in larger networks spanning cortex, thalamus and basal ganglia [31]. [sent-287, score-0.732]

93 The learning scheme adopts previously proposed ideas that globally released neuromodulatory signals code deviations from reward expectancy and gate synaptic plasticity [8, 9, 14]. [sent-288, score-0.668]

94 In addition to this neuromodulatory signal, plasticity in AuGMEnT is gated by an attentional feedback signal that tags synapses responsible for the chosen action. [sent-289, score-0.794]

95 Such a feedback signal exists in the brain because neurons at the motor stage that code a selected action enhance the activity of upstream neurons that provided input for this action [32], a signal that explains a corresponding shift of visual attention [33]. [sent-290, score-0.902]

96 Although the hypothesis that attentional feedback controls the formation of tags is new, there is ample evidence for the existence of synaptic tags [34, 12]. [sent-294, score-0.821]

97 Interestingly, neuromodulatory signals influence synaptic plasticity even if released seconds or minutes later than the plasticity-inducing event [12, 35], which supports that they interact with a trace of the stimulus, i. [sent-296, score-0.57]

98 Here we have shown how interactions between synaptic tags and neuromodulatory signals explain how neurons in association areas acquire working memory representations for apparently disparate tasks that rely on working memory or decision making. [sent-299, score-1.781]

99 Making working memory work: a computational model of learning in the prefrontal cortex and basal ganglia. [sent-339, score-0.572]

100 Learning to use working memory in partially observable environments through dopaminergic reinforcement. [sent-431, score-0.384]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('units', 0.373), ('synaptic', 0.289), ('memory', 0.268), ('xation', 0.204), ('neurons', 0.173), ('feedback', 0.155), ('layer', 0.155), ('cue', 0.152), ('tags', 0.149), ('qk', 0.149), ('neuromodulatory', 0.137), ('synapses', 0.136), ('sensory', 0.131), ('vlm', 0.121), ('working', 0.116), ('symbols', 0.11), ('association', 0.109), ('augment', 0.108), ('activity', 0.108), ('agjk', 0.103), ('reward', 0.098), ('dopamine', 0.091), ('feedforward', 0.088), ('agij', 0.086), ('aglm', 0.086), ('rf', 0.082), ('cortex', 0.081), ('attentional', 0.079), ('regular', 0.078), ('trials', 0.078), ('trial', 0.076), ('sarsa', 0.076), ('persistent', 0.075), ('si', 0.074), ('wjk', 0.072), ('shaping', 0.072), ('biologically', 0.072), ('trained', 0.071), ('lip', 0.07), ('antisaccade', 0.069), ('loglr', 0.069), ('xate', 0.069), ('tasks', 0.068), ('reinforcement', 0.061), ('basal', 0.061), ('activation', 0.06), ('networks', 0.06), ('activations', 0.058), ('motor', 0.058), ('winning', 0.058), ('yj', 0.057), ('ym', 0.057), ('action', 0.056), ('neuroscience', 0.056), ('monkeys', 0.056), ('plasticity', 0.054), ('td', 0.054), ('agmk', 0.052), ('baited', 0.052), ('ganglia', 0.052), ('redrawn', 0.052), ('wmk', 0.052), ('vij', 0.05), ('interacts', 0.05), ('updates', 0.05), ('mark', 0.049), ('plausible', 0.049), ('network', 0.049), ('green', 0.048), ('delay', 0.048), ('cues', 0.047), ('xl', 0.046), ('prefrontal', 0.046), ('roelfsema', 0.046), ('signals', 0.045), ('released', 0.045), ('signal', 0.043), ('decision', 0.043), ('saccade', 0.042), ('weights', 0.042), ('responsible', 0.041), ('frontal', 0.041), ('eye', 0.041), ('tagging', 0.041), ('unit', 0.04), ('red', 0.04), ('wkj', 0.04), ('ix', 0.039), ('explains', 0.037), ('decay', 0.037), ('pomdps', 0.036), ('netherlands', 0.036), ('zk', 0.036), ('learn', 0.036), ('integrate', 0.036), ('ar', 0.035), ('task', 0.035), ('parietal', 0.035), ('shadlen', 0.035), ('mechanism', 0.034)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

2 0.19596612 347 nips-2012-Towards a learning-theoretic analysis of spike-timing dependent plasticity

Author: David Balduzzi, Michel Besserve

Abstract: This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning. We introduce a model, the selectron, that (i) arises as the fast time constant limit of leaky integrate-and-fire neurons equipped with spiking timing dependent plasticity (STDP) and (ii) is amenable to theoretical analysis. We show that the selectron encodes reward estimates into spikes and that an error bound on spikes is controlled by a spiking margin and the sum of synaptic weights. Moreover, the efficacy of spikes (their usefulness to other reward maximizing selectrons) also depends on total synaptic strength. Finally, based on our analysis, we propose a regularized version of STDP, and show the regularization improves the robustness of neuronal learning when faced with multiple stimuli. 1

3 0.16934082 152 nips-2012-Homeostatic plasticity in Bayesian spiking networks as Expectation Maximization with posterior constraints

Author: Stefan Habenschuss, Johannes Bill, Bernhard Nessler

Abstract: Recent spiking network models of Bayesian inference and unsupervised learning frequently assume either inputs to arrive in a special format or employ complex computations in neuronal activation functions and synaptic plasticity rules. Here we show in a rigorous mathematical treatment how homeostatic processes, which have previously received little attention in this context, can overcome common theoretical limitations and facilitate the neural implementation and performance of existing models. In particular, we show that homeostatic plasticity can be understood as the enforcement of a ’balancing’ posterior constraint during probabilistic inference and learning with Expectation Maximization. We link homeostatic dynamics to the theory of variational inference, and show that nontrivial terms, which typically appear during probabilistic inference in a large class of models, drop out. We demonstrate the feasibility of our approach in a spiking WinnerTake-All architecture of Bayesian inference and learning. Finally, we sketch how the mathematical framework can be extended to richer recurrent network architectures. Altogether, our theory provides a novel perspective on the interplay of homeostatic processes and synaptic plasticity in cortical microcircuits, and points to an essential role of homeostasis during inference and learning in spiking networks. 1

4 0.15999845 24 nips-2012-A mechanistic model of early sensory processing based on subtracting sparse representations

Author: Shaul Druckmann, Tao Hu, Dmitri B. Chklovskii

Abstract: Early stages of sensory systems face the challenge of compressing information from numerous receptors onto a much smaller number of projection neurons, a so called communication bottleneck. To make more efficient use of limited bandwidth, compression may be achieved using predictive coding, whereby predictable, or redundant, components of the stimulus are removed. In the case of the retina, Srinivasan et al. (1982) suggested that feedforward inhibitory connections subtracting a linear prediction generated from nearby receptors implement such compression, resulting in biphasic center-surround receptive fields. However, feedback inhibitory circuits are common in early sensory circuits and furthermore their dynamics may be nonlinear. Can such circuits implement predictive coding as well? Here, solving the transient dynamics of nonlinear reciprocal feedback circuits through analogy to a signal-processing algorithm called linearized Bregman iteration we show that nonlinear predictive coding can be implemented in an inhibitory feedback circuit. In response to a step stimulus, interneuron activity in time constructs progressively less sparse but more accurate representations of the stimulus, a temporally evolving prediction. This analysis provides a powerful theoretical framework to interpret and understand the dynamics of early sensory processing in a variety of physiological experiments and yields novel predictions regarding the relation between activity and stimulus statistics.

5 0.14870149 195 nips-2012-Learning visual motion in recurrent neural networks

Author: Marius Pachitariu, Maneesh Sahani

Abstract: We present a dynamic nonlinear generative model for visual motion based on a latent representation of binary-gated Gaussian variables. Trained on sequences of images, the model learns to represent different movement directions in different variables. We use an online approximate inference scheme that can be mapped to the dynamics of networks of neurons. Probed with drifting grating stimuli and moving bars of light, neurons in the model show patterns of responses analogous to those of direction-selective simple cells in primary visual cortex. Most model neurons also show speed tuning and respond equally well to a range of motion directions and speeds aligned to the constraint line of their respective preferred speed. We show how these computations are enabled by a specific pattern of recurrent connections learned by the model. 1

6 0.13031508 190 nips-2012-Learning optimal spike-based representations

7 0.12791987 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models

8 0.12200215 341 nips-2012-The topographic unsupervised learning of natural sounds in the auditory cortex

9 0.11668558 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

10 0.11487515 65 nips-2012-Cardinality Restricted Boltzmann Machines

11 0.11483499 158 nips-2012-ImageNet Classification with Deep Convolutional Neural Networks

12 0.11205105 114 nips-2012-Efficient coding provides a direct link between prior and likelihood in perceptual Bayesian inference

13 0.11049869 79 nips-2012-Compressive neural representation of sparse, high-dimensional probabilities

14 0.10915852 358 nips-2012-Value Pursuit Iteration

15 0.1037283 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

16 0.10210401 94 nips-2012-Delay Compensation with Dynamical Synapses

17 0.087848224 90 nips-2012-Deep Learning of Invariant Features via Simulated Fixations in Video

18 0.087268896 273 nips-2012-Predicting Action Content On-Line and in Real Time before Action Onset – an Intracranial Human Study

19 0.083730437 322 nips-2012-Spiking and saturating dendrites differentially expand single neuron computation capacity

20 0.080830723 239 nips-2012-Neuronal Spike Generation Mechanism as an Oversampling, Noise-shaping A-to-D converter


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.179), (1, -0.059), (2, -0.196), (3, 0.135), (4, -0.017), (5, 0.25), (6, -0.008), (7, 0.012), (8, -0.005), (9, 0.1), (10, -0.025), (11, 0.016), (12, 0.003), (13, 0.097), (14, -0.01), (15, -0.009), (16, -0.009), (17, 0.016), (18, 0.012), (19, -0.065), (20, -0.019), (21, -0.062), (22, -0.036), (23, 0.024), (24, 0.014), (25, -0.016), (26, 0.045), (27, -0.003), (28, 0.14), (29, -0.027), (30, -0.021), (31, 0.02), (32, -0.024), (33, -0.028), (34, 0.014), (35, -0.072), (36, -0.01), (37, -0.098), (38, -0.163), (39, -0.085), (40, -0.071), (41, -0.056), (42, -0.057), (43, 0.069), (44, -0.055), (45, -0.066), (46, -0.03), (47, -0.027), (48, -0.033), (49, 0.087)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97392321 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

2 0.72665149 224 nips-2012-Multi-scale Hyper-time Hardware Emulation of Human Motor Nervous System Based on Spiking Neurons using FPGA

Author: C. M. Niu, Sirish Nandyala, Won J. Sohn, Terence Sanger

Abstract: Our central goal is to quantify the long-term progression of pediatric neurological diseases, such as the typical 10-15 year progression of child dystonia. To this end, quantitative models are convincing only if they can provide multi-scale details ranging from neuron spikes to limb biomechanics. The models also need to be evaluated in hyper-time, i.e. significantly faster than real-time, to produce useful predictions. We designed a platform with digital VLSI hardware for multi-scale hyper-time emulations of human motor nervous systems. The platform is constructed on a scalable, distributed array of Field Programmable Gate Array (FPGA) devices. All devices operate asynchronously with 1 millisecond time granularity, and the overall system is accelerated to 365x real-time. Each physiological component is implemented using models from well documented studies and can be flexibly modified. Thus the validity of emulation can be easily advised by neurophysiologists and clinicians. To maximize the speed of emulation, all calculations are implemented in combinational logic instead of clocked iterative circuits. This paper presents the methodology of building FPGA modules emulating a monosynaptic spinal loop. Emulated activities are qualitatively similar to real human data. Also discussed is the rationale of approximating neural circuitry by organizing neurons with sparse interconnections. In conclusion, our platform allows emulating pathological abnormalities such that motor symptoms will emerge and can be analyzed. It compels us to test the origins of childhood motor disorders and predict their long-term progressions.

1 Challenges of studying developmental motor disorders

There is currently no quantitative model of how a neuropathological condition, which mainly affects the function of neurons, ends up causing the functional abnormalities identified in clinical examinations.
The gap in knowledge is particularly evident for disorders in developing human nervous systems, i.e. childhood neurological diseases. In these cases, the ultimate clinical effect of cellular injury is compounded by a complex interplay among the child's injury, development, behavior, experience, plasticity, etc. Qualitative insight has been provided by clinical experiences into the association between particular types of injury and particular types of outcome. Their quantitative linkages, nevertheless, have yet to be created, neither in clinic nor in cellular physiological tests. This discrepancy is significantly more prominent for individual child patients, which makes it very difficult to estimate the efficacy of treatment plans. In order to understand the consequence of injury and discover new treatments, it is necessary to create a modeling toolset with certain design guidelines, such that child neurological diseases can be quantitatively analyzed.

Perhaps more than any other organ, the brain necessarily operates on multiple spatial and temporal scales. On the one hand, it is the neurons that perform fundamental computations, but neurons have to interact with large-scale organs (ears, eyes, skeletal muscles, etc.) to achieve global functions. This multi-scale nature deserves more attention in injuries, where the overall deficits depend on both the cellular effects of injuries and the propagated consequences. On the other hand, neural processes in developmental diseases usually operate on drastically different time scales, e.g. spinal reflex in milliseconds versus learning in years. Thus when studying motor nervous systems, mathematical modeling is convincing only if it can provide multi-scale details, ranging from neuron spikes to limb biomechanics; the models should also be evaluated with time granularity as small as 1 millisecond, while the evaluation needs to continue for trillions of cycles in order to cover years of life.
It is particularly challenging to describe the multi-scale nature of the human nervous system when modeling childhood movement disorders. Note that for a child who suffered brain injury at birth, the full development of all motor symptoms may easily take more than 10 years. Therefore the millisecond-based model needs to be evaluated significantly faster than real-time, otherwise the model will fail to produce any useful predictions in time. We have implemented realistic models for spiking motoneurons, sensory neurons, neural circuitry, muscle fibers and proprioceptors using VLSI and programmable logic technologies. All models are computed in Field Programmable Gate Array (FPGA) hardware at 365 times real-time. Therefore one year of disease progression can be assessed after one day of emulation. This paper presents the methodology of building the emulation platform. The results demonstrate that our platform is capable of producing physiologically realistic multi-scale signals, which are usually scarce in experiments. Successful emulations enabled by this platform will be used to verify theories of neuropathology. New treatment mechanisms and drug effects can also be emulated before animal experiments or clinical trials.

2 Methodology of multi-scale neural emulation

Figure 1: Illustration of the multi-scale nature of motor nervous system. A. Human arm. B. Monosynaptic spinal loop. C. Inner structure of muscle spindle.

The motor part of the human nervous system is responsible for maintaining body postures and generating voluntary movements. The multi-scale nature of the motor nervous system is demonstrated in Fig.1. When the elbow (Fig.1A) is maintaining a posture or performing a movement, a force is established by the involved muscle based on how much spiking excitation the muscle receives from its α-motoneurons (Fig.1B).
The α-motoneurons are regulated by a variety of sensory input, part of which comes directly from the proprioceptors in the muscle. As the primary proprioceptor found in skeletal muscles, a muscle spindle is another complex system that has its own microscopic Multiple-Input-Multiple-Output structure (Fig.1C). Spindles continuously provide information about the length and lengthening speed of the muscle fiber. A muscle with its regulating motoneurons, sensory neurons and proprioceptors constitutes a monosynaptic spinal loop. This minimalist neurophysiological structure is used as an example for explaining the multi-scale hyper-time emulation in hardware. Additional structures can be added to the backbone set-up using similar methodologies.

2.1 Modularized architecture for multi-scale models

Decades of studies on neurophysiology provided an abundance of models characterizing different components of the human motor nervous system. The informational characteristics of physiological components allowed us to model them as functional structures, i.e. each of which converts input signals to certain outputs. In particular, within a monosynaptic spinal loop illustrated in Fig.1B, stretching the muscle will elicit a chain of physiological activities in: muscle stretch ⇒ spindle ⇒ sensory neuron ⇒ synapse ⇒ motoneuron ⇒ muscle contraction. The adjacent components must have compatible interfaces, and the interfacing variables must also be physiologically realistic. In our design, each component is mathematically described in Table 1:

Table 1: Functional definition of neural models

COMPONENT    MATHEMATICAL DEFINITION
Neuron       $S(t) = f_{\mathrm{neuron}}(I, t)$
Synapse      $I(t) = f_{\mathrm{synapse}}(S, t)$
Muscle       $T(t) = f_{\mathrm{muscle}}(S, L, \dot{L}, t)$
Spindle      $A(t) = f_{\mathrm{spindle}}(L, \dot{L}, \Gamma_{\mathrm{dynamic}}, \Gamma_{\mathrm{static}}, t)$

All components are modeled as black-box functions that map the inputs to the outputs. The meanings of these mathematical definitions are explained below.
This design allows existing physiological models to be easily inserted and switched. In all models the input signals are time-varying, e.g. $I = I(t)$, $L = L(t)$, etc. The argument $t$ of the input signals is omitted throughout this paper.

2.2 Selection of models for emulation

Models were selected in consideration of their computational cost, physiological verisimilitude, and whether they can be adapted to the mathematical form defined in Table 1.

Model of Neuron

The informational process for a neuron is to take post-synaptic current $I$ as the input, and produce a binary spike train $S$ in the output. The neuron model adopted in the emulation was developed by Izhikevich [1]:

$$\dot{v} = 0.04v^2 + 5v + 140 - u + I \tag{1}$$
$$\dot{u} = a(bv - u) \tag{2}$$

if $v \ge 30$ mV, then $v \leftarrow c$, $u \leftarrow u + d$,

where $a$, $b$, $c$, $d$ are free parameters tuned to achieve certain firing patterns. Membrane potential $v$ directly determines a binary spike train $S(t)$ such that $S(t) = 1$ if $v \ge 30$, otherwise $S(t) = 0$. Note that $v$ in the Izhikevich model is in millivolts and time $t$ is in milliseconds. Therefore the coefficients in eq.1 need to be adjusted in correspondence to SI units.

Model of Synapse

When a pre-synaptic neuron spikes, i.e. $S(0) = 1$, an excitatory synapse subsequently issues an Excitatory Post-Synaptic Current (EPSC) that drives the post-synaptic neuron. Neural recording of hair cells in rats [2] provided evidence that the time profile of EPSC can be well characterized using the equation below:

$$I(t) = \begin{cases} V_m \times \left( e^{-\frac{t}{\tau_d V_m}} - e^{-\frac{t}{\tau_r V_m}} \right) & \text{if } t \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

The key parameters in a synapse model are the time constants for rising ($\tau_r$) and decaying ($\tau_d$). In our emulation $\tau_r = 0.001$ s and $\tau_d = 0.003$ s.

Model of Muscle force and electromyograph (EMG)

The primary effect of skeletal muscle is to convert α-motoneuron spikes $S$ into force $T$, depending on the muscle's instantaneous length $L$ and lengthening speed $\dot{L}$. We used Hill's muscle model in the emulation with parameter tuning described in [3].
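The neuron model of Eqs. 1-2 can be sketched in a few lines of Python. This is only an illustrative software sketch, not the authors' FPGA implementation: the parameter values (a = 0.02, b = 0.2, c = -65, d = 8) are the commonly used "regular spiking" settings and are an assumption here, as are the constant input current of 10 and the forward-Euler update with a 1 ms step (the platform itself uses a Backward Euler integrator in hardware). The function name `izhikevich_spikes` is hypothetical.

```python
def izhikevich_spikes(current, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """Return a binary spike train S, one entry per input sample (dt in ms).

    Parameter defaults are assumed 'regular spiking' values, not taken
    from the source text.
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spikes = []
    for i_in in current:
        # Right-hand sides of Eq. 1 and Eq. 2
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_in
        du = a * (b * v - u)
        # One forward-Euler step (an assumption made for brevity)
        v += dt * dv
        u += dt * du
        if v >= 30.0:
            # Spike: emit S = 1, then reset v and bump u
            spikes.append(1)
            v = c
            u += d
        else:
            spikes.append(0)
    return spikes


# One second of emulated time with a constant input current (assumed value)
train = izhikevich_spikes([10.0] * 1000)
print("spike count:", sum(train))
```

The output is a list of 0/1 values, which matches the black-box interface $S(t) = f_{\mathrm{neuron}}(I, t)$ from Table 1.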
Another measurable output of muscle is electromyograph (EMG). EMG is the small skin current polarized by motor unit action potential (MUAP) when it travels along muscle fibers. Models exist to describe the typical waveform picked up by surface EMG electrodes. In this project we chose to implement the one described in [4].

Model of Proprioceptor

Spindle is a sensory organ that provides the main source of proprioceptive information. As can be seen in Fig.1C, a spindle typically produces two afferent outputs (primary Ia and secondary II) according to its gamma fusimotor drives ($\Gamma_{\mathrm{dynamic}}$ and $\Gamma_{\mathrm{static}}$) and muscle states ($L$ and $\dot{L}$). There are currently no closed-form models describing spindle functions due to the spindle's significant nonlinearity. One representative model that numerically approximates the spindle dynamics was developed by Mileusnic et al. [5]. The model used differential equations to characterize a typical cat soleus spindle. Eqs.4-10 present a subset of this model for one type of spindle fiber (bag1):

$$\dot{x}_0 = \left( \frac{\Gamma_{\mathrm{dynamic}}}{\Gamma_{\mathrm{dynamic}} + \Omega_{\mathrm{bag1}}^2} - x_0 \right) / \tau \tag{4}$$
$$\dot{x}_1 = x_2 \tag{5}$$
$$\dot{x}_2 = \frac{1}{M} \left[ T_{SR} - T_B - T_{PR} - \Gamma_1 x_0 \right] \tag{6}$$

where

$$T_{SR} = K_{SR} (L - x_1 - L_{SR0}) \tag{7}$$
$$T_B = (B_0 + B_1 x_0) \cdot (x_1 - R) \cdot C_{SS} \cdot |x_2|^{0.3} \tag{8}$$
$$T_{PR} = K_{PR} (x_1 - L_{PR0}) \tag{9}$$
$$C_{SS} = \frac{2}{1 + e^{-1000 x_2}} - 1 \tag{10}$$

Eq.8 and 10 suggest that evaluating the spindle model requires multiplication, division as well as more complex arithmetic like polynomials and exponentials. The implementation details are described in Section 3.

2.3 Neuron connectivity with sparse interconnections

Although the number of spinal neurons (~1 billion) is significantly less compared to that of cortical neurons (~100 billion), a fully connected spinal network still means approximately 2 trillion synaptic endings [6]. Implementing such a huge number of synapses poses a major challenge, if it is possible at all, given limited hardware resources.
In this platform we approximated the neural connectivity by sparsely connecting sensory neurons to motoneurons as parallel pathways. We do not attempt to introduce the full connectivity. The rationale is that in a neural control system, the effect of a single neuron can be considered as mapping current state $x$ to change in state $\dot{x}$ through a band-limited channel. Therefore when a collection of neurons are firing stochastically, the probability of $\dot{x}$ depends on both $x$ and the firing behavior $s$ ($s = 1$ when spiking, otherwise $s = 0$) of each neuron, as such:

$$p(\dot{x}|x, s) = p(\dot{x}|s = 1)\,p(s = 1|x) + p(\dot{x}|s = 0)\,p(s = 0|x) \tag{11}$$

Eq.11 is a master equation that determines a probability flow on the state. From the Kramers-Moyal expansion we can associate this probability flow with a partial differential equation:

$$\frac{\partial}{\partial t} p(x, t) = \sum_{i=1}^{\infty} \left( -\frac{\partial}{\partial x} \right)^{i} D^{(i)}(x)\, p(x, t) \tag{12}$$

where $D^{(i)}(x)$ is a time-invariant term that modifies the change of probability density based on its $i$-th gradient. Under certain conditions [7, 8], $D^{(i)}(x)$ for $i > 2$ all vanish and therefore the probability flow can be described deterministically using a linear operator $\mathcal{L}$:

$$\frac{\partial}{\partial t} p(x, t) = \left[ -\frac{\partial}{\partial x} D^{(1)}(x) + \frac{\partial^2}{\partial x^2} D^{(2)}(x) \right] p(x, t) = \mathcal{L}\, p(x, t) \tag{13}$$

This means that various $\mathcal{L}$s can be superimposed to achieve complex system dynamics (illustrated in Fig.2A).

Figure 2: Functions of neuron population can be described as the combination of linear operators (A). Therefore the original neural function can be equivalently produced by sparsely connected neurons formalizing parallel pathways (B). Panel titles: A. Neuron function as superimposed linear operators. B. Equivalent network with sparse interconnections.

As a consequence, the statistical effect of two fully connected neuron populations is equivalent to ones that are only sparsely connected, as long as the probability flow can be described by the same $\mathcal{L}$.
For a movement task, in particular, it is the statistical effect from the neuron ensemble onto skeletal muscles that determines the global behavior. Therefore we argue that it is feasible to approximate the spinal cord connectivity by sparsely interconnecting sensory and motor neurons (Fig.2B). Here a pool of homogeneous sensory neurons projects to another pool of homogeneous α-motoneurons. Pseudorandom noise is added to the input of all homogeneous neurons within a population. It is worth noting that this approximation significantly reduces the number of synapses that need to be implemented in hardware.

3 Hardware implementation on FPGA

We select FPGA as the implementation device due to its inherent parallelism that resembles the nervous system. FPGA is favored over GPU or clustered CPUs because it is relatively easy to network hundreds of nodes under flexible protocols. The platform is distributed on multiple nodes of Xilinx Spartan-6 devices. The interfacing among FPGAs and computers is created using the OpalKelly development board XEM6010. The dynamic range of variables is tight in the models of the Izhikevich neuron, synapse and EMG. This helps maintain the accuracy of the models even when they are evaluated in 32-bit fixed-point arithmetic. The spindle model, in contrast, requires floating-point arithmetic due to its wide dynamic range and complex calculations (see eq.4-10). Hyper-time computations with floating-point numbers are resource consuming and therefore need to be implemented with special care.

3.1 Floating-point arithmetics in combinational logic

Our arithmetic implementations are compatible with the IEEE-754 standard. Typical floating-point arithmetic IP cores are either pipelined or based on iterative algorithms such as CORDIC, all of which require clocks to schedule the calculation. In our platform, no clock is provided for model evaluations, thus all arithmetic needs to be executed in pure combinational logic.
Taking advantage of combinational logic allows all model evaluations to be 1) fast: the evaluation time depends entirely on the propagating and settling time of signals, which is on the order of microseconds; and 2) parallel: each model is evaluated on its own circuit without waiting for any other results. Our implementations of adder and multiplier are inspired by the open source project "Free Floating-Point Madness", available at http://www.hmc.edu/chips/. Please contact the authors of this paper if the modified code is needed.

Fast combinational floating-point division

Floating-point division is even more resource demanding than multiplication. We avoided directly implementing the dividing algorithm by approximating it with additions and multiplications. Our approach is inspired by an algorithm described in [9], which provides a good approximation of the inverse square root for any positive number $x$ within one Newton-Raphson iteration:

$$Q(x) = \frac{1}{\sqrt{x}} \approx x_0 \left( 1.5 - \frac{x}{2} \cdot x_0^2 \right) \quad (x > 0) \tag{14}$$

where $x_0$ is the initial estimate of $1/\sqrt{x}$ that the iteration refines. $Q(x)$ can be implemented using only floating-point adders and multipliers. Thereby any division with a positive divisor can be achieved if two blocks of $Q(x)$ are concatenated:

$$\frac{a}{b} = \frac{a}{\sqrt{b} \cdot \sqrt{b}} = a \cdot Q(b) \cdot Q(b) \quad (b > 0) \tag{15}$$

This algorithm has been adjusted to also work with negative divisors ($b < 0$).

Numerical integrators for differential equations

Evaluating the instantaneous states of differential equation models requires a fixed-step numerical integrator. Backward Euler's Method was chosen to balance the numerical error and FPGA usage:

$$\dot{x} = f(x, t) \tag{16}$$
$$x_{n+1} = x_n + T f(x_{n+1}, t_{n+1}) \tag{17}$$

where $T$ is the sampling interval and $f(x, t)$ is the derivative function for state variable $x$.

3.2 Asynchronous spike-based communication between FPGA chips

Figure 3: Timing diagram of asynchronous spike-based communication. Signals shown: Clock, Spike, Counter.

FPGA nodes are networked by transferring 1-bit binary spikes to each other.
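The division scheme of Eqs. 14-15 can be illustrated in software. The sketch below is a hedged Python approximation, not the authors' combinational circuit: it takes the initial estimate from the bit-level trick and magic constant `0x5F375A86` described in [9], applies one Newton-Raphson step, and multiplies two inverse square roots to form a quotient. The source does not say how negative divisors are handled, so the sign handling here is an assumption, and the function names `inv_sqrt` and `divide` are hypothetical.

```python
import struct


def inv_sqrt(x):
    """Approximate 1/sqrt(x) for x > 0 with one Newton-Raphson step (Eq. 14)."""
    # Reinterpret the single-precision bit pattern of x as an unsigned integer
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Initial estimate from the magic constant described in [9]
    bits = 0x5F375A86 - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One Newton-Raphson refinement: only multiplications and a subtraction
    return y * (1.5 - 0.5 * x * y * y)


def divide(a, b):
    """Approximate a / b as a * Q(|b|) * Q(|b|) (Eq. 15), for b != 0."""
    q = inv_sqrt(abs(b))
    result = a * q * q
    # Assumed treatment of negative divisors: restore the sign afterwards
    return result if b > 0 else -result


print(divide(1.0, 3.0))   # close to 0.3333, within about half a percent
print(divide(6.0, -2.0))  # close to -3.0
```

Because each inverse square root carries a small relative error after a single iteration, the quotient is approximate; the sketch trades accuracy for using only adders and multipliers, which is the design goal stated above.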
Our design allowed the sender and the receiver to operate on independent clocks without having to synchronize. The timing diagram of the spike-based communication is shown in Fig.3. The sender issues Spike with a pulse width of $1/(365 \times F_{\mathrm{emu}})$ second. Each Spike then triggers a counting event on the receiver, while each Clock first reads the accumulated spike count and subsequently clears the counter. Note that the phase difference between Spike and Clock is not predictable due to asynchronicity.

3.3 Serialize neuron evaluations within a homogeneous population

Different neuron populations are instantiated as standalone circuits. Within each population, however, the homogeneous neurons mentioned in Section 2.3 are evaluated in series in order to optimize FPGA usage. Within each FPGA node all modules operate with a central clock, which is the only source allowed to trigger any updating event. Therefore the maximal number of neurons that can be serialized ($N_{\mathrm{serial}}$) is restrained by the following relationship:

$$F_{\mathrm{fpga}} = C \times N_{\mathrm{serial}} \times 365 \times F_{\mathrm{emu}} \tag{18}$$

Here $F_{\mathrm{fpga}}$ is the fastest clock rate that an FPGA can operate on; $C = 4$ is the minimal number of clock cycles needed for updating each state variable in the on-chip memory; $F_{\mathrm{emu}} = 1$ kHz is the time granularity of emulation (1 millisecond), and $365 \times F_{\mathrm{emu}}$ represents 365x real-time. Considering that Xilinx Spartan-6 FPGA devices peak at a 200 MHz central clock frequency, the theoretical maximum number of neurons that can be serialized is

$$N_{\mathrm{serial}} \le 200\ \mathrm{MHz} / (4 \times 365 \times 1\ \mathrm{kHz}) \approx 137 \tag{19}$$

In the current design we choose $N_{\mathrm{serial}} = 128$.

4 Results: emulated activities of motor nervous system

Figure 4 shows the implemented monosynaptic spinal loop in schematics and in operation. Each FPGA node is able to emulate monosynaptic spinal loops consisting of 1,024 sensory and 1,024 motor neurons, i.e. 2,048 neurons in total. The spike-based asynchronous communication is successful between two FPGA nodes.
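The serialization bound of Eqs. 18-19 is plain arithmetic and can be checked with a short Python sketch. The function name `max_serialized_neurons` and its argument names are hypothetical; the numbers are the ones quoted in the text.

```python
def max_serialized_neurons(f_fpga_hz, cycles_per_update, speedup, f_emu_hz):
    """Solve Eq. 18 for N_serial: F_fpga = C * N_serial * speedup * F_emu."""
    return f_fpga_hz / (cycles_per_update * speedup * f_emu_hz)


# Values quoted in the text: 200 MHz clock, C = 4, 365x real-time, 1 kHz
n_max = max_serialized_neurons(200e6, 4, 365, 1e3)
print(round(n_max, 2))  # about 136.99, which the text rounds to 137

# The chosen design value of 128 neurons per population fits under the bound
assert 128 <= n_max
```

With 128 neurons serialized per population, 8 parallel pathways give the 1,024 sensory and 1,024 motor neurons reported per FPGA node.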
Note that the emulation has to be significantly slowed down for on-line plotting. When the emulation is at full speed (365x real-time) the software front-end is not able to visualize the signals due to limited data throughput.

Figure 4: The neural emulation platform in operation. Left: Neural circuits implemented for each FPGA node including 2,048 neurons, organized as 8 parallel pathways of 128 SNs and 128 αMNs each. SN = Sensory Neuron; αMN = α-motoneuron. Center: One working FPGA node. Right: Two FPGA nodes networked using asynchronous spiking protocol.

The emulation platform successfully created multi-scale information when the muscle is externally stretched (Fig.5A). We also tested if our emulated motor system is able to produce the recruitment order and size principles observed in real physiological data. It has been well known that when a voluntary motor command is sent to the α-motoneuron pool, the motor units are recruited in an order such that small ones get recruited first, followed by the big ones [10]. The comparison between our results and real data is shown in Fig.5B, where the top panel shows 20 motor unit activities emulated using our platform, and the bottom panel shows decoded motor unit activities from real human EMG [11]. No qualitative difference was found.

5 Discussion and future work

We designed a hardware platform for emulating the multi-scale motor nervous activities in hyper-time. We managed to use a single Xilinx Spartan-6 FPGA node to emulate monosynaptic spinal loops consisting of 2,048 neurons, associated muscles and proprioceptors. The neurons are organized as parallel pathways with sparse interconnections. The emulation is successfully accelerated to 365x real-time. The platform can be scaled by networking multiple FPGA nodes, which is enabled by an asynchronous spike-based communication protocol.
The emulated monosynaptic spinal loops are capable of producing reflex-like activities in response to muscle stretch. Our results of motor unit recruitment order are compatible with the physiological data collected in real human subjects. There is a question of whether this stochastic system turns out chaotic, especially with accumulated errors from the Backward Euler integrator. Note that the firing property of a neuron population is usually stable even with explicit noise [8], and spindle inputs are measured from real robots so the integrator errors are corrected at every iteration. To our knowledge, the system is not critically sensitive to the initial conditions or integrator errors. This question, however, is both interesting and important for in-depth investigations in the future.

It has been shown [12] that replicating classic types of spinal interneurons (propriospinal, Ia-excitatory, Ia-inhibitory, Renshaw, etc.) is sufficient to produce stabilizing responses and rapid reaching movement in a wrist. Our platform will introduce those interneurons to describe the known spinal circuitry in further detail. Physiological models will also be refined as needed. For the purpose of modeling movement behavior or diseases, the Izhikevich model is a good balance between verisimilitude and computational cost. Nevertheless when testing drug effects along disease progression, neuron models are expected to cover sufficient molecular details including how neurotransmitters affect various ion channels. With the advancing of programmable semiconductor technology, we expect to upgrade our neuron model to the Hodgkin-Huxley model. For the muscle models, Hill's type of model does not fit the muscle properties accurately enough when the muscle is being shortened. Alternative models will be tested. Other studies showed that the functional dexterity of human limbs, especially in the hands, is critically enabled by the tendon configurations and joint geometry [13].
As a result, if our platform is used to understand whether known neurophysiology and biomechanics are sufficient to produce able and pathological movements, it will be necessary to use this platform to control human-like limbs. Since the emulation speed can be flexibly adjusted from arbitrarily slow to 365x real-time, when set to exactly 1x real-time the platform will function as a digital controller with a 1 kHz refresh rate. The main purpose of the emulation is to learn how certain motor disorders progress during childhood development. This first requires the platform to reproduce motor symptoms that are compatible with clinical observations. For example it has been suggested that muscle spasticity in rats is associated with decreased soma size of α-motoneurons [14], which presumably reduced the firing threshold of neurons. Thus when a lower firing threshold is introduced to the emulated motoneuron pool, similar EMG patterns as in [15] should be observed. It is also necessary for the symptoms to evolve with neural plasticity. In the current version we presume that the structure of each component remains time invariant. In future work Spike Timing Dependent Plasticity (STDP) will be introduced such that all components are subject to temporal modifications.

Figure 5: A) Physiological activity emulated by each model when the muscle is sinusoidally stretched. B) Comparing the emulated motor unit recruitment order with real experimental data. Panel titles: A. Multi-scale activities from emulation (traces: Stretch, Spindle Ia, Sensory post-synaptic current, Motoneurons, Muscle Force, EMG). B. Verify motor unit recruitment pattern (Emulation vs. Real Data).

Acknowledgments

The authors thank Dr. Gerald Loeb for helping set up the emulation of spindle models. This project is supported by NIH NINDS grant R01NS069214-02.

References

[1] Izhikevich, E. M. Simple model of spiking neurons. IEEE transactions on neural networks / a publication of the IEEE Neural Networks Council 14, 1569–1572 (2003).
[2] Glowatzki, E. & Fuchs, P. A. Transmitter release at the hair cell ribbon synapse. Nature neuroscience 5, 147–154 (2002).
[3] Shadmehr, R. & Wise, S. P. A Mathematical Muscle Model. In Supplementary documents for "Computational Neurobiology of Reaching and Pointing", 1–18 (MIT Press, Cambridge, MA, 2005).
[4] Fuglevand, A. J., Winter, D. A. & Patla, A. E. Models of recruitment and rate coding organization in motor-unit pools. Journal of neurophysiology 70, 2470–2488 (1993).
[5] Mileusnic, M. P., Brown, I. E., Lan, N. & Loeb, G. E. Mathematical models of proprioceptors. I. Control and transduction in the muscle spindle. Journal of neurophysiology 96, 1772–1788 (2006).
[6] Gelfan, S., Kao, G. & Ruchkin, D. S. The dendritic tree of spinal neurons. The Journal of comparative neurology 139, 385–411 (1970).
[7] Sanger, T. D. Neuro-mechanical control using differential stochastic operators. In Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE, 4494–4497 (2010).
[8] Sanger, T. D. Distributed control of uncertain systems using superpositions of linear operators. Neural computation 23, 1911–1934 (2011).
[9] Lomont, C. Fast inverse square root (2003). URL http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf.
[10] Henneman, E. Relation between size of neurons and their susceptibility to discharge. Science (New York, N.Y.) 126, 1345–1347 (1957).
[11] De Luca, C. J. & Hostage, E. C. Relationship between firing rate and recruitment threshold of motoneurons in voluntary isometric contractions. Journal of neurophysiology 104, 1034–1046 (2010).
[12] Raphael, G., Tsianos, G. A. & Loeb, G. E. Spinal-like regulator facilitates control of a two-degree-of-freedom wrist. The Journal of neuroscience : the official journal of the Society for Neuroscience 30, 9431–9444 (2010).
[13] Valero-Cuevas, F. J. et al. The tendon network of the fingers performs anatomical computation at a macroscopic scale. IEEE transactions on bio-medical engineering 54, 1161–1166 (2007).
[14] Brashear, A. & Elovic, E. Spasticity: Diagnosis and Management (Demos Medical, 2010), 1 edn.
[15] Levin, M. F. & Feldman, A. G. The role of stretch reflex threshold regulation in normal and impaired motor control. Brain research 657, 23–30 (1994).

3 0.68452811 152 nips-2012-Homeostatic plasticity in Bayesian spiking networks as Expectation Maximization with posterior constraints

Author: Stefan Habenschuss, Johannes Bill, Bernhard Nessler

Abstract: Recent spiking network models of Bayesian inference and unsupervised learning frequently assume either inputs to arrive in a special format or employ complex computations in neuronal activation functions and synaptic plasticity rules. Here we show in a rigorous mathematical treatment how homeostatic processes, which have previously received little attention in this context, can overcome common theoretical limitations and facilitate the neural implementation and performance of existing models. In particular, we show that homeostatic plasticity can be understood as the enforcement of a 'balancing' posterior constraint during probabilistic inference and learning with Expectation Maximization. We link homeostatic dynamics to the theory of variational inference, and show that nontrivial terms, which typically appear during probabilistic inference in a large class of models, drop out. We demonstrate the feasibility of our approach in a spiking Winner-Take-All architecture of Bayesian inference and learning. Finally, we sketch how the mathematical framework can be extended to richer recurrent network architectures. Altogether, our theory provides a novel perspective on the interplay of homeostatic processes and synaptic plasticity in cortical microcircuits, and points to an essential role of homeostasis during inference and learning in spiking networks.

4 0.68208408 347 nips-2012-Towards a learning-theoretic analysis of spike-timing dependent plasticity

Author: David Balduzzi, Michel Besserve

Abstract: This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning. We introduce a model, the selectron, that (i) arises as the fast time constant limit of leaky integrate-and-fire neurons equipped with spike-timing dependent plasticity (STDP) and (ii) is amenable to theoretical analysis. We show that the selectron encodes reward estimates into spikes and that an error bound on spikes is controlled by a spiking margin and the sum of synaptic weights. Moreover, the efficacy of spikes (their usefulness to other reward maximizing selectrons) also depends on total synaptic strength. Finally, based on our analysis, we propose a regularized version of STDP, and show the regularization improves the robustness of neuronal learning when faced with multiple stimuli.

5 0.66544515 322 nips-2012-Spiking and saturating dendrites differentially expand single neuron computation capacity

Author: Romain Cazé, Mark Humphries, Boris S. Gutkin

Abstract: The integration of excitatory inputs in dendrites is non-linear: multiple excitatory inputs can produce a local depolarization departing from the arithmetic sum of each input's response taken separately. If this depolarization is bigger than the arithmetic sum, the dendrite is spiking; if the depolarization is smaller, the dendrite is saturating. Decomposing a dendritic tree into independent dendritic spiking units greatly extends its computational capacity, as the neuron then maps onto a two layer neural network, enabling it to compute linearly non-separable Boolean functions (lnBFs). How can these lnBFs be implemented by dendritic architectures in practise? And can saturating dendrites equally expand computational capacity? To address these questions we use a binary neuron model and Boolean algebra. First, we confirm that spiking dendrites enable a neuron to compute lnBFs using an architecture based on the disjunctive normal form (DNF). Second, we prove that saturating dendrites as well as spiking dendrites enable a neuron to compute lnBFs using an architecture based on the conjunctive normal form (CNF). Contrary to a DNF-based architecture, in a CNF-based architecture, dendritic unit tunings do not imply the neuron tuning, as has been observed experimentally. Third, we show that one cannot use a DNF-based architecture with saturating dendrites. Consequently, we show that an important family of lnBFs implemented with a CNF-architecture can require an exponential number of saturating dendritic units, whereas the same family implemented with either a DNF-architecture or a CNF-architecture always require a linear number of spiking dendritic units. This minimization could explain why a neuron spends energetic resources to make its dendrites spike.

6 0.65831184 362 nips-2012-Waveform Driven Plasticity in BiFeO3 Memristive Devices: Model and Implementation

7 0.61731499 24 nips-2012-A mechanistic model of early sensory processing based on subtracting sparse representations

8 0.59962344 190 nips-2012-Learning optimal spike-based representations

9 0.56491584 94 nips-2012-Delay Compensation with Dynamical Synapses

10 0.55906689 195 nips-2012-Learning visual motion in recurrent neural networks

11 0.52635556 239 nips-2012-Neuronal Spike Generation Mechanism as an Oversampling, Noise-shaping A-to-D converter

12 0.51951963 65 nips-2012-Cardinality Restricted Boltzmann Machines

13 0.47796088 93 nips-2012-Deep Spatio-Temporal Architectures and Learning for Protein Structure Prediction

14 0.47283283 158 nips-2012-ImageNet Classification with Deep Convolutional Neural Networks

15 0.45994091 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models

16 0.4446466 229 nips-2012-Multimodal Learning with Deep Boltzmann Machines

17 0.44462004 39 nips-2012-Analog readout for optical reservoir computers

18 0.43205178 256 nips-2012-On the connections between saliency and tracking

19 0.43136674 4 nips-2012-A Better Way to Pretrain Deep Boltzmann Machines

20 0.42870057 91 nips-2012-Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.03), (11, 0.314), (17, 0.042), (21, 0.048), (38, 0.107), (42, 0.039), (54, 0.048), (55, 0.055), (74, 0.029), (76, 0.082), (77, 0.014), (80, 0.06), (92, 0.046)]
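
The listing does not say how the similarity values are derived from these topic weights. As a rough illustration only, the sketch below treats the sparse (topicId, topicWeight) pairs as a vector and compares two such vectors with cosine similarity. The choice of cosine similarity, the function name `cosine_similarity`, and the second vector `other_topics` are assumptions introduced here, not details taken from the page.

```python
import math

# Sparse (topicId, topicWeight) pairs copied from the listing above.
paper_topics = [(0, 0.03), (11, 0.314), (17, 0.042), (21, 0.048), (38, 0.107),
                (42, 0.039), (54, 0.048), (55, 0.055), (74, 0.029), (76, 0.082),
                (77, 0.014), (80, 0.06), (92, 0.046)]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse topic vectors of (id, weight) pairs.

    Assumption: cosine similarity is used here only as a common choice;
    the page does not state which measure it actually applies.
    """
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical topic vector for another paper, invented for illustration.
other_topics = [(11, 0.25), (38, 0.2), (60, 0.1)]

# Topic with the largest weight for this paper.
dominant_topic = max(paper_topics, key=lambda pair: pair[1])[0]

similarity = cosine_similarity(paper_topics, other_topics)
```

Under this measure a vector compared with itself scores exactly 1, whereas the same-paper entry in the list below is scored 0.77958697, so the page evidently uses a different measure or different vectors.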

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77958697 238 nips-2012-Neurally Plausible Reinforcement Learning of Working Memory Tasks

Author: Jaldert Rombouts, Pieter Roelfsema, Sander M. Bohte

Abstract: A key function of brains is undoubtedly the abstraction and maintenance of information from the environment for later use. Neurons in association cortex play an important role in this process: by learning these neurons become tuned to relevant features and represent the information that is required later as a persistent elevation of their activity [1]. It is however not well known how such neurons acquire these task-relevant working memories. Here we introduce a biologically plausible learning scheme grounded in Reinforcement Learning (RL) theory [2] that explains how neurons become selective for relevant information by trial and error learning. The model has memory units which learn useful internal state representations to solve working memory tasks by transforming partially observable Markov decision problems (POMDP) into MDPs. We propose that synaptic plasticity is guided by a combination of attentional feedback signals from the action selection stage to earlier processing levels and a globally released neuromodulatory signal. Feedback signals interact with feedforward signals to form synaptic tags at those connections that are responsible for the stimulus-response mapping. The neuromodulatory signal interacts with tagged synapses to determine the sign and strength of plasticity. The learning scheme is generic because it can train networks in different tasks, simply by varying inputs and rewards. It explains how neurons in association cortex learn to 1) temporarily store task-relevant information in non-linear stimulus-response mapping tasks [1, 3, 4] and 2) learn to optimally integrate probabilistic evidence for perceptual decision making [5, 6].

2 0.67969173 151 nips-2012-High-Order Multi-Task Feature Learning to Identify Longitudinal Phenotypic Markers for Alzheimer's Disease Progression Prediction

Author: Hua Wang, Feiping Nie, Heng Huang, Jingwen Yan, Sungeun Kim, Shannon Risacher, Andrew Saykin, Li Shen

Abstract: Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by progressive impairment of memory and other cognitive functions. Regression analysis has been studied to relate neuroimaging measures to cognitive status. However, whether these measures have further predictive power to infer a trajectory of cognitive performance over time is still an under-explored but important topic in AD research. We propose a novel high-order multi-task learning model to address this issue. The proposed model explores the temporal correlations existing in imaging and cognitive data by structured sparsity-inducing norms. The sparsity of the model enables the selection of a small number of imaging measures while maintaining high prediction accuracy. The empirical studies, using the longitudinal imaging and cognitive data of the ADNI cohort, have yielded promising results.

3 0.67868954 225 nips-2012-Multi-task Vector Field Learning

Author: Binbin Lin, Sen Yang, Chiyuan Zhang, Jieping Ye, Xiaofei He

Abstract: Multi-task learning (MTL) aims to improve generalization performance by learning multiple related tasks simultaneously and identifying the shared information among tasks. Most existing MTL methods focus on learning linear models under the supervised setting. We propose a novel semi-supervised and nonlinear approach for MTL using vector fields. A vector field is a smooth mapping from the manifold to the tangent spaces which can be viewed as a directional derivative of functions on the manifold. We argue that vector fields provide a natural way to exploit the geometric structure of data as well as the shared differential structure of tasks, both of which are crucial for semi-supervised multi-task learning. In this paper, we develop multi-task vector field learning (MTVFL), which learns the predictor functions and the vector fields simultaneously. MTVFL has the following key properties. (1) The vector fields MTVFL learns are close to the gradient fields of the predictor functions. (2) Within each task, the vector field is required to be as parallel as possible, which is expected to span a low dimensional subspace. (3) The vector fields from all tasks share a low dimensional subspace. We formalize our idea in a regularization framework and also provide a convex relaxation method to solve the original non-convex problem. The experimental results on synthetic and real data demonstrate the effectiveness of our proposed approach.

4 0.60971403 305 nips-2012-Selective Labeling via Error Bound Minimization

Author: Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

Abstract: In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study the following problem: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the out-of-sample error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinatorial problem, we relax it into the continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms the state-of-the-art methods.

5 0.60222 327 nips-2012-Structured Learning of Gaussian Graphical Models

Author: Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-in Lee, Maryam Fazel

Abstract: We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions. We assume that most aspects of the networks are shared, but that there are some structured differences between them. Specifically, the network differences are generated from node perturbations: a few nodes are perturbed across networks, and most or all edges stemming from such nodes differ between networks. This corresponds to a simple model for the mechanism underlying many cancers, in which the gene regulatory network is disrupted due to the aberrant activity of a few specific genes. We propose to solve this problem using the perturbed-node joint graphical lasso, a convex optimization problem that is based upon the use of a row-column overlap norm penalty. We then solve the convex problem using an alternating direction method of multipliers algorithm. Our proposal is illustrated on synthetic data and on an application to brain cancer gene expression data.

6 0.48465425 284 nips-2012-Q-MKL: Matrix-induced Regularization in Multi-Kernel Learning with Applications to Neuroimaging

7 0.47829714 113 nips-2012-Efficient and direct estimation of a neural subunit model for sensory coding

8 0.47262815 77 nips-2012-Complex Inference in Neural Circuits with Probabilistic Population Codes and Topic Models

9 0.47248349 333 nips-2012-Synchronization can Control Regularization in Neural Systems via Correlated Noise Processes

10 0.46925494 23 nips-2012-A lattice filter model of the visual pathway

11 0.4648872 83 nips-2012-Controlled Recognition Bounds for Visual Learning and Exploration

12 0.46457097 363 nips-2012-Wavelet based multi-scale shape features on arbitrary surfaces for cortical thickness discrimination

13 0.46430242 190 nips-2012-Learning optimal spike-based representations

14 0.4637793 65 nips-2012-Cardinality Restricted Boltzmann Machines

15 0.46143344 162 nips-2012-Inverse Reinforcement Learning through Structured Classification

16 0.46105543 153 nips-2012-How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

17 0.45911786 302 nips-2012-Scaling MPE Inference for Constrained Continuous Markov Random Fields with Consensus Optimization

18 0.458808 120 nips-2012-Exact and Stable Recovery of Sequences of Signals with Sparse Increments via Differential 1-Minimization

19 0.45879611 38 nips-2012-Algorithms for Learning Markov Field Policies

20 0.45865425 292 nips-2012-Regularized Off-Policy TD-Learning