nips nips2011 nips2011-219 knowledge-graph by maker-knowledge-mining

219 nips-2011-Predicting response time and error rates in visual search


Source: pdf

Author: Bo Chen, Vidhya Navalpakkam, Pietro Perona

Abstract: A model of human visual search is proposed. It predicts both response time (RT) and error rates (ER) as a function of image parameters such as target contrast and clutter. The model is an ideal observer, in that it optimizes the Bayes ratio of target present vs target absent. The ratio is computed on the firing pattern of V1/V2 neurons, modeled by Poisson distributions. The optimal mechanism for integrating information over time is shown to be a ‘soft max’ of diffusions, computed over the visual field by ‘hypercolumns’ of neurons that share the same receptive field and have different response properties to image features. An approximation of the optimal Bayesian observer, based on integrating local decisions, rather than diffusions, is also derived; it is shown experimentally to produce very similar predictions to the optimal observer in common psychophysics conditions. A psychophysics experiment is proposed that may discriminate which mechanism is used in the human brain. Figure 1: Visual search. (A) Clutter and camouflage make visual search difficult. (B,C) Psychologists and neuroscientists build synthetic displays to study visual search. In (B) the target ‘pops out’ (∆θ = 45°), while in (C) the target requires more time to be detected (∆θ = 10°) [1].

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Predicting response time and error rates in visual search. Bo Chen, Caltech, bchen3@caltech.edu. [sent-1, score-0.544]

2 Abstract: A model of human visual search is proposed. [sent-5, score-0.333]

3 It predicts both response time (RT) and error rates (ER) as a function of image parameters such as target contrast and clutter. [sent-6, score-0.528]

4 The model is an ideal observer, in that it optimizes the Bayes ratio of target present vs target absent. [sent-7, score-0.663]

5 The optimal mechanism for integrating information over time is shown to be a ‘soft max’ of diffusions, computed over the visual field by ‘hypercolumns’ of neurons that share the same receptive field and have different response properties to image features. [sent-9, score-0.624]

6 An approximation of the optimal Bayesian observer, based on integrating local decisions, rather than diffusions, is also derived; it is shown experimentally to produce very similar predictions to the optimal observer in common psychophysics conditions. [sent-10, score-0.439]

7 (B,C) Psychologists and neuroscientists build synthetic displays to study visual search. [sent-14, score-0.269]

8 In (B) the target ‘pops out’ (∆θ = 45°), while in (C) the target requires more time to be detected (∆θ = 10°) [1]. [sent-15, score-0.456]

9 Visual search is challenging because the location of the object that one is looking for is not known in advance, and surrounding clutter may generate false alarms. [sent-17, score-0.288]

10 The three ecologically relevant performance parameters of visual search are the two error rates (ER), false alarms (FA) and false rejects (FR), and the response time (RT). [sent-18, score-0.612]

11 The design of a visual system is crucial in obtaining low ER and RT. [sent-19, score-0.224]

12 Psychologists and physiologists have long been interested in understanding the performance and the mechanisms of visual search. [sent-21, score-0.265]

13 Several studies since the 1980s have investigated how RT and ER are affected by the complexity of the stimulus (number of distractors), and by target-distractor discriminability with different visual cues. [sent-25, score-0.46]

14 One early observation is that when the target and distractor features are widely separated in feature space (e.g., a red target among green distractors), the target ‘pops out’, i.e., RT to find the target is independent of the number of items in the display [1]. [sent-26, score-0.397] [sent-28, score-0.456] [sent-32, score-0.437]

17 Decreasing the discriminability between the target and distractor increases error rates, and increases the slope of RT vs. set size. [sent-33, score-0.508]

18 Moreover, it was found that the RT for displays with no target is longer than when the target is present (see review in [6]). [sent-35, score-0.566]

19 Recent studies investigated the shape of RT distributions in visual search [7, 8]. [sent-36, score-0.333]

20 Neurophysiologically plausible models have been recently proposed to predict RTs in visual discrimination tasks [9] and various other 2AFC tasks [10] at a single spatial location in the visual field. [sent-37, score-0.628]

21 They are based on sequential tests of statistical hypotheses (target present vs target absent) [11] computed on the response of stimulus-tuned neurons [2, 3]. [sent-38, score-0.633]

22 We do not yet have satisfactory models for explaining RTs in visual search, which is harder as it involves integrating information across several locations in the visual field, as well as over time. [sent-39, score-0.482]

23 Existing models predicting RT in visual search are either qualitative or, like the drift-diffusion model [13, 14, 15], do not attempt to predict experimental results with new set sizes, target, and distractor settings. [sent-40, score-0.333] [sent-44, score-0.429]
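
As context for the drift-diffusion models cited above, the following minimal Python sketch (all parameter values are illustrative, not taken from the cited papers) simulates a single-location diffusion to two decision bounds and records choice and response time:

    import numpy as np

    def simulate_ddm(drift=0.5, noise=1.0, dt=0.001, upper=1.0, lower=-1.0, rng=None):
        """Simulate one drift-diffusion trial; return (choice, response_time)."""
        rng = rng or np.random.default_rng()
        x, t = 0.0, 0.0
        while lower < x < upper:
            # Euler step: deterministic drift plus Gaussian diffusion noise.
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        return (1 if x >= upper else 0), t

    rng = np.random.default_rng(0)
    choices, rts = zip(*(simulate_ddm(rng=rng) for _ in range(1000)))
    print("P(upper bound):", np.mean(choices), " mean RT (s):", np.mean(rts))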

25 We propose a Bayesian model of visual search that predicts both ER and RT. [sent-45, score-0.377]

26 First, while visual search has been modeled using signal-detection theory to predict ER [16], our model is based on neuron-like mechanisms and predicts both ER and RT. [sent-47, score-0.45]

27 Second, our model is an optimal observer, given a physiologically plausible front-end of the visual system. [sent-48, score-0.314]

28 Third, our model shows that in visual search the optimal computation is not a diffusion, as one might believe by analogy with single-location discrimination models [17, 18]; rather, it is a ‘softmax’ nonlinear combination of locally-computed diffusions. [sent-49, score-0.398]

29 First, we assume that stimulus items are centered on cortical hypercolumns [19] and that, at locations where there is no item, neuronal firing is negligible. [sent-52, score-0.388]

30 Second, retinal and cortical magnification [19] are ignored, since psychophysicists have developed displays that sidestep this issue (by placing items on a constant-eccentricity ring as shown in Fig 1). [sent-53, score-0.308]

31 Third, we do not account for overt and covert attentional shifts. [sent-54, score-0.248]

32 Overt attentional shifts are manifested by saccades (eye motions), which happen every 200ms or so. [sent-55, score-0.138]

33 Since the post-decision motor response to a stimulus by pressing a button takes about 250-300ms, one does not need to worry about eye motions when response times are shorter than 500ms. [sent-56, score-0.611]

34 For longer RTs, one may enforce eye fixation at the center of the display so as to prevent overt attentional shifts. [sent-57, score-0.377]

35 Furthermore, our model explains serial search without the need to invoke covert attentional shifts [20] which are difficult to prove neurophysiologically. [sent-58, score-0.368]

36 (Section 2: Target discrimination at a single location with Poisson neurons.) We first consider probabilistic reasoning at one location, where two possible stimuli may appear. [sent-59, score-0.395]

37 We will call them distractor (D) and target (T), also labeled C = 1 and C = 2 (call c ∈ {1, 2} the generic value of C). [sent-63, score-0.397]

38 Based on the response of N neurons (a hypercolumn) we will decide whether the stimulus was a target or a distractor. [sent-64, score-0.806]

39 Given the evidence T (defined further below in terms of the neurons’ activity) we wish to decide whether the stimulus was of type 1 or 2. [sent-68, score-0.212]

40 We may do so when the probability P (C = 1|T ) of the stimulus being of type 1 given the observations in T exceeds a given threshold T1 (T1 = 0. [sent-69, score-0.244]
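
To make the decision rule just described concrete, here is a minimal Python sketch (the rates, counts, and the 0.99 threshold are illustrative assumptions) computing P(C = 1 | spike counts) for a hypercolumn of independent Poisson neurons:

    import numpy as np

    def posterior_distractor(counts, rates_d, rates_t, window=1.0, prior_d=0.5):
        """P(C=1 | counts) for independent Poisson neurons observed for `window`
        seconds; rates_d / rates_t are expected rates (spikes/s) under
        distractor / target."""
        counts = np.asarray(counts, float)
        rates_d = np.asarray(rates_d, float)
        rates_t = np.asarray(rates_t, float)
        # Log-likelihoods; the common log(n!) term cancels in the ratio.
        ll_d = np.sum(counts * np.log(rates_d * window) - rates_d * window)
        ll_t = np.sum(counts * np.log(rates_t * window) - rates_t * window)
        log_odds = np.log(prior_d / (1 - prior_d)) + ll_d - ll_t
        return 1.0 / (1.0 + np.exp(-log_odds))

    p1 = posterior_distractor(counts=[2, 3, 9, 5],
                              rates_d=[10.0, 4.0, 1.0, 4.0],   # hypothetical tuning
                              rates_t=[1.0, 4.0, 10.0, 4.0])
    print("P(distractor | counts) =", p1)  # decide 'distractor' if above, e.g., 0.99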

41 [Figure 2 residue; only panel labels are recoverable: neurons’ tuning curves (mean spiking rate per second vs. stimulus orientation in degrees), expected Poisson firing rate vs. neuron’s preferred orientation (degrees), diffusion jump per spike (spikes/s) vs. neuron’s preferred orientation (degrees), and decision diagrams with thresholds T0, T1.]

44 Figure 2: (Left three panels) Model of a hypercolumn in V1/V2 cortex composed of four orientation-tuned neurons (our simulations use 32). [sent-103, score-0.316]

45 The left panel shows the neurons’ tuning curve λ(θ) representing the expected Poisson firing rate when the stimulus has orientation θ. [sent-104, score-0.4]

46 The middle plot shows the expected firing rate of the population of neurons for two stimuli whose orientation is indicated with a red (distractor) and green (target) vertical line. [sent-105, score-0.407]

47 The third plot shows the step-change in the value of the diffusion when an action potential is registered from a given neuron. [sent-106, score-0.259]

48 The action potentials of a hypercolumn of neurons (top) are integrated in time to produce a diffusion. [sent-109, score-0.505]

49 When the diffusion reaches either an upper bound T1 or a lower bound T0 the decision is taken that either the target is present (1) or the target is absent (0). [sent-110, score-0.803]

50 (B) While not a diffusion, it may be seen as a ‘soft maximum’ combination of local diffusions: the local diffusions are first exponentiated, then averaged; the log of the result is compared to two thresholds to reach a decision. [sent-112, score-0.332]

51 (C) The ‘Max approximation’ is a simplified approximation of the ideal observer, where the maximum of local diffusions replaces a soft-maximum. [sent-113, score-0.354]
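
A minimal sketch of the two combination rules described in panels B and C, assuming the local diffusions log Rl are already available (Python; the values are made up):

    import numpy as np

    def softmax_diffusion(log_R_local):
        """Ideal-observer combination: log of the average of the exponentiated
        local diffusions, i.e. a 'soft maximum' over locations."""
        m = np.max(log_R_local)                 # subtract the max for stability
        return m + np.log(np.mean(np.exp(log_R_local - m)))

    def max_diffusion(log_R_local):
        """Max approximation: simply the largest local diffusion."""
        return np.max(log_R_local)

    log_R = np.array([-0.3, 0.1, 2.4, -1.0])    # hypothetical local diffusions
    print(softmax_diffusion(log_R), max_diffusion(log_R))
    # When one location dominates, the soft max approaches the max minus log M,
    # which is why the Max approximation needs adjusted thresholds (cf. Eqn. 13).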

52 We will model the firing rate of the neurons with a Poisson pdf: the number n of action potentials observed during one second is distributed as P(n|λ) = λ^n e^(−λ)/n!. [sent-120, score-0.421]

53 The constant λ is the expectation of the number of action potentials per second. [sent-122, score-0.232]

54 Each neuron i ∈ {1, . . . , N} is tuned to a different orientation θi; for the sake of simplicity we will assume that the width of the tuning curve is the same for all neurons. [sent-126, score-0.252]

55 That is, each neuron i responds to stimulus c with expectation λi(c) = f(|θ(c) − θi|) (in spikes per second), determined by the distance between the neuron’s preferred orientation θi and the stimulus orientation θ(c). [sent-128, score-0.866]
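
The excerpt does not pin down a parametric form for f; the Gaussian bump below (Python), with the max/min rates and π/8 width quoted later among the default parameters, is an assumption for illustration:

    import numpy as np

    def tuning_rate(theta_stim, theta_pref, lam_min=1.0, lam_max=10.0, width=np.pi / 8):
        """Expected Poisson rate lambda_i = f(|theta_stim - theta_pref|):
        a Gaussian bump between lam_min and lam_max (illustrative form)."""
        d = np.abs(theta_stim - theta_pref)
        d = np.minimum(d, np.pi - d)            # orientation is periodic with period pi
        return lam_min + (lam_max - lam_min) * np.exp(-d ** 2 / (2 * width ** 2))

    prefs = np.linspace(0, np.pi, 32, endpoint=False)   # 32 preferred orientations
    print(tuning_rate(np.pi / 2, prefs).round(2))       # population response to a 90° bar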

56 Let Ti = {t_k^i} be the set of action potentials from neuron i produced starting at t = 0 and until the end of the observation period t = T. [sent-129, score-0.333]

57 Indicate with T = {tk} = ∪i Ti the complete set of action potentials from all neurons (where the tk are sorted). [sent-130, score-0.511]

58 We will indicate with i(k) the index of the neuron that fired the action potential at time tk. [sent-131, score-0.364]

59 Call Ik = (tk, tk+1) the intervals of time between action potentials, where I0 = (0, t1). [sent-132, score-0.149]

60 The signal coming from the neurons is thus a concatenation of ‘spikes’ and ‘intervals’, and the interval (0, T ) may be viewed as the union of instants tk and open intervals (tk , tk+1 ). [sent-136, score-0.377]

61 (0, T) = I0 ∪ {t1} ∪ I1 ∪ {t2} ∪ · · ·. Since the spike trains Ti and T are Poisson processes, once we condition on the class of the stimulus, the spike times are independent. [sent-139, score-0.27]
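
For two Poisson hypotheses, the log likelihood ratio over (0, T) is a diffusion with a standard closed form: it jumps by log(λi(2)/λi(1)) whenever neuron i fires and drifts at rate −Σi(λi(2) − λi(1)) between spikes. A Python sketch (spike times and rates are illustrative, not the authors' code):

    import numpy as np

    def llr_diffusion(spike_times, spike_neuron, rates1, rates2, T, n_grid=1000):
        """log P(spikes | C=2) / P(spikes | C=1) for labeled Poisson spike trains,
        evaluated on a time grid: jumps at spikes, linear drift in between."""
        rates1, rates2 = np.asarray(rates1, float), np.asarray(rates2, float)
        drift = -(rates2.sum() - rates1.sum())   # drift per second between spikes
        jumps = np.log(rates2 / rates1)          # per-neuron jump at each spike
        t_grid = np.linspace(0.0, T, n_grid)
        llr = drift * t_grid
        for t_k, i_k in zip(spike_times, spike_neuron):
            llr[t_grid >= t_k] += jumps[i_k]
        return t_grid, llr

    rates1 = [10.0, 4.0, 1.0]                    # rates under C=1 (distractor)
    rates2 = [1.0, 4.0, 10.0]                    # rates under C=2 (target)
    t, llr = llr_diffusion([0.05, 0.12, 0.30], [2, 2, 0], rates1, rates2, T=0.5)
    print("final log LR:", llr[-1])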

63 Also, they are longer when the target is absent (see Fig. [sent-167, score-0.381]

64 Notice that the response times have a Gaussian-like distribution when time is plotted on a log scale, and the width of the distribution does not change significantly as the difficulty of the task changes; thus, the mean and median response time are equivalently informative statistics of RT. [sent-169, score-0.457]

65 (B) Mean RT as a function of the number M of items for different values of target contrast; the curves appear linear as a function of log M [21]. [sent-170, score-0.431]

66 Notice that the RT slope is almost zero (‘parallel search’) when the target has high contrast, whereas when target contrast is low, RT increases significantly with M (‘serial search’) [1]. [sent-171, score-0.552]

67 The response times observed using the Max approximation are almost identical to those obtained with the ideal observer. [sent-172, score-0.34]

68 The ideal Bayesian observer (blue) and the Max approximation (cyan) are almost identical, indicating that the Max approximation’s performance is almost as good as that of the optimal observer. [sent-176, score-0.441]

69 From Eqns. 5 and 6 we know that the difference in diffusion value between the target location and the distractor location grows linearly in time. [sent-178, score-0.728]

70 From Eqn. 11: log Rtot ≈ log Rl* − log M if Rl ≪ Rl* (13). On the other hand, when |T| is small, we resort to another approximation (see supplementary material for derivation): log Rtot ≈ log Rl* − µM b1 √t + (b2 t)/2 − (1/2) log((exp(b2 t) + M − 1)/M) if Rl ≈ Rl* (14), where µM ≡ M ∫_{−∞}^{∞} z Φ^(M−1)(z) N(z) dz, and N(z) and Φ(z) denote the pdf and cdf of the normal distribution. [sent-184, score-0.31]

71 Since the max diffusion does not represent the global log likelihood ratio, its thresholds cannot be computed directly from the error rates. [sent-185, score-0.4]
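
The constant µM defined above is the expected value of the maximum of M independent standard normal variables, which can be checked numerically (Python with scipy; the quadrature below is an illustrative check, not part of the model):

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    def mu_M(M):
        """mu_M = M * integral of z * Phi(z)^(M-1) * N(z) dz over the real line,
        i.e. the expected maximum of M iid standard normal variables."""
        integrand = lambda z: M * z * norm.cdf(z) ** (M - 1) * norm.pdf(z)
        value, _ = quad(integrand, -np.inf, np.inf)
        return value

    for M in (1, 3, 10, 30):
        print(M, round(mu_M(M), 3))   # 0.0, ~0.846, ~1.539, ~2.043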

72 Nonetheless we can first compute analytically the thresholds for the Bayesian observer (Eqn. [sent-186, score-0.424]

73 Finally, we threshold the maximum local diffusion log Rl with respect to the adjusted upper and lower thresholds to make our decision. [sent-189, score-0.392]

74 In this experiment we explore the model’s prediction of response time over a series of interesting conditions. [sent-192, score-0.17]

75 We use N = 32 neurons per location, a tuning width of π/8 per neuron, a maximum expected firing rate of λ = 10 action potentials per second and a minimum of λ = 1 a.p./s (these set the signal-to-noise ratio of the neuron’s tuning curves), M = 10 items (locations) in the display, and a stimulus contrast of ∆θ = π/6. [sent-196, score-0.935] [sent-198, score-0.537]

77 We will focus on how predictions change when the display parameters are changed over a set of discrete settings: M ∈ {3, 10, 30} and ∆θ ∈ {π/18, π/6, π/2}. [sent-200, score-0.124]

78 For each setting of the parameters, we simulate the Bayesian and the Max model for 1000 runs. [sent-201, score-0.117]
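
A runnable sketch of one such simulation run (Python; an assumed reconstruction from the description above, not the authors' code: the thresholds, rates, trial count, and thinning step are illustrative):

    import numpy as np

    def run_trial(combine, rates_d, rates_t, target_loc=None, M=10, dt=0.001,
                  upper=3.0, lower=-3.0, t_max=5.0, rng=None):
        """One sequential trial: M locations, each a hypercolumn of Poisson
        neurons. Local diffusions log R_l are combined ('softmax' for the ideal
        observer, 'max' for the approximation) and compared to two thresholds.
        Returns (decision, response_time)."""
        rng = rng or np.random.default_rng()
        rates_d, rates_t = np.asarray(rates_d, float), np.asarray(rates_t, float)
        jumps = np.log(rates_t / rates_d)            # LLR jump per spike, per neuron
        drift = -(rates_t.sum() - rates_d.sum())     # LLR drift between spikes (per s)
        true_rates = np.tile(rates_d, (M, 1))
        if target_loc is not None:
            true_rates[target_loc] = rates_t         # target-present display
        llr = np.zeros(M)
        t = 0.0
        while t < t_max:
            spikes = rng.random(true_rates.shape) < true_rates * dt   # thinning step
            llr += (jumps * spikes).sum(axis=1) + drift * dt
            t += dt
            m = llr.max()
            value = m + np.log(np.mean(np.exp(llr - m))) if combine == "softmax" else m
            if value >= upper:
                return 1, t                          # report 'target present'
            if value <= lower:
                return 0, t                          # report 'target absent'
        return 0, t_max                              # timeout (an added assumption)

    rng = np.random.default_rng(1)
    trials = [run_trial("softmax", [10.0, 4.0, 1.0], [1.0, 4.0, 10.0],
                        target_loc=0, rng=rng) for _ in range(200)]
    decisions, rts = zip(*trials)
    print("hit rate:", np.mean(decisions), " mean RT (s):", np.mean(rts))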

79 For each η we search for the best pair of upper and lower thresholds that achieves FNR ≈ FPR ≈ η. [sent-204, score-0.204]

80 We search within [−3.5, 0] for the optimal lower threshold (and up to an upper threshold of 3.5). [sent-207, score-0.136]

81 The search is conducted exhaustively over an [80 × 80] discretization of the joint space of the thresholds. [sent-210, score-0.109]
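
A sketch of the exhaustive grid search (Python), assuming a function simulate_model(upper, lower) that returns empirical (FNR, FPR) for a threshold pair; the toy stand-in below is hypothetical, not the paper's model:

    import numpy as np

    def grid_search_thresholds(simulate_model, eta, n_grid=80):
        """Scan an n_grid x n_grid discretization of (upper, lower) threshold
        pairs and return the pair whose error rates best match FNR ~ FPR ~ eta."""
        uppers = np.linspace(0.05, 3.5, n_grid)   # assumed search ranges
        lowers = np.linspace(-3.5, -0.05, n_grid)
        best, best_cost = None, np.inf
        for u in uppers:
            for l in lowers:
                fnr, fpr = simulate_model(u, l)
                cost = (fnr - eta) ** 2 + (fpr - eta) ** 2
                if cost < best_cost:
                    best, best_cost = (u, l), cost
        return best

    # Hypothetical stand-in: error rates of a symmetric diffusion with bounds (u, l).
    toy = lambda u, l: (np.exp(-2 * u), np.exp(2 * l))
    print(grid_search_thresholds(toy, eta=0.05))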

82 We record the response time distributions for all parameter settings and for all values of η (Fig. [sent-211, score-0.17]

83 For a Bayesian observer, the thresholds yielding a given error rate can be computed exactly, independent of the display (Eqn. [sent-215, score-0.224]
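
For reference, Wald's classic sequential-test approximation relates thresholds on the log likelihood ratio directly to the two error rates (the textbook SPRT result, offered here as background; it is not necessarily the exact equation cited above):

    import numpy as np

    def wald_thresholds(fnr, fpr):
        """Wald's SPRT approximation: log-likelihood-ratio thresholds that yield
        roughly the requested false-negative and false-positive rates."""
        upper = np.log((1 - fnr) / fpr)
        lower = np.log(fnr / (1 - fpr))
        return upper, lower

    print(wald_thresholds(0.05, 0.05))   # approximately (2.94, -2.94)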

84 On the contrary, in order for the Max model to achieve equivalent performance, its threshold must be adjusted depending on the number of items M and the target contrast ∆θ (Eqn. [sent-217, score-0.5]

85 As a result, if a constant threshold is used for all conditions, we would expect the Bayesian observer ER to be roughly constant, whereas the Max model would have considerable ER variability. [sent-219, score-0.397]

86 The threshold is set as the optimal threshold that produces 5% error for the Bayesian observer at a single location (M = 1) with ∆θ = π/18. [sent-222, score-0.548]

87 (Section 6: Discussion and conclusions.) We presented a Bayesian ideal observer model of visual search. [sent-223, score-0.678]

88 Neurons are modeled as Poisson units and the model has only four free parameters: the number of neurons per hypercolumn, the tuning width of their response curve, the maximum and the minimum firing rate of each neuron. [sent-225, score-0.573]

89 The model predicts qualitatively the main phenomena observed in visual search: serial vs. parallel search [1], the Gaussian-like shape of the response time histograms in log time [7], and the faster response times when the target is present [3]. [sent-226, score-0.336] [sent-227, score-0.73]

91 Unlike the case of binary detection/decision, the ideal observer may not be implemented by a diffusion. [sent-229, score-0.454]

92 However, it may be implemented using a precisely defined ‘soft-max’ combination of diffusions, each one of which is computed at a different location across the visual field. [sent-230, score-0.307]

93 We discuss an approximation of the ideal observer, the Max model, which has two natural and simple implementations in neural hardware. [sent-231, score-0.17]

94 The Max model is found experimentally to have a performance that is very close to that of the ideal observer when the task parameters do not change. [sent-232, score-0.454]

95 We explored whether any combination of target contrast and number of distractors would produce significantly different predictions for the ideal observer vs. the Max model approximation, and found none when the visual system can estimate decision thresholds in advance. [sent-233, score-1.36]

96 However, our simulations predict different error rates when interleaving images containing diverse contrast levels and distractor numbers. [sent-234, score-0.287]

97 What are the shapes of response time distributions in visual search? [sent-292, score-0.394]

98 Just say no: How are visual searches terminated when there is no target present? [sent-315, score-0.452]

99 The diffusion decision model: theory and data for two-choice decision tasks. [sent-325, score-0.291]

100 Uncertainty explains many aspects of visual contrast detection and discrimination. [sent-330, score-0.269]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('observer', 0.329), ('target', 0.228), ('visual', 0.224), ('rt', 0.213), ('neurons', 0.196), ('diffusions', 0.184), ('rl', 0.179), ('stimulus', 0.176), ('response', 0.17), ('distractor', 0.169), ('diffusion', 0.165), ('neuron', 0.144), ('distractors', 0.136), ('tk', 0.126), ('ideal', 0.125), ('orientation', 0.124), ('hypercolumn', 0.12), ('absent', 0.119), ('items', 0.116), ('ring', 0.111), ('search', 0.109), ('attentional', 0.105), ('potentials', 0.095), ('thresholds', 0.095), ('action', 0.094), ('display', 0.093), ('overt', 0.09), ('physiologically', 0.09), ('rtot', 0.09), ('rts', 0.09), ('poisson', 0.088), ('location', 0.083), ('er', 0.07), ('dt', 0.069), ('threshold', 0.068), ('serial', 0.068), ('psychology', 0.068), ('bayesian', 0.067), ('discrimination', 0.065), ('width', 0.064), ('tuning', 0.064), ('decision', 0.063), ('clutter', 0.062), ('discriminability', 0.06), ('hypercolumns', 0.06), ('navalpakkam', 0.06), ('setsize', 0.06), ('vidhya', 0.06), ('eye', 0.055), ('psychological', 0.055), ('intervals', 0.055), ('bayes', 0.054), ('log', 0.053), ('covert', 0.053), ('pops', 0.053), ('slope', 0.051), ('decisions', 0.051), ('stimuli', 0.051), ('max', 0.05), ('palmer', 0.048), ('shadlen', 0.048), ('spike', 0.047), ('jump', 0.046), ('ti', 0.046), ('fpr', 0.045), ('displays', 0.045), ('contrast', 0.045), ('approximation', 0.045), ('predicts', 0.044), ('per', 0.043), ('ratio', 0.043), ('spikes', 0.043), ('roger', 0.041), ('caltech', 0.041), ('rates', 0.041), ('mechanisms', 0.041), ('motions', 0.04), ('vs', 0.039), ('adjusted', 0.038), ('psychologists', 0.038), ('neurosci', 0.037), ('pdf', 0.037), ('cortical', 0.036), ('decide', 0.036), ('rate', 0.036), ('preferred', 0.036), ('ms', 0.035), ('discriminate', 0.035), ('integrating', 0.034), ('longer', 0.034), ('curves', 0.034), ('false', 0.034), ('shifts', 0.033), ('perona', 0.033), ('degrees', 0.033), ('predict', 0.032), ('inverted', 0.032), ('perceptual', 0.032), ('predictions', 0.031), ('review', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000006 219 nips-2011-Predicting response time and error rates in visual search

Author: Bo Chen, Vidhya Navalpakkam, Pietro Perona

Abstract: A model of human visual search is proposed. It predicts both response time (RT) and error rates (ER) as a function of image parameters such as target contrast and clutter. The model is an ideal observer, in that it optimizes the Bayes ratio of target present vs target absent. The ratio is computed on the firing pattern of V1/V2 neurons, modeled by Poisson distributions. The optimal mechanism for integrating information over time is shown to be a ‘soft max’ of diffusions, computed over the visual field by ‘hypercolumns’ of neurons that share the same receptive field and have different response properties to image features. An approximation of the optimal Bayesian observer, based on integrating local decisions, rather than diffusions, is also derived; it is shown experimentally to produce very similar predictions to the optimal observer in common psychophysics conditions. A psychophysics experiment is proposed that may discriminate which mechanism is used in the human brain. Figure 1: Visual search. (A) Clutter and camouflage make visual search difficult. (B,C) Psychologists and neuroscientists build synthetic displays to study visual search. In (B) the target ‘pops out’ (∆θ = 45°), while in (C) the target requires more time to be detected (∆θ = 10°) [1].

2 0.23533112 224 nips-2011-Probabilistic Modeling of Dependencies Among Visual Short-Term Memory Representations

Author: Emin Orhan, Robert A. Jacobs

Abstract: Extensive evidence suggests that items are not encoded independently in visual short-term memory (VSTM). However, previous research has not quantitatively considered how the encoding of an item influences the encoding of other items. Here, we model the dependencies among VSTM representations using a multivariate Gaussian distribution with a stimulus-dependent mean and covariance matrix. We report the results of an experiment designed to determine the specific form of the stimulus-dependence of the mean and the covariance matrix. We find that the magnitude of the covariance between the representations of two items is a monotonically decreasing function of the difference between the items’ feature values, similar to a Gaussian process with a distance-dependent, stationary kernel function. We further show that this type of covariance function can be explained as a natural consequence of encoding multiple stimuli in a population of neurons with correlated responses.

3 0.18433948 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

Author: Kamiar R. Rad, Liam Paninski

Abstract: Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design.

4 0.17244551 302 nips-2011-Variational Learning for Recurrent Spiking Networks

Author: Danilo J. Rezende, Daan Wierstra, Wulfram Gerstner

Abstract: We derive a plausible learning rule for feedforward, feedback and lateral connections in a recurrent network of spiking neurons. Operating in the context of a generative model for distributions of spike sequences, the learning mechanism is derived from variational inference principles. The synaptic plasticity rules found are interesting in that they are strongly reminiscent of experimental Spike Time Dependent Plasticity, and in that they differ for excitatory and inhibitory neurons. A simulation confirms the method’s applicability to learning both stationary and temporal spike patterns.

5 0.16881472 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

Author: Yan Karklin, Eero P. Simoncelli

Abstract: Efficient coding provides a powerful principle for explaining early sensory coding. Most attempts to test this principle have been limited to linear, noiseless models, and when applied to natural images, have yielded oriented filters consistent with responses in primary visual cortex. Here we show that an efficient coding model that incorporates biologically realistic ingredients – input and output noise, nonlinear response functions, and a metabolic cost on the firing rate – predicts receptive fields and response nonlinearities similar to those observed in the retina. Specifically, we develop numerical methods for simultaneously learning the linear filters and response nonlinearities of a population of model neurons, so as to maximize information transmission subject to metabolic costs. When applied to an ensemble of natural images, the method yields filters that are center-surround and nonlinearities that are rectifying. The filters are organized into two populations, with On- and Off-centers, which independently tile the visual space. As observed in the primate retina, the Off-center neurons are more numerous and have filters with smaller spatial extent. In the absence of noise, our method reduces to a generalized version of independent components analysis, with an adapted nonlinear “contrast” function; in this case, the optimal filters are localized and oriented.

6 0.1633033 249 nips-2011-Sequence learning with hidden units in spiking neural networks

7 0.13559282 88 nips-2011-Environmental statistics and the trade-off between model-based and TD learning in humans

8 0.13297445 37 nips-2011-Analytical Results for the Error in Filtering of Gaussian Processes

9 0.13105035 24 nips-2011-Active learning of neural response functions with Gaussian processes

10 0.11430619 23 nips-2011-Active dendrites: adaptation to spike-based communication

11 0.11096744 298 nips-2011-Unsupervised learning models of primary cortical receptive fields and receptive field plasticity

12 0.10733129 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data

13 0.10575525 34 nips-2011-An Unsupervised Decontamination Procedure For Improving The Reliability Of Human Judgments

14 0.10425069 2 nips-2011-A Brain-Machine Interface Operating with a Real-Time Spiking Neural Network Control Algorithm

15 0.10290564 86 nips-2011-Empirical models of spiking in neural populations

16 0.10253657 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

17 0.099810138 200 nips-2011-On the Analysis of Multi-Channel Neural Spike Data

18 0.099202961 304 nips-2011-Why The Brain Separates Face Recognition From Object Recognition

19 0.097167842 44 nips-2011-Bayesian Spike-Triggered Covariance Analysis

20 0.095725872 75 nips-2011-Dynamical segmentation of single trials from population neural data


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.229), (1, 0.092), (2, 0.276), (3, 0.072), (4, 0.1), (5, 0.059), (6, -0.071), (7, -0.054), (8, -0.045), (9, 0.023), (10, -0.0), (11, 0.076), (12, 0.008), (13, 0.057), (14, 0.158), (15, 0.037), (16, 0.019), (17, -0.02), (18, 0.103), (19, 0.015), (20, -0.129), (21, 0.059), (22, -0.042), (23, 0.007), (24, -0.052), (25, -0.004), (26, -0.028), (27, -0.023), (28, 0.034), (29, 0.013), (30, -0.1), (31, 0.048), (32, -0.101), (33, -0.079), (34, 0.035), (35, -0.053), (36, 0.001), (37, 0.027), (38, -0.153), (39, 0.082), (40, 0.047), (41, -0.009), (42, -0.047), (43, 0.037), (44, 0.077), (45, -0.093), (46, -0.072), (47, 0.043), (48, 0.004), (49, 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96961218 219 nips-2011-Predicting response time and error rates in visual search

Author: Bo Chen, Vidhya Navalpakkam, Pietro Perona

Abstract: A model of human visual search is proposed. It predicts both response time (RT) and error rates (ER) as a function of image parameters such as target contrast and clutter. The model is an ideal observer, in that it optimizes the Bayes ratio of target present vs target absent. The ratio is computed on the firing pattern of V1/V2 neurons, modeled by Poisson distributions. The optimal mechanism for integrating information over time is shown to be a ‘soft max’ of diffusions, computed over the visual field by ‘hypercolumns’ of neurons that share the same receptive field and have different response properties to image features. An approximation of the optimal Bayesian observer, based on integrating local decisions, rather than diffusions, is also derived; it is shown experimentally to produce very similar predictions to the optimal observer in common psychophysics conditions. A psychophysics experiment is proposed that may discriminate which mechanism is used in the human brain. Figure 1: Visual search. (A) Clutter and camouflage make visual search difficult. (B,C) Psychologists and neuroscientists build synthetic displays to study visual search. In (B) the target ‘pops out’ (∆θ = 45°), while in (C) the target requires more time to be detected (∆θ = 10°) [1].

2 0.81263441 224 nips-2011-Probabilistic Modeling of Dependencies Among Visual Short-Term Memory Representations

Author: Emin Orhan, Robert A. Jacobs

Abstract: Extensive evidence suggests that items are not encoded independently in visual short-term memory (VSTM). However, previous research has not quantitatively considered how the encoding of an item influences the encoding of other items. Here, we model the dependencies among VSTM representations using a multivariate Gaussian distribution with a stimulus-dependent mean and covariance matrix. We report the results of an experiment designed to determine the specific form of the stimulus-dependence of the mean and the covariance matrix. We find that the magnitude of the covariance between the representations of two items is a monotonically decreasing function of the difference between the items’ feature values, similar to a Gaussian process with a distance-dependent, stationary kernel function. We further show that this type of covariance function can be explained as a natural consequence of encoding multiple stimuli in a population of neurons with correlated responses.

3 0.70766824 85 nips-2011-Emergence of Multiplication in a Biophysical Model of a Wide-Field Visual Neuron for Computing Object Approaches: Dynamics, Peaks, & Fits

Author: Matthias S. Keil

Abstract: Many species show avoidance reactions in response to looming object approaches. In locusts, the corresponding escape behavior correlates with the activity of the lobula giant movement detector (LGMD) neuron. During an object approach, its firing rate was reported to gradually increase until a peak is reached, and then it declines quickly. The η-function predicts that the LGMD activity is a product ˙ between an exponential function of angular size exp(−Θ) and angular velocity Θ, and that peak activity is reached before time-to-contact (ttc). The η-function has become the prevailing LGMD model because it reproduces many experimental observations, and even experimental evidence for the multiplicative operation was reported. Several inconsistencies remain unresolved, though. Here we address ˙ these issues with a new model (ψ-model), which explicitly connects Θ and Θ to biophysical quantities. The ψ-model avoids biophysical problems associated with implementing exp(·), implements the multiplicative operation of η via divisive inhibition, and explains why activity peaks could occur after ttc. It consistently predicts response features of the LGMD, and provides excellent fits to published experimental data, with goodness of fit measures comparable to corresponding fits with the η-function. 1 Introduction: τ and η Collision sensitive neurons were reported in species such different as monkeys [5, 4], pigeons [36, 34], frogs [16, 20], and insects [33, 26, 27, 10, 38]. This indicates a high ecological relevance, and raises the question about how neurons compute a signal that eventually triggers corresponding movement patterns (e.g. escape behavior or interceptive actions). Here, we will focus on visual stimulation. Consider, for simplicity, a circular object (diameter 2l), which approaches the eye at a collision course with constant velocity v. If we do not have any a priori knowledge about the object in question (e.g. its typical size or speed), then we will be able to access only two information sources. These information sources can be measured at the retina and are called optical variables (OVs). The first is the visual angle Θ, which can be derived from the number of stimulated photore˙ ˙ ceptors (spatial contrast). The second is its rate of change dΘ(t)/dt ≡ Θ(t). Angular velocity Θ is related to temporal contrast. ˙ How should we combine Θ and Θ in order to track an imminent collision? The perhaps simplest ˙ combination is τ (t) ≡ Θ(t)/Θ(t) [13, 18]. If the object hit us at time tc , then τ (t) ≈ tc − t will ∗ Also: www.ir3c.ub.edu, Research Institute for Brain, Cognition, and Behaviour (IR3C) Edifici de Ponent, Campus Mundet, Universitat de Barcelona, Passeig Vall d’Hebron, 171. E-08035 Barcelona 1 give us a running estimation of the time that is left until contact1 . Moreover, we do not need to know anything about the approaching object: The ttc estimation computed by τ is practically independent of object size and velocity. Neurons with τ -like responses were indeed identified in the nucleus retundus of the pigeon brain [34]. In humans, only fast interceptive actions seem to rely exclusively on τ [37, 35]. Accurate ttc estimation, however, seems to involve further mechanisms (rate of disparity change [31]). ˙ Another function of OVs with biological relevance is η ≡ Θ exp(−αΘ), with α = const. [10]. While η-type neurons were found again in pigeons [34] and bullfrogs [20], most data were gathered from the LGMD2 in locusts (e.g. [10, 9, 7, 23]). 
The η-function is a phenomenological model for the LGMD, and implies three principal hypothesis: (i) An implementation of an exponential function exp(·). Exponentation is thought to take place in the LGMD axon, via active membrane conductances [8]. Experimental data, though, seem to favor a third-power law rather than exp(·). (ii) The LGMD carries out biophysical computations for implementing the multiplicative operation. It has been suggested that multiplication is done within the LGMD itself, by subtracting the loga˙ rithmically encoded variables log Θ − αΘ [10, 8]. (iii) The peak of the η-function occurs before ˆ ttc, at visual angle Θ(t) = 2 arctan(1/α) [9]. It follows ttc for certain stimulus configurations (e.g. ˆ l/|v| 5ms). In principle, t > tc can be accounted for by η(t + δ) with a fixed delay δ < 0 (e.g. −27ms). But other researchers observed that LGMD activity continuous to rise after ttc even for l/|v| 5ms [28]. These discrepancies remain unexplained so far [29], but stimulation dynamics perhaps plays a role. We we will address these three issues by comparing the novel function “ψ” with the η-function. LGMD computations with the ψ-function: No multiplication, no exponentiation 2 A circular object which starts its approach at distance x0 and with speed v projects a visual angle Θ(t) = 2 arctan[l/(x0 − vt)] on the retina [34, 9]. The kinematics is hence entirely specified by the ˙ half-size-to-velocity ratio l/|v|, and x0 . Furthermore, Θ(t) = 2lv/((x0 − vt)2 + l2 ). In order to define ψ, we consider at first the LGMD neuron as an RC-circuit with membrane potential3 V [17] dV Cm = β (Vrest − V ) + gexc (Vexc − V ) + ginh (Vinh − V ) (1) dt 4 Cm = membrane capacity ; β ≡ 1/Rm denotes leakage conductance across the cell membrane (Rm : membrane resistance); gexc and ginh are excitatory and inhibitory inputs. Each conductance gi (i = exc, inh ) can drive the membrane potential to its associated reversal potential Vi (usually Vinh ≤ Vexc ). Shunting inhibition means Vinh = Vrest . Shunting inhibition lurks “silently” because it gets effective only if the neuron is driven away from its resting potential. With synaptic input, the neuron decays into its equilibrium state Vrest β + Vexc gexc + Vinh ginh V∞ ≡ (2) β + gexc + ginh according to V (t) = V∞ (1 − exp(−t/τm )). Without external input, V (t 1) → Vrest . The time scale is set by τm . Without synaptic input τm ≡ Cm /β. Slowly varying inputs gexc , ginh > 0 modify the time scale to approximately τm /(1 + (gexc + ginh )/β). For highly dynamic inputs, such as in late phase of the object approach, the time scale gets dynamical as well. The ψ-model assigns synaptic inputs5 ˙ ˙ ˙ ˙ gexc (t) = ϑ(t), ϑ(t) = ζ1 ϑ(t − ∆tstim ) + (1 − ζ1 )Θ(t) (3a) e ginh (t) = [γϑ(t)] , ϑ(t) = ζ0 ϑ(t − ∆tstim ) + (1 − ζ0 )Θ(t) 1 (3b) This linear approximation gets worse with increasing Θ, but turns out to work well until short before ttc (τ adopts a minimum at tc − 0.428978 · l/|v|). 2 LGMD activity is usually monitored via its postsynaptic neuron, the Descending Contralateral Movement Detector (DCMD) neuron. This represents no problem as LGMD spikes follow DCMD spikes 1:1 under visual stimulation [22] from 300Hz [21] to at least 400Hz [24]. 3 Here we assume that the membrane potential serves as a predictor for the LGMD’s mean firing rate. 4 Set to unity for all simulations. 5 LGMD receives also inhibition from a laterally acting network [21]. The η-function considers only direct feedforward inhibition [22, 6], and so do we. 
2 Θ ∈ [7.63°, 180.00°[ temporal resolution ∆ tstim=1.0ms l/|v|=20.00ms, β=1.00, γ=7.50, e=3.00, ζ0=0.90, ζ1=0.99, nrelax=25 0.04 scaled dΘ/dt continuous discretized 0.035 0.03 Θ(t) (input) ϑ(t) (filtered) voltage V(t) (output) t = 56ms max t =300ms c 0.025 0 10 2 η(t): α=3.29, R =1.00 n =10 → t =37ms log Θ(t) amplitude relax max 0.02 0.015 0.01 0.005 0 −0.005 0 50 100 150 200 250 300 −0.01 0 350 time [ms] 50 100 150 200 250 300 350 time [ms] (b) ψ versus η (a) discretized optical variables Figure 1: (a) The continuous visual angle of an approaching object is shown along with its discretized version. Discretization transforms angular velocity from a continuous variable into a series of “spikes” (rescaled). (b) The ψ function with the inputs shown in a, with nrelax = 25 relaxation time steps. Its peak occurs tmax = 56ms before ttc (tc = 300ms). An η function (α = 3.29) that was fitted to ψ shows good agreement. For continuous optical variables, the peak would occur 4ms earlier, and η would have α = 4.44 with R2 = 1. For nrelax = 10, ψ is farther away from its equilibrium at V∞ , and its peak moves 19ms closer to ttc. t =500ms, dia=12.0cm, ∆t c =1.00ms, dt=10.00µs, discrete=1 stim 250 n relax = 50 2 200 α=4.66, R =0.99 [normal] n = 25 relax 2 α=3.91, R =1.00 [normal] n =0 relax tmax [ms] 150 2 α=1.15, R =0.99 [normal] 100 50 0 β=1.00, γ=7.50, e=3.00, V =−0.001, ζ =0.90, ζ =0.99 inh −50 5 10 15 20 25 30 0 35 1 40 45 50 l/|v| [ms] (a) different nrelax (b) different ∆tstim ˆ ˆ Figure 2: The figures plot the relative time tmax ≡ tc − t of the response peak of ψ, V (t), as a function of half-size-to-velocity ratio (points). Line fits with slope α and intercept δ were added (lines). The predicted linear relationship in all cases is consistent with experimental evidence [9]. (a) The stimulus time scale is held constant at ∆tstim = 1ms, and several LGMD time scales are defined by nrelax (= number of intercalated relaxation steps for each integration time step). Bigger values of nrelax move V (t) closer to its equilibrium V∞ (t), implying higher slopes α in turn. (b) LGMD time scale is fixed at nrelax = 25, and ∆tstim is manipulated. Because of the discretization of optical variables (OVs) in our simulation, increasing ∆tstim translates to an overall smaller number of jumps in OVs, but each with higher amplitude. Thus, we say ψ(t) ≡ V (t) if and only if gexc and ginh are defined with the last equation. The time ˙ scale of stimulation is defined by ∆tstim (by default 1ms). The variables ϑ and ϑ are lowpass filtered angular size and rate of expansion, respectively. The amount of filtering is defined by memory constants ζ0 and ζ1 (no filtering if zero). The idea is to continue with generating synaptic input ˙ after ttc, where Θ(t > tc ) = const and thus Θ(t > tc ) = 0. Inhibition is first weighted by γ, and then potentiated by the exponent e. Hodgkin-Huxley potentiates gating variables n, m ∈ [0, 1] instead (potassium ∝ n4 , sodium ∝ m3 , [12]) and multiplies them with conductances. Gabbiani and co-workers found that the function which transforms membrane potential to firing rate is better described by a power function with e = 3 than by exp(·) (Figure 4d in [8]). 3 Dynamics of the ψ-function 3 Discretization. In a typical experiment, a monitor is placed a short distance away from the insect’s eye, and an approaching object is displayed. Computer screens have a fixed spatial resolution, and as a consequence size increments of the displayed object proceed in discrete jumps. 
The locust retina is furthermore composed of a discrete array of ommatidia units. We therefore can expect a corresponding step-wise increment of Θ with time, although optical and neuronal filtering may ˙ smooth Θ to some extent again, resulting in ϑ (figure 1). Discretization renders Θ discontinuous, ˙ For simulating the dynamics of ψ, we discretized angular size what again will be alleviated in ϑ. ˙ with floor(Θ), and Θ(t) ≈ [Θ(t + ∆tstim ) − Θ(t)]/∆tstim . Discretized optical variables (OVs) were re-normalized to match the range of original (i.e. continuous) OVs. To peak, or not to peak? Rind & Simmons reject the hypothesis that the activity peak signals impending collision on grounds of two arguments [28]: (i) If Θ(t + ∆tstim ) − Θ(t) 3o in consecutively displayed stimulus frames, the illusion of an object approach would be lost. Such stimulation would rather be perceived as a sequence of rapidly appearing (but static) objects, causing reduced responses. (ii) After the last stimulation frame has been displayed (that is Θ = const), LGMD responses keep on building up beyond ttc. This behavior clearly depends on l/|v|, also according to their own data (e.g. Figure 4 in [26]): Response build up after ttc is typically observed for suffi˙ ciently small values of l/|v|. Input into ψ in situations where Θ = const and Θ = 0, respectively, ˙ is accommodated by ϑ and ϑ, respectively. We simulated (i) by setting ∆tstim = 5ms, thus producing larger and more infrequent jumps in discrete OVs than with ∆tstim = 1ms (default). As a consequence, ϑ(t) grows more slowly (deˆ layed build up of inhibition), and the peak occurs later (tmax ≡ tc − t = 10ms with everything else ˆ ˆ identical with figure 1b). The peak amplitude V = V (t) decreases nearly sixfold with respect to default. Our model thus predicts the reduced responses observed by Rind & Simmons [28]. Linearity. Time of peak firing rate is linearly related to l/|v| [10, 9]. The η-function is consistent ˆ with this experimental evidence: t = tc − αl/|v| + δ (e.g. α = 4.7, δ = −27ms). The ψ-function reproduces this relationship as well (figure 2), where α depends critically on the time scale of biophysical processes in the LGMD. We studied the impact of this time scale by choosing 10µs for the numerical integration of equation 1 (algorithm: 4th order Runge-Kutta). Apart from improving the numerical stability of the integration algorithm, ψ is far from its equilibrium V∞ (t) in every moment ˙ t, given the stimulation time scale ∆tstim = 1ms 6 . Now, at each value of Θ(t) and Θ(t), respectively, we intercalated nrelax iterations for integrating ψ. Each iteration takes V (t) asymptotically closer to V∞ (t), and limnrelax 1 V (t) = V∞ (t). If the internal processes in the LGMD cannot keep up with stimulation (nrelax = 0), we obtain slopes values that underestimate experimentally found values (figure 2a). In contrast, for nrelax 25 we get an excellent agreement with the experimentally determined α. This means that – under the reported experimental stimulation conditions (e.g. [9]) – the LGMD would operate relatively close to its steady state7 . Now we fix nrelax at 25 and manipulate ∆tstim instead (figure 2b). The default value ∆tstim = 1ms corresponds to α = 3.91. Slightly bigger values of ∆tstim (2.5ms and 5ms) underestimate the experimental α. In addition, the line fits also return smaller intercept values then. We see tmax < 0 up to l/|v| ≈ 13.5ms – LGMD activity peaks after ttc! Or, in other words, LGMD activity continues to increase after ttc. 
In the limit, where stimulus dynamics is extremely fast, and LGMD processes are kept far from equilibrium at each instant of the approach, α gets very small. As a consequence, tmax gets largely independent of l/|v|: The activity peak would cling to tmax although we varied l/|v|. 4 Freeze! Experimental data versus steady state of “psi” In the previous section, experimentally plausible values for α were obtained if ψ is close to equilibrium at each instant of time during stimulation. In this section we will thus introduce a steady-state 6 Assuming one ∆tstim for each integration time step. This means that by default stimulation and biophysical dynamics will proceed at identical time scales. 7 Notice that in this moment we can only make relative statements - we do not have data at hand for defining absolute time scales 4 tc=500ms, v=2.00m/s ψ∞ → (β varies), γ=3.50, e=3.00, Vinh=−0.001 tc=500ms, v=2.00m/s ψ∞ → β=2.50, γ=3.50, (e varies), Vinh=−0.001 300 tc=500ms, v=2.00m/s ψ∞ → β=2.50, (γ varies), e=3.00, Vinh=−0.001 350 300 β=10.00 β=5.00 norm. rmse = 0.058...0.153 correlation (β,α)=−0.90 (n=4) ∞ β=1.00 e=4.00 norm. |η−ψ | = 0.009...0.114 e=3.00 300 norm. rmse = 0.014...0.160 correlation (e,α)=0.98 (n=4) ∞ e=2.50 250 250 norm. |η−ψ | = 0.043...0.241 ∞ norm. rmse = 0.085...0.315 correlation (γ,α)=1.00 (n=5) 150 tmax [ms] 200 tmax [ms] 200 tmax [ms] γ=5.00 γ=2.50 γ=1.00 γ=0.50 γ=0.25 e=5.00 norm. |η−ψ | = 0.020...0.128 β=2.50 250 200 150 100 150 100 100 50 50 50 0 5 10 15 20 25 30 35 40 45 0 5 50 10 15 20 l/|v| [ms] 25 30 35 40 45 0 5 50 10 15 20 l/|v| [ms] (a) β varies 25 30 35 40 45 50 l/|v| [ms] (b) e varies (c) γ varies ˆ ˆ Figure 3: Each curve shows how the peak ψ∞ ≡ ψ∞ (t) depends on the half-size-to-velocity ratio. In each display, one parameter of ψ∞ is varied (legend), while the others are held constant (figure title). Line slopes vary according to parameter values. Symbol sizes are scaled according to rmse (see also figure 4). Rmse was calculated between normalized ψ∞ (t) & normalized η(t) (i.e. both functions ∈ [0, 1] with original minimum and maximum indicated by the textbox). To this end, the ˆ peak of the η-function was placed at tc , by choosing, at each parameter value, α = |v| · (tc − t)/l (for determining correlation, the mean value of α was taken across l/|v|). tc=500ms, v=2.00m/s ψ∞ → (β varies), γ=3.50, e=3.00, Vinh=−0.001 tc=500ms, v=2.00m/s ψ∞ → β=2.50, γ=3.50, (e varies), Vinh=−0.001 tc=500ms, v=2.00m/s ψ∞ → β=2.50, (γ varies), e=3.00, Vinh=−0.001 0.25 β=5.00 0.12 β=2.50 β=1.00 0.1 0.08 (normalized η, ψ∞) 0.12 β=10.00 (normalized η, ψ∞) (normalized η, ψ∞) 0.14 0.1 0.08 γ=5.00 γ=2.50 0.2 γ=1.00 γ=0.50 γ=0.25 0.15 0.06 0.04 0.02 0 5 10 15 20 25 30 35 40 45 50 meant |η(t)−ψ∞(t)| meant |η(t)−ψ∞(t)| meant |η(t)−ψ∞(t)| 0.06 0.04 e=5.00 e=4.00 e=3.00 0.02 e=2.50 10 l/|v| [ms] 15 20 25 30 35 40 45 50 l/|v| [ms] (a) β varies (b) e varies 0.1 0.05 0 5 10 15 20 25 30 35 40 45 50 l/|v| [ms] (c) γ varies Figure 4: This figure complements figure 3. It visualizes the time averaged absolute difference between normalized ψ∞ (t) & normalized η(t). For η, its value of α was chosen such that the maxima of both functions coincide. Although not being a fit, it gives a rough estimate on how the shape of both curves deviate from each other. The maximum possible difference would be one. version of ψ (i.e. 
equation 2 with Vrest = 0, Vexc = 1, and equations 3 plugged in), ψ∞ (t) ≡ e ˙ Θ(t) + Vinh [γΘ(t)] e ˙ β + Θ(t) + [γΘ(t)] (4) (Here we use continuous versions of angular size and rate of expansion). The ψ∞ -function makes life easier when it comes to fitting experimental data. However, it has its limitations, because we brushed the whole dynamic of ψ under the carpet. Figure 3 illustrates how the linˆ ear relationship (=“linearity”) between tmax ≡ tc − t and l/|v| is influenced by changes in parameter values. Changing any of the values of e, β, γ predominantly causes variation in line slopes. The smallest slope changes are obtained by varying Vinh (data not shown; we checked Vinh = 0, −0.001, −0.01, −0.1). For Vinh −0.01, linearity is getting slightly compromised, as slope increases with l/|v| (e.g. Vinh = −1 α ∈ [4.2, 4.7]). In order to get a notion about how well the shape of ψ∞ (t) matches η(t), we computed timeaveraged difference measures between normalized versions of both functions (details: figure 3 & 4). Bigger values of β match η better at smaller, but worse at bigger values of l/|v| (figure 4a). Smaller β cause less variation across l/|v|. As to variation of e, overall curve shapes seem to be best aligned with e = 3 to e = 4 (figure 4b). Furthermore, better matches between ψ∞ (t) and η(t) correspond to bigger values of γ (figure 4c). And finally, Vinh marches again to a different tune (data not shown). Vinh = −0.1 leads to the best agreement (≈ 0.04 across l/|v|) of all Vinh , quite different from the other considered values. For the rest, ψ∞ (t) and η(t) align the same (all have maximum 0.094), 5 ˙ (a) Θ = 126o /s ˙ (b) Θ = 63o /s Figure 5: The original data (legend label “HaGaLa95”) were resampled from ref. [10] and show ˙ DCMD responses to an object approach with Θ = const. Thus, Θ increases linearly with time. The η-function (fitting function: Aη(t+δ)+o) and ψ∞ (fitting function: Aψ∞ (t)+o) were fitted to these data: (a) (Figure 3 Di in [10]) Good fits for ψ∞ are obtained with e = 5 or higher (e = 3 R2 = 0.35 and rmse = 0.644; e = 4 R2 = 0.45 and rmse = 0.592). “Psi” adopts a sigmoid-like curve form which (subjectively) appears to fit the original data better than η. (b) (Figure 3 Dii in [10]) “Psi” yields an excellent fit for e = 3. RoHaTo10 gregarious locust LV=0.03s Θ(t), lv=30ms e011pos014 sgolay with 100 t =107ms max ttc=5.00s ψ adj.R2 0.95 (LM:3) ∞ η(t) adj.R2 1 (TR::1) 2 ψ : R =0.95, rmse=0.004, 3 coefficients ∞ → β=2.22, γ=0.70, e=3.00, V =−0.001, A=0.07, o=0.02, δ=0.00ms inh η: R2=1.00, rmse=0.001 → α=3.30, A=0.08, o=0.0, δ=−10.5ms 3.4 3.6 3.8 4 4.2 4.4 4.6 4.8 5 5.2 time [s] (b) α versus β (a) spike trace Figure 6: (a) DCMD activity in response to a black square (l/|v| = 30ms, legend label “e011pos14”, ref. [30]) approaching to the eye center of a gregarious locust (final visual angle 50o ). Data show the first stimulation so habituation is minimal. The spike trace (sampled at 104 Hz) was full wave rectified, lowpass filtered, and sub-sampled to 1ms resolution. Firing rate was estimated with Savitzky-Golay filtering (“sgolay”). The fits of the η-function (Aη(t + δ) + o; 4 coefficients) and ψ∞ -function (Aψ∞ (t) with fixed e, o, δ, Vinh ; 3 coefficients) provide both excellent fits to firing rate. (b) Fitting coefficient α (→ η-function) inversely correlates with β (→ ψ∞ ) when fitting firing rates of another 5 trials as just described (continuous line = line fit to the data points). 
Similar correlation values would be obtained if e is fixed at values e = 2.5, 4, 5 c = −0.95, −0.96, −0.91. If o was determined by the fitting algorithm, then c = −0.70. No clear correlations with α were obtained for γ. despite of covering different orders of magnitude with Vinh = 0, −0.001, −0.01. Decelerating approach. Hatsopoulos et al. [10] recorded DCMD activity in response to an ap˙ proaching object which projected image edges on the retina moving at constant velocity: Θ = const. ˙ This “linear approach” is perceived as if the object is getting increasingly implies Θ(t) = Θ0 + Θt. slower. But what appears a relatively unnatural movement pattern serves as a test for the functions η & ψ∞ . Figure 5 illustrates that ψ∞ passes the test, and consistently predicts that activity sharply rises in the initial approach phase, and subsequently declines (η passed this test already in the year 1995). 6 Spike traces. We re-sampled about 30 curves obtained from LGMD recordings from a variety of publications, and fitted η & ψ∞ -functions. We cannot show the results here, but in terms of goodness of fit measures, both functions are in the same ballbark. Rather, figure 6a shows a representative example [30]. When α and β are plotted against each other for five trials, we see a strong inverse correlation (figure 6b). Although five data points are by no means a firm statistical sample, the strong correlation could indicate that β and α play similar roles in both functions. Biophysically, β is the leakage conductance, which determines the (passive) membrane time constant τm ∝ 1/β of the neuron. Voltage drops within τm to exp(−1) times its initial value. Bigger values of β mean shorter τm (i.e., “faster neurons”). Getting back to η, this would suggest α ∝ τm , such that higher (absolute) values for α would possibly indicate a slower dynamic of the underlying processes. 5 Discussion (“The Good, the Bad, and the Ugly”) Up to now, mainly two classes of LGMD models existed: The phenomenological η-function on the one hand, and computational models with neuronal layers presynaptic to the LGMD on the other (e.g. [25, 15]; real-world video sequences & robotics: e.g. [3, 14, 32, 2]). Computational models predict that LGMD response features originate from excitatory and inhibitory interactions in – and between – presynaptic neuronal layers. Put differently, non-linear operations are generated in the presynaptic network, and can be a function of many (model) parameters (e.g. synaptic weights, time constants, etc.). In contrast, the η-function assigns concrete nonlinear operations to the LGMD [7]. The η-function is accessible to mathematical analysis, whereas computational models have to be probed with videos or artificial stimulus sequences. The η-function is vague about biophysical parameters, whereas (good) computational models need to be precise at each (model) parameter value. The η-function establishes a clear link between physical stimulus attributes and LGMD activity: It postulates what is to be computed from the optical variables (OVs). But in computational models, such a clear understanding of LGMD inputs cannot always be expected: Presynaptic processing may strongly transform OVs. The ψ function thus represents an intermediate model class: It takes OVs as input, and connects them with biophysical parameters of the LGMD. For the neurophysiologist, the situation could hardly be any better. 
Psi implements the multiplicative operation of the η-function by shunting inhibition (equation 1: Vexc ≈ Vrest and Vinh ≈ Vrest ). The η-function fits ψ very well according to our dynamical simulations (figure 1), and satisfactory by the approximate criterion of figure 4. We can conclude that ψ implements the η-function in a biophysically plausible way. However, ψ does neither explicitly specify η’s multiplicative operation, nor its exponential function exp(·). Instead we have an interaction between shunting inhibition and a power law (·)e , with e ≈ 3. So what about power laws in neurons? Because of e > 1, we have an expansive nonlinearity. Expansive power-law nonlinearities are well established in phenomenological models of simple cells of the primate visual cortex [1, 11]. Such models approximate a simple cell’s instantaneous firing rate r from linear filtering of a stimulus (say Y ) by r ∝ ([Y ]+ )e , where [·]+ sets all negative values to zero and lets all positive pass. Although experimental evidence favors linear thresholding operations like r ∝ [Y − Ythres ]+ , neuronal responses can behave according to power law functions if Y includes stimulus-independent noise [19]. Given this evidence, the power-law function of the inhibitory input into ψ could possibly be interpreted as a phenomenological description of presynaptic processes. The power law would also be the critical feature by means of which the neurophysiologist could distinguish between the η function and ψ. A study of Gabbiani et al. aimed to provide direct evidence for a neuronal implementation of the η-function [8]. Consequently, the study would be an evidence ˙ for a biophysical implementation of “direct” multiplication via log Θ − αΘ. Their experimental evidence fell somewhat short in the last part, where “exponentation through active membrane conductances” should invert logarithmic encoding. Specifically, the authors observed that “In 7 out of 10 neurons, a third-order power law best described the data” (sixth-order in one animal). Alea iacta est. Acknowledgments MSK likes to thank Stephen M. Rogers for kindly providing the recording data for compiling figure 6. MSK furthermore acknowledges support from the Spanish Government, by the Ramon and Cajal program and the research grant DPI2010-21513. 7 References [1] D.G. Albrecht and D.B. Hamilton, Striate cortex of monkey and cat: contrast response function, Journal of Neurophysiology 48 (1982), 217–237. [2] S. Bermudez i Badia, U. Bernardet, and P.F.M.J. Verschure, Non-linear neuronal responses as an emergent property of afferent networks: A case study of the locust lobula giant movemement detector, PLoS Computational Biology 6 (2010), no. 3, e1000701. [3] M. Blanchard, F.C. Rind, and F.M.J. Verschure, Collision avoidance using a model of locust LGMD neuron, Robotics and Autonomous Systems 30 (2000), 17–38. [4] D.F. Cooke and M.S.A. Graziano, Super-flinchers and nerves of steel: Defensive movements altered by chemical manipulation of a cortical motor area, Neuron 43 (2004), no. 4, 585–593. [5] L. Fogassi, V. Gallese, L. Fadiga, G. Luppino, M. Matelli, and G. Rizzolatti, Coding of peripersonal space in inferior premotor cortex (area f4), Journal of Neurophysiology 76 (1996), 141–157. [6] F. Gabbiani, I. Cohen, and G. Laurent, Time-dependent activation of feed-forward inhibition in a looming sensitive neuron, Journal of Neurophysiology 94 (2005), 2150–2161. [7] F. Gabbiani, H.G. Krapp, N. Hatsopolous, C.H. Mo, C. Koch, and G. 
Laurent, Multiplication and stimulus invariance in a looming-sensitive neuron, Journal of Physiology – Paris 98 (2004), 19–34. [8] F. Gabbiani, H.G. Krapp, C. Koch, and G. Laurent, Multiplicative computation in a visual neuron sensitive to looming, Nature 420 (2002), 320–324. [9] F. Gabbiani, H.G. Krapp, and G. Laurent, Computation of object approach by a wide-field, motion-sensitive neuron, Journal of Neuroscience 19 (1999), no. 3, 1122–1141. [10] N. Hatsopoulos, F. Gabbiani, and G. Laurent, Elementary computation of object approach by a wide-field visual neuron, Science 270 (1995), 1000–1003. [11] D.J. Heeger, Modeling simple-cell direction selectivity with normalized, half-squared, linear operators, Journal of Neurophysiology 70 (1993), 1885–1898. [12] A.L. Hodgkin and A.F. Huxley, A quantitative description of membrane current and its application to conduction and excitation in nerve, Journal of Physiology 117 (1952), 500–544. [13] F. Hoyle, The Black Cloud, Penguin Books, London, 1957. [14] M.S. Keil, E. Roca-Moreno, and A. Rodríguez-Vázquez, A neural model of the locust visual system for detection of object approaches with real-world scenes, Proceedings of the Fourth IASTED International Conference (Marbella, Spain), 6-8 September 2004, pp. 340–345. [15] M.S. Keil and A. Rodríguez-Vázquez, Towards a computational approach for collision avoidance with real-world scenes, Proceedings of SPIE: Bioengineered and Bioinspired Systems (Maspalomas, Gran Canaria, Canary Islands, Spain) (A. Rodríguez-Vázquez, D. Abbott, and R. Carmona, eds.), vol. 5119, SPIE – The International Society for Optical Engineering, 19-21 May 2003, pp. 285–296. [16] J.G. King, J.Y. Lettvin, and E.R. Gruberg, Selective, unilateral, reversible loss of behavioral responses to looming stimuli after injection of tetrodotoxin or cadmium chloride into the frog optic nerve, Brain Research 841 (1999), no. 1-2, 20–26. [17] C. Koch, Biophysics of computation: information processing in single neurons, Oxford University Press, New York, 1999. [18] D.N. Lee, A theory of visual control of braking based on information about time-to-collision, Perception 5 (1976), 437–459. [19] K.D. Miller and T.W. Troyer, Neural noise can explain expansive, power-law nonlinearities in neuronal response functions, Journal of Neurophysiology 87 (2002), 653–659. [20] H. Nakagawa and K. Hongjian, Collision-sensitive neurons in the optic tectum of the bullfrog, Rana catesbeiana, Journal of Neurophysiology 104 (2010), no. 5, 2487–2499. [21] M. O'Shea and C.H.F. Rowell, Protection from habituation by lateral inhibition, Nature 254 (1975), 53–55. [22] M. O'Shea and J.L.D. Williams, The anatomy and output connection of a locust visual interneurone: the lobula giant movement detector (LGMD) neurone, Journal of Comparative Physiology 91 (1974), 257–266. [23] S. Peron and F. Gabbiani, Spike frequency adaptation mediates looming stimulus selectivity, Nature Neuroscience 12 (2009), no. 3, 318–326. [24] F.C. Rind, A chemical synapse between two motion detecting neurones in the locust brain, Journal of Experimental Biology 110 (1984), 143–167. [25] F.C. Rind and D.I. Bramwell, Neural network based on the input organization of an identified neuron signaling impending collision, Journal of Neurophysiology 75 (1996), no. 3, 967–985. [26] F.C. Rind and P.J. Simmons, Orthopteran DCMD neuron: a reevaluation of responses to moving objects. I. Selective responses to approaching objects, Journal of Neurophysiology 68 (1992), no.
5, 1654–1666. [27] F.C. Rind and P.J. Simmons, Orthopteran DCMD neuron: a reevaluation of responses to moving objects. II. Critical cues for detecting approaching objects, Journal of Neurophysiology 68 (1992), no. 5, 1667–1682. [28] F.C. Rind and P.J. Simmons, Signaling of object approach by the DCMD neuron of the locust, Journal of Neurophysiology 77 (1997), 1029–1033. [29] F.C. Rind and P.J. Simmons, Reply, Trends in Neurosciences 22 (1999), no. 5, 438. [30] S.M. Rogers, G.W.J. Harston, F. Kilburn-Toppin, T. Matheson, M. Burrows, F. Gabbiani, and H.G. Krapp, Spatiotemporal receptive field properties of a looming-sensitive neuron in solitarious and gregarious phases of the desert locust, Journal of Neurophysiology 103 (2010), 779–792. [31] S.K. Rushton and J.P. Wann, Weighted combination of size and disparity: a computational model for timing a ball catch, Nature Neuroscience 2 (1999), no. 2, 186–190. [32] S. Yue, F.C. Rind, M.S. Keil, J. Cuadri, and R. Stafford, A bio-inspired visual collision detection mechanism for cars: Optimisation of a model of a locust neuron to a novel environment, Neurocomputing 69 (2006), 1591–1598. [33] G.R. Schlotterer, Response of the locust descending movement detector neuron to rapidly approaching and withdrawing visual stimuli, Canadian Journal of Zoology 55 (1977), 1372–1376. [34] H. Sun and B.J. Frost, Computation of different optical variables of looming objects in pigeon nucleus rotundus neurons, Nature Neuroscience 1 (1998), no. 4, 296–303. [35] J.R. Tresilian, Visually timed action: time-out for 'tau'?, Trends in Cognitive Sciences 3 (1999), no. 8, 301–310. [36] Y. Wang and B.J. Frost, Time to collision is signalled by neurons in the nucleus rotundus of pigeons, Nature 356 (1992), 236–238. [37] J.P. Wann, Anticipating arrival: is the tau-margin a specious theory?, Journal of Experimental Psychology: Human Perception and Performance 22 (1996), 1031–1048. [38] M. Wicklein and N.J. Strausfeld, Organization and significance of neurons that detect change of visual depth in the hawk moth Manduca sexta, The Journal of Comparative Neurology 424 (2000), no. 2, 356–376.

4 0.63031065 2 nips-2011-A Brain-Machine Interface Operating with a Real-Time Spiking Neural Network Control Algorithm

Author: Julie Dethier, Paul Nuyujukian, Chris Eliasmith, Terrence C. Stewart, Shauki A. Elasaad, Krishna V. Shenoy, Kwabena A. Boahen

Abstract: Motor prostheses aim to restore function to disabled patients. Despite compelling proof-of-concept systems, barriers to clinical translation remain. One challenge is to develop a fully-implantable system that dissipates little enough power not to damage tissue. To this end, we implemented a Kalman-filter based decoder via a spiking neural network (SNN) and tested it in brain-machine interface (BMI) experiments with a rhesus monkey. The Kalman filter was trained to predict the arm's velocity and mapped on to the SNN using the Neural Engineering Framework (NEF). A 2,000-neuron embedded Matlab SNN implementation runs in real-time and its closed-loop performance is quite comparable to that of the standard Kalman filter. The success of this closed-loop decoder holds promise for hardware SNN implementations of statistical signal processing algorithms on neuromorphic chips, which may offer the power savings necessary to overcome a major obstacle to the successful clinical translation of neural motor prostheses. ∗ Present address: Research Fellow, F.R.S.-FNRS, Systmod Unit, University of Liège, Belgium.

1 Cortically-controlled motor prostheses: the challenge

Motor prostheses aim to restore function for severely disabled patients by translating neural signals from the brain into useful control signals for prosthetic limbs or computer cursors. Several proof-of-concept demonstrations have shown encouraging results, but barriers to clinical translation still remain. One example is the development of a fully-implantable system that meets power dissipation constraints, but is still powerful enough to perform complex operations. A recently reported closed-loop cortically-controlled motor prosthesis is capable of producing quick, accurate, and robust computer cursor movements by decoding neural signals (threshold crossings) from a 96-electrode array in rhesus macaque premotor/motor cortex [1]-[4]. This, and previous designs (e.g., [5]), employ versions of the Kalman filter, ubiquitous in statistical signal processing. Such a filter and its variants are the state-of-the-art decoder for brain-machine interfaces (BMIs) in humans [5] and monkeys [2]. While these recent advances are encouraging, clinical translation of such BMIs requires fully-implanted systems, which in turn impose severe power dissipation constraints. Even though it is an open, actively-debated question as to how much of the neural prosthetic system must be implanted, we note that there are no reports to date demonstrating a fully implantable 100-channel wireless transmission system; this motivates performing the decoding within the implanted chip. This computation is constrained by a stringent power budget: a 6 × 6 mm² implant must dissipate less than 10 mW to avoid heating the brain by more than 1 °C [6], which is believed to be important for long-term cell health. With this power budget, current approaches cannot scale to higher electrode densities or to substantially more computer-intensive decode/control algorithms. The feasibility of mapping a Kalman-filter based decoder algorithm [1]-[4] on to a spiking neural network (SNN) has been explored off-line (open-loop). In these off-line tests, the SNN's performance virtually matched that of the standard implementation [7]. These simulations provide confidence that this algorithm, and others similar to it, could be implemented using an ultra-low-power approach potentially capable of meeting the severe power constraints set by clinical translation.
This neuromorphic approach uses very-large-scale integrated systems containing microelectronic analog circuits to morph neural systems into silicon chips [8, 9]. These neuromorphic circuits may yield tremendous power savings (50 nW per silicon neuron [10]) over digital circuits because they use physical operations to perform mathematical computations (analog approach). When implemented on a chip designed using the neuromorphic approach, a 2,000-neuron SNN can consume as little as 100 µW. Demonstrating this approach's feasibility in a closed-loop system running in real-time is a key, non-incremental step in the development of a fully implantable decoding chip, and is necessary before proceeding with fabricating and implanting the chip. As noise, delay, and over-fitting play a more important role in the closed-loop setting, it is not obvious that the SNN's stellar open-loop performance will hold up. In addition, performance criteria are different in the closed-loop and open-loop settings (e.g., time per target vs. root mean squared error). Therefore, an SNN of a different size may be required to meet the desired specifications. Here we present results and assess the performance and viability of the SNN Kalman-filter based decoder in real-time, closed-loop tests, with the monkey performing a center-out-and-back target acquisition task. To achieve closed-loop operation, we developed an embedded Matlab implementation that ran a 2,000-neuron version of the SNN in real-time on a PC. We achieved almost a 50-fold speed-up by performing part of the computation in a lower-dimensional space defined by the formal method we used to map the Kalman filter on to the SNN. This shortcut allowed us to run a larger SNN in real-time than would otherwise be possible.

2 Spiking neural network mapping of control theory algorithms

As reported in [11], a formal methodology, called the Neural Engineering Framework (NEF), has been developed to map control-theory algorithms onto a computational fabric consisting of a highly heterogeneous population of spiking neurons simply by programming the strengths of their connections. These artificial neurons are characterized by a nonlinear multi-dimensional-vector-to-spike-rate function, ai(x(t)) for the ith neuron, with parameters (preferred direction, maximum firing rate, and spiking threshold) drawn randomly from a wide distribution (standard deviation ≈ mean).

[Figure 1 panel equations: Representation, x → ai(x) → x̂ = ∑i ai(x)φi^x with ai(x) = G(αi φ̃i^x · x + Ji^bias); Transformation, y = Ax → bj(Ax̂) with Ax̂ = ∑i ai(x)Aφi^x; Dynamics, ẋ = Ax → x(t) = h(t) ∗ A′x(t) with A′ = τA + I.] Figure 1: NEF's three principles. Representation. 1D tuning curves of a population of 50 leaky integrate-and-fire neurons. The neurons' tuning curves map control variables (x) to spike rates (ai(x)); this nonlinear transformation is inverted by linear weighted decoding. G(·) is the neurons' nonlinear current-to-spike-rate function. Transformation. SNN with populations bk(t) and aj(t) representing y(t) and x(t). Feedforward and recurrent weights are determined by B′ and A′, as described next. Dynamics. The system's dynamics is captured in a neurally plausible fashion by replacing integration with the synapses' spike response, h(t), and replacing the matrices with A′ = τA + I and B′ = τB to compensate.
The neural engineering approach to configuring SNNs to perform arbitrary computations is underlined by three principles (Figure 1) [11]-[14]: Representation is defined by nonlinear encoding of x(t) as a spike rate, ai(x(t)), represented by the neuron tuning curve, combined with optimal weighted linear decoding of ai(x(t)) to recover an estimate of x(t), x̂(t) = ∑i ai(x(t))φi^x, where φi^x are the decoding weights. Transformation is performed by using alternate decoding weights in the decoding operation to map transformations of x(t) directly into transformations of ai(x(t)). For example, y(t) = Ax(t) is represented by the spike rates bj(Ax̂(t)), where unit j's input is computed directly from unit i's output using Ax̂(t) = ∑i ai(x(t))Aφi^x, an alternative linear weighting. Dynamics brings the first two principles together and adds the time dimension to the circuit. This principle aims at reuniting the control-theory and neural levels by modifying the matrices to render the system neurally plausible, thereby permitting the synapses' spike response, h(t) (i.e., impulse response), to capture the system's dynamics. For example, for h(t) = τ^−1 e^(−t/τ), ẋ = Ax(t) is realized by replacing A with A′ = τA + I. This so-called neurally plausible matrix yields an equivalent dynamical system: x(t) = h(t) ∗ A′x(t), where convolution replaces integration. The nonlinear encoding process, from a multi-dimensional stimulus, x(t), to a one-dimensional soma current, Ji(x(t)), to a firing rate, ai(x(t)), is specified as: ai(x(t)) = G(Ji(x(t))). (1) Here G is the neurons' nonlinear current-to-spike-rate function, which is given by G(Ji(x)) = [τ^ref − τ^RC ln(1 − Jth/Ji(x))]^−1 (2) for the leaky integrate-and-fire model (LIF). The LIF neuron has two behavioral regimes: sub-threshold and super-threshold. The sub-threshold regime is described by an RC circuit with time constant τ^RC. When the sub-threshold soma voltage reaches the threshold, Vth, the neuron emits a spike δ(t − tn). After this spike, the neuron is reset and rests for τ^ref seconds (absolute refractory period) before it resumes integrating. Jth = Vth/R is the minimum input current that produces spiking. Ignoring the soma's RC time-constant when specifying the SNN's dynamics is reasonable because the neurons cross threshold at a rate that is proportional to their input current, which thus sets the spike rate instantaneously, without any filtering [11]. The conversion from a multi-dimensional stimulus, x(t), to a one-dimensional soma current, Ji, is performed by assigning to the neuron a preferred direction, φ̃i^x, in the stimulus space and taking the dot product: Ji(x(t)) = αi φ̃i^x · x(t) + Ji^bias, (3) where αi is a gain or conversion factor, and Ji^bias is a bias current that accounts for background activity. For a 1D space, φ̃i^x is either +1 or −1 (drawn randomly), for ON and OFF neurons, respectively. The resulting tuning curves are illustrated in Figure 1, left. The linear decoding process is characterized by the synapses' spike response, h(t) (i.e., post-synaptic currents), and the decoding weights, φi^x, which are obtained by minimizing the mean square error. A single noise term, η, takes into account all sources of noise, which have the effect of introducing uncertainty into the decoding process. Hence, the transmitted firing rate can be written as ai(x(t)) + ηi, where ai(x(t)) represents the noiseless set of tuning curves and ηi is a random variable picked from a zero-mean Gaussian distribution with variance σ².
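To make equations 1-3 concrete, here is a minimal sketch (ours, not the authors' code) of the representation principle in 1D: LIF tuning curves with randomly drawn maximum rates and intercepts, decoded with the regularized least-squares weights that the derivation below arrives at (equation 6). Parameter ranges follow Table 1; everything else, including the noise scale of 20 spikes/s, is an illustrative assumption.

import numpy as np

# Sketch of NEF representation (equations 1-3) and decoding (equation 6), 1D.
rng = np.random.default_rng(0)
n, tau_ref, tau_RC, J_th = 50, 0.001, 0.020, 1.0

max_rate = rng.uniform(200, 400, n)            # maximum firing rates (Hz)
intercept = rng.uniform(-1, 1, n)              # normalized x-axis intercepts
enc = rng.choice([-1.0, 1.0], n)               # phi_tilde_i^x: ON/OFF neurons

# Invert equation 2 to get the current at which each neuron reaches max_rate,
# then pick gain alpha_i and bias J_i^bias to hit the intercept and max rate.
J_max = J_th / (1.0 - np.exp((tau_ref - 1.0 / max_rate) / tau_RC))
alpha = (J_max - J_th) / (1.0 - intercept)     # intercept along preferred dir.
bias = J_th - alpha * intercept

def rates(x):
    """Equations 1-3: soma current via dot product, then LIF rate G(J)."""
    J = alpha * (enc * x[:, None]) + bias      # shape (len(x), n)
    r = np.zeros_like(J)
    on = J > J_th
    r[on] = 1.0 / (tau_ref - tau_RC * np.log(1.0 - J_th / J[on]))
    return r

x = np.linspace(-1, 1, 201)
A = rates(x)

# Equation 6: Gamma_ij = <a_i a_j> + sigma^2 delta_ij, Upsilon_j = <x a_j>.
# A noise std of 20 spikes/s is a hypothetical stand-in for sigma^2 = 0.1
# relative noise on rates of a few hundred Hz.
sigma = 20.0
Gamma = A.T @ A / len(x) + sigma**2 * np.eye(n)
Upsilon = A.T @ x / len(x)
phi = np.linalg.solve(Gamma, Upsilon)          # decoding weights phi_i^x

x_hat = A @ phi                                # reconstruction of x
print("decoding RMSE:", float(np.sqrt(np.mean((x_hat - x) ** 2))))

The derivation that follows makes precise why this ridge-regularized solve is the mean-square-optimal choice under the Gaussian noise model.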
Consequently, the mean square error can be written as [11]: E = (1/2) ⟨[x(t) − x̂(t)]²⟩_{x,η,t} = (1/2) ⟨[x(t) − ∑i (ai(x(t)) + ηi) φi^x]²⟩_{x,η,t}, (4) where ⟨·⟩_{x,η} denotes integration over the range of x and η, the expected noise. We assume that the noise is independent and has the same variance for each neuron [11], which yields: E = (1/2) ⟨[x(t) − ∑i ai(x(t)) φi^x]²⟩_{x,t} + (1/2) σ² ∑i (φi^x)², (5) where σ² is the noise variance ⟨ηi ηj⟩. This expression is minimized by: φi^x = ∑_{j=1}^{N} (Γ^−1)ij ϒj, (6) with Γij = ⟨ai(x) aj(x)⟩_x + σ² δij, where δ is the Kronecker delta function matrix, and ϒj = ⟨x aj(x)⟩_x [11]. One consequence of modeling noise in the neural representation is that the matrix Γ is invertible despite the use of a highly overcomplete representation. In a noiseless representation, Γ is generally singular because, due to the large number of neurons, there is a high probability of having two neurons with similar tuning curves, leading to two similar rows in Γ.

3 Kalman-filter based cortical decoder

In the 1960s, Kalman described a method that uses linear filtering to track the state of a dynamical system throughout time using a model of the dynamics of the system as well as noisy measurements [15]. The model dynamics gives an estimate of the state of the system at the next time step. This estimate is then corrected using the observations (i.e., measurements) at this time step. The relative weights for these two pieces of information are given by the Kalman gain, K [15, 16]. Whereas the Kalman gain is updated at each iteration, the state and observation matrices (defined below), and the corresponding noise matrices, are assumed constant. In the case of prosthetic applications, the system's state vector is the cursor's kinematics, xt = [velt^x, velt^y, 1], where the constant 1 allows for a fixed offset compensation. The measurement vector, yt, is the neural spike rate (spike counts in each time step) of 192 channels of neural threshold crossings. The system's dynamics is modeled by: xt = A xt−1 + wt, (7) yt = C xt + qt, (8) where A is the state matrix, C is the observation matrix, and wt and qt are additive, Gaussian noise sources with wt ∼ N(0, W) and qt ∼ N(0, Q). The model parameters (A, C, W and Q) are fit with training data by correlating the observed hand kinematics with the simultaneously measured neural signals (Figure 2).

Figure 2: Neural and kinematic measurements (monkey J, 2011-04-16, 16 continuous trials) used to fit the standard Kalman filter model. a. The 192 cortical recordings fed as input to fit the Kalman filter's matrices (the color code refers to the number of threshold crossings observed in each 50 ms bin). b. Hand x- and y-velocity measurements correlated with the neural data to obtain the Kalman filter's matrices. c. Cursor kinematics of 16 continuous trials under direct hand control.

For an efficient decoding, we derived the steady-state update equation by replacing the adaptive Kalman gain by its steady-state formulation: K = (I + W C^T Q^−1 C)^−1 W C^T Q^−1. This yields the following estimate of the system's state: xt = (I − KC)A xt−1 + K yt = Mx^DT xt−1 + My^DT yt, (9) where Mx^DT = (I − KC)A and My^DT = K are the discrete-time (DT) Kalman matrices. The steady-state formulation improves efficiency with little loss in accuracy because the optimal Kalman gain rapidly converges (typically in less than 100 iterations).
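A minimal sketch of how the steady-state gain and the matrices of equation 9 can be obtained by iterating the covariance recursion until K stops changing. The matrices here are small random stand-ins, not the fitted model: 3 states ([x-velocity, y-velocity, 1]) and 8 channels instead of the 192 used in the experiments.

import numpy as np

# Sketch of the steady-state Kalman decoder (equation 9); all values are
# hypothetical stand-ins for the fitted A, C, W, Q.
rng = np.random.default_rng(1)
n_state, n_obs = 3, 8
A = np.diag([0.9, 0.9, 1.0])                 # state matrix
C = rng.normal(size=(n_obs, n_state))        # observation matrix
W = 0.01 * np.eye(n_state)                   # state-noise covariance
Q = np.eye(n_obs)                            # observation-noise covariance

P = W.copy()                                 # predicted-state covariance
K = np.zeros((n_state, n_obs))
for it in range(1000):
    K_new = P @ C.T @ np.linalg.inv(C @ P @ C.T + Q)   # Kalman gain
    P_post = (np.eye(n_state) - K_new @ C) @ P         # posterior covariance
    P = A @ P_post @ A.T + W                           # predict next step
    if np.max(np.abs(K_new - K)) < 1e-9:
        break
    K = K_new
print(f"gain converged after {it + 1} iterations")     # typically < 100

M_x = (np.eye(n_state) - K_new @ C) @ A      # M_x^DT, recurrent matrix
M_y = K_new                                  # M_y^DT, feedforward matrix

# Frozen-gain update, equation 9: x_t = M_x x_{t-1} + M_y y_t
x = np.zeros(n_state)
y_t = rng.normal(size=n_obs)                 # one bin of stand-in observations
x = M_x @ x + M_y @ y_t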
Indeed, in neural applications under both open-loop and closed-loop conditions, the difference between the full Kalman filter and its steady-state implementation falls to within 1% in a few seconds [17]. This simplifying assumption reduces the execution time for decoding a typical neuronal firing rate signal approximately seven-fold [17], a critical speed-up for real-time applications.

4 Kalman filter with a spiking neural network

To implement the Kalman filter with an SNN by applying the NEF, we first convert Equation 9 from DT to continuous time (CT), and then replace the CT matrices with neurally plausible ones, which yields: x(t) = h(t) ∗ (A′ x(t) + B′ y(t)), (10) where A′ = τ Mx^CT + I and B′ = τ My^CT, with Mx^CT = (Mx^DT − I)/∆t and My^CT = My^DT/∆t the CT Kalman matrices, ∆t = 50 ms the discrete time step, and τ the synaptic time-constant. The jth neuron's input current (see Equation 3) is computed from the system's current state, x(t), which is computed from estimates of the system's previous state (x̂(t) = ∑i ai(t)φi^x) and current input (ŷ(t) = ∑k bk(t)φk^y) using Equation 10. This yields: Jj(x(t)) = αj φ̃j^x · x(t) + Jj^bias = αj φ̃j^x · (h(t) ∗ (A′ x̂(t) + B′ ŷ(t))) + Jj^bias = αj φ̃j^x · (h(t) ∗ (A′ ∑i ai(t)φi^x + B′ ∑k bk(t)φk^y)) + Jj^bias. (11) This last equation can be written in a neural network form: Jj(x(t)) = h(t) ∗ (∑i ωji ai(t) + ∑k ωjk bk(t)) + Jj^bias, (12) where ωji = αj φ̃j^x · A′φi^x and ωjk = αj φ̃j^x · B′φk^y are the recurrent and feedforward weights, respectively.
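The following sketch shows equations 10 and 12 as a few lines of linear algebra. Sizes and values are hypothetical (2 velocity states, 8 input channels instead of 192, random placeholder matrices standing in for the fitted Mx^DT, My^DT and for the population parameters).

import numpy as np

# Sketch of equations 10 and 12: map steady-state DT Kalman matrices to
# neurally plausible CT matrices, then build the connection weights.
rng = np.random.default_rng(3)
dt, tau = 0.05, 0.02                           # 50 ms step, synaptic constant
M_x = np.diag([0.8, 0.8])                      # placeholder M_x^DT
M_y = 0.05 * rng.random((2, 8))                # placeholder M_y^DT

A_p = tau * (M_x - np.eye(2)) / dt + np.eye(2) # A' = tau * M_x^CT + I
B_p = tau * (M_y / dt)                         # B' = tau * M_y^CT

# Equation 12: omega_ji = alpha_j phi_tilde_j . A' phi_i (recurrent) and
# omega_jk = alpha_j phi_tilde_j . B' e_k (feedforward), as outer products.
n = 100
alpha = rng.uniform(5, 15, n)                  # gains
enc = rng.normal(size=(n, 2))
enc /= np.linalg.norm(enc, axis=1, keepdims=True)   # unit preferred directions
dec = rng.normal(size=(n, 2)) / n              # stand-in decoding weights

W_rec = (alpha[:, None] * enc) @ A_p @ dec.T   # n x n recurrent weights
W_ff = (alpha[:, None] * enc) @ B_p            # n x 8 feedforward weights

Note the low-rank structure: both weight matrices factor through the two-dimensional state space, which is exactly what the efficient implementation below exploits.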
5 Efficient implementation of the SNN

In this section, we describe the two distinct steps carried out when implementing the SNN: creating and running the network. The first step has no computational constraints, whereas the second must be very efficient in order to be successfully deployed in the closed-loop experimental setting.

Figure 3: Computing a 1000-neuron pool's recurrent connections. a. Using connection weights requires multiplying a 1000 × 1000 matrix by a 1000 × 1 vector. b. Operating in the lower-dimensional state space requires multiplying a 1 × 1000 vector by a 1000 × 1 vector to get the decoded state, multiplying this state by a component of the A′ matrix to update it, and multiplying the updated state by a 1000 × 1 vector to re-encode it as firing rates, which are then used to update the soma current for every neuron.

Network creation: This step generates, for a specified number of neurons composing the network, the gain αj, bias current Jj^bias, preferred direction φ̃j^x, and decoding weight φj^x for each neuron. The preferred directions φ̃j^x are drawn randomly from a uniform distribution over the unit sphere. The maximum firing rate, max G(Jj(x)), and the normalized x-axis intercept, G(Jj(x)) = 0, are drawn randomly from a uniform distribution on [200, 400] Hz and [−1, 1], respectively. From these two specifications, αj and Jj^bias are computed using Equation 2 and Equation 3. The decoding weights φj^x are computed by minimizing the mean square error (Equation 6). For efficient implementation, we used two 1D integrators (i.e., two recurrent neuron pools, with each pool representing a scalar) rather than a single 3D integrator (i.e., one recurrent neuron pool representing a 3D vector by itself) [13]. The constant 1 is fed to the 1D integrators as an input, rather than continuously integrated as part of the state vector. We also replaced the bk(t) units' spike rates (Figure 1, middle) with the 192 neural measurements (spike counts in 50 ms bins), which is equivalent to choosing φk^y from a standard basis (i.e., a unit vector with 1 at the kth position and 0 everywhere else) [7].

Network simulation: This step runs the simulation to update the soma current for every neuron, based on input spikes. The soma voltage is then updated following RC-circuit dynamics. Gaussian noise is normally added at this step, the rest of the simulation being noiseless. Neurons with soma voltage above threshold generate a spike and enter their refractory period. The neuron firing rates are decoded using the linear decoding weights to get the updated state values, the x- and y-velocity. These values are smoothed with a filter identical to h(t), but with τ set to 5 ms instead of 20 ms to avoid introducing significant delay. Then the simulation step starts over again. In order to ensure rapid execution of the simulation step, neuron interactions are not updated directly using the connection matrix (Equation 12), but rather indirectly with the decoding matrix φj^x, dynamics matrix A′, and preferred-direction matrix φ̃j^x (Equation 11). To see why this is more efficient, suppose we have 1000 neurons in the a population for each of the state vector's two scalars. Computing the recurrent connections using connection weights requires multiplying a 1000 × 1000 matrix by a 1000-dimensional vector (Figure 3a). This requires 10^6 multiplications and about 10^6 sums. Decoding each scalar (i.e., ∑i ai(t)φi^x), however, requires only 1000 multiplications and 1000 sums. The decoded state vector is then updated by multiplying it by the (diagonal) A′ matrix, another 2 products and 1 sum. The updated state vector is then encoded by multiplying it with the neurons' preferred-direction vectors, another 1000 multiplications per scalar (Figure 3b). The resulting total of about 3000 operations is nearly three orders of magnitude fewer than using the connection weights to compute the identical transformation.

Table 1: Model parameters
Symbol | Range | Description
max G(Jj(x)) | 200–400 Hz | Maximum firing rate
G(Jj(x)) = 0 | −1 to 1 | Normalized x-axis intercept
Jj^bias | satisfies the first two | Bias current
αj | satisfies the first two | Gain factor
φ̃j^x | ‖φ̃j^x‖ = 1 | Preferred-direction vector
σ² | 0.1 | Gaussian noise variance
τj^RC | 20 ms | RC time constant
τj^ref | 1 ms | Refractory period
τj^PSC | 20 ms | PSC time constant

To measure the speedup, we simulated a 2,000-neuron network on a computer running Matlab 2011a (Intel Core i7, 2.7 GHz, Mac OS X Lion). Although the exact run-times depend on the computing hardware and software, the run-time reduction factor should remain approximately constant across platforms. For each reported result, we ran the simulation 10 times to obtain a reliable estimate of the execution time. The run-time for neuron interactions using the recurrent connection weights was 9.9 ms and dropped to 2.7 µs in the lower-dimensional space, approximately a 3,500-fold speedup. Only the recurrent interactions benefit from the speedup, the execution time for the rest of the operations remaining constant. The run-time for a 50 ms network simulation using the recurrent connection weights was 0.94 s and dropped to 0.0198 s in the lower-dimensional space, a 47-fold speedup. These results demonstrate the efficiency the lower-dimensional space offers, which made the closed-loop application of SNNs possible.
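To see the arithmetic of Figure 3 in executable form, a small sketch (one scalar state, illustrative stand-in values) verifying that the factored update computes exactly the same soma-current contributions as the full weight matrix:

import numpy as np

# Sketch of the Figure 3 factorization: full-weight-matrix update (a) versus
# decode -> update -> re-encode (b).
rng = np.random.default_rng(2)
n = 1000
phi = rng.normal(size=n) / n           # decoding weights for one scalar state
enc = rng.choice([-1.0, 1.0], n)       # preferred directions (1D: +/-1)
a = rng.random(n) * 100                # current firing rates (spikes/s)
A_scalar = 0.95                        # the relevant diagonal entry of A'

# (a) full connection matrix, equation 12: ~10^6 multiply-adds per update
W = np.outer(enc * A_scalar, phi)
J_full = W @ a

# (b) lower-dimensional path, equation 11: ~3n multiply-adds per update
x_dec = phi @ a                        # decode the scalar state (1000 mults)
J_fact = enc * (A_scalar * x_dec)      # update (1 mult) and re-encode (1000)

assert np.allclose(J_full, J_fact)     # identical currents, far fewer ops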
6 Closed-loop implementation

An adult male rhesus macaque (monkey J) was trained to perform a center-out-and-back reaching task for juice rewards to one of eight targets, with a 500 ms hold time (Figure 4a) [1]. All animal protocols and procedures were approved by the Stanford Institutional Animal Care and Use Committee. Hand position was measured using a Polaris optical tracking system at 60 Hz (Northern Digital Inc.). Neural data were recorded from two 96-electrode silicon arrays (Blackrock Microsystems) implanted in the dorsal pre-motor and motor cortex. These recordings (a −4.5 × RMS threshold crossing applied to each electrode's signal) yielded tuned activity for the direction and speed of arm movements. As detailed in [1], a standard Kalman filter model was fit by correlating the observed hand kinematics with the simultaneously measured neural signals, while the monkey moved his arm to acquire virtual targets (Figure 2). The resulting model was used in a closed-loop system to control an on-screen cursor in real-time (Figure 4a, Decoder block). A steady-state version of this model serves as the standard against which the SNN implementation's performance is compared. We built an SNN using the NEF methodology based on the derived Kalman filter parameters mentioned above. This SNN was then simulated on an xPC Target (Mathworks) x86 system (Dell T3400, Intel Core 2 Duo E8600, 3.33 GHz). It ran in closed-loop, replacing the standard Kalman filter as the decoder block in Figure 4a. The parameter values listed in Table 1 were used for the SNN implementation. We ensured that the time constants τi^RC, τi^ref, and τi^PSC were smaller than the implementation's time step (50 ms). Noise was not explicitly added. It arose naturally from the fluctuations produced by representing a scalar with filtered spike trains, which has been shown to have effects similar to Gaussian noise [11]. For the purpose of computing the linear decoding weights (i.e., Γ), we modeled the resulting noise as Gaussian with a variance of 0.1. A 2,000-neuron version of the SNN-based decoder was tested in a closed-loop system, the largest network our embedded Matlab implementation could run in real-time. There were 1206 trials in total, among which 301 (center-outs only) were performed with the SNN and 302 with the standard (steady-state) Kalman filter. The block structure was randomized and interleaved, so that no behavioral bias is present in the findings. 100 trials under hand control are used as a baseline comparison. Success corresponds to a target acquisition under 1500 ms, with a 500 ms hold time. Success rates were higher than 99% on all blocks for the SNN implementation and 100% for the standard Kalman filter. The average time to acquire the target was slightly longer for the SNN (Figure 5b), 711 ms vs. 661 ms; we believe this could be improved by using more neurons in the SNN.1 The average distance to target (Figure 5a) and the average velocity of the cursor (Figure 5c) are very similar. 1 Off-line, the SNN performed better as we increased the number of neurons [7].

Figure 4: Experimental setup and results. a.
Data are recorded from two 96-channel silicon electrode arrays implanted in dorsal pre-motor and motor cortex of an adult male monkey performing a center-out-and-back reach task for juice rewards to one of eight targets with a 500 ms hold time. b. BMI position kinematics of 16 continuous trials for the standard Kalman filter implementation. c. BMI position kinematics of 16 continuous trials for the SNN implementation.

Figure 5: SNN (red) performance compared to the standard Kalman filter (blue); hand trials are shown for reference (yellow). The SNN achieves results similar to the standard Kalman filter implementation; success rates are higher than 99% on all blocks. a. Plot of distance to target vs. time after target onset for the different control modalities. The thicker traces represent the average span from when the cursor first enters the acceptance window until it successfully stays in for the 500 ms hold time. b. Histogram of target acquisition times. c. Plot of mean cursor velocity vs. time.

7 Conclusions and future work

The SNN's performance was quite comparable to that produced by a standard Kalman filter implementation. The 2,000-neuron network had success rates higher than 99% on all blocks, with mean distance to target, target acquisition time, and mean cursor velocity curves very similar to the ones obtained with the standard implementation. Future work will explore whether these results extend to additional animals. As the Kalman filter and its variants are the state-of-the-art in cortically-controlled motor prostheses [1]-[5], these simulations provide confidence that similar levels of performance can be attained with a neuromorphic system, which can potentially overcome the power constraints set by clinical applications. Our ultimate goal is to develop an ultra-low-power neuromorphic chip for prosthetic applications on to which control theory algorithms can be mapped using the NEF. As our next step in this direction, we will begin exploring this mapping with Neurogrid, a hardware platform with sixteen programmable neuromorphic chips that can simulate up to a million spiking neurons in real-time [9]. However, bandwidth limitations prevent Neurogrid from realizing random connectivity patterns. It can only connect each neuron to thousands of others if neighboring neurons share common inputs, just as they do in the cortex. Such columnar organization may be possible with NEF-generated networks if preferred-direction vectors are assigned topographically rather than randomly. Implementing this constraint effectively is a subject of ongoing research.

Acknowledgments This work was supported in part by the Belgian American Educational Foundation (J. Dethier), the Stanford NIH Medical Scientist Training Program (MSTP) and a Soros Fellowship (P. Nuyujukian), the DARPA Revolutionizing Prosthetics program (N66001-06-C-8005, K. V. Shenoy), and two NIH Director's Pioneer Awards (DP1-OD006409, K. V. Shenoy; DP1-OD000965, K. Boahen).

References
[1] V. Gilja, Towards clinically viable neural prosthetic systems, Ph.D. Thesis, Department of Computer Science, Stanford University, 2010, pp 19–22 and pp 57–73. [2] V. Gilja, P. Nuyujukian, C.A. Chestek, J.P. Cunningham, J.M. Fan, B.M. Yu, S.I. Ryu, and K.V. Shenoy, A high-performance continuous cortically-controlled prosthesis enabled by feedback control design, 2010 Neuroscience Meeting Planner, San Diego, CA: Society for Neuroscience, 2010. [3] P. Nuyujukian, V. Gilja, C.A.
Chestek, J.P. Cunningham, J.M. Fan, B.M. Yu, S.I. Ryu, and K.V. Shenoy, Generalization and robustness of a continuous cortically-controlled prosthesis enabled by feedback control design, 2010 Neuroscience Meeting Planner, San Diego, CA: Society for Neuroscience, 2010. [4] V. Gilja, C.A. Chestek, I. Diester, J.M. Henderson, K. Deisseroth, and K.V. Shenoy, Challenges and opportunities for next-generation intra-cortically based neural prostheses, IEEE Transactions on Biomedical Engineering, 2011, in press. [5] S.P. Kim, J.D. Simeral, L.R. Hochberg, J.P. Donoghue, and M.J. Black, Neural control of computer cursor velocity by decoding motor cortical spiking activity in humans with tetraplegia, Journal of Neural Engineering, vol. 5, 2008, pp 455–476. [6] S. Kim, P. Tathireddy, R.A. Normann, and F. Solzbacher, Thermal impact of an active 3-D microelectrode array implanted in the brain, IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 15, 2007, pp 493–501. [7] J. Dethier, V. Gilja, P. Nuyujukian, S.A. Elassaad, K.V. Shenoy, and K. Boahen, Spiking neural network decoder for brain-machine interfaces, IEEE Engineering in Medicine & Biology Society Conference on Neural Engineering, Cancun, Mexico, 2011, pp 396–399. [8] K. Boahen, Neuromorphic microchips, Scientific American, vol. 292(5), 2005, pp 56–63. [9] R. Silver, K. Boahen, S. Grillner, N. Kopell, and K.L. Olsen, Neurotech for neuroscience: unifying concepts, organizing principles, and emerging tools, Journal of Neuroscience, vol. 27(44), 2007, pp 11807–11819. [10] J.V. Arthur and K. Boahen, Silicon neuron design: the dynamical systems approach, IEEE Transactions on Circuits and Systems, vol. 58(5), 2011, pp 1034–1043. [11] C. Eliasmith and C.H. Anderson, Neural engineering: computation, representation, and dynamics in neurobiological systems, MIT Press, Cambridge, MA, 2003. [12] C. Eliasmith, A unified approach to building and controlling spiking attractor networks, Neural Computation, vol. 17, 2005, pp 1276–1314. [13] R. Singh and C. Eliasmith, Higher-dimensional neurons explain the tuning and dynamics of working memory cells, The Journal of Neuroscience, vol. 26(14), 2006, pp 3667–3678. [14] C. Eliasmith, How to build a brain: from function to implementation, Synthese, vol. 159(3), 2007, pp 373–388. [15] R.E. Kalman, A new approach to linear filtering and prediction problems, Transactions of the ASME–Journal of Basic Engineering, vol. 82(Series D), 1960, pp 35–45. [16] G. Welch and G. Bishop, An introduction to the Kalman filter, University of North Carolina at Chapel Hill, Chapel Hill, NC, Tech. Rep. TR 95-041, 1995, pp 1–16. [17] W.Q. Malik, W. Truccolo, E.N. Brown, and L.R. Hochberg, Efficient decoding with steady-state Kalman filter in neural interface systems, IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 19(1), 2011, pp 25–34.

5 0.61645234 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

Author: Kamiar R. Rad, Liam Paninski

Abstract: Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design.

6 0.61004859 34 nips-2011-An Unsupervised Decontamination Procedure For Improving The Reliability Of Human Judgments

7 0.58285886 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

8 0.57158357 23 nips-2011-Active dendrites: adaptation to spike-based communication

9 0.55757797 24 nips-2011-Active learning of neural response functions with Gaussian processes

10 0.53329116 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data

11 0.52817976 75 nips-2011-Dynamical segmentation of single trials from population neural data

12 0.52239406 302 nips-2011-Variational Learning for Recurrent Spiking Networks

13 0.52114481 37 nips-2011-Analytical Results for the Error in Filtering of Gaussian Processes

14 0.51558971 86 nips-2011-Empirical models of spiking in neural populations

15 0.49809223 249 nips-2011-Sequence learning with hidden units in spiking neural networks

16 0.48072001 44 nips-2011-Bayesian Spike-Triggered Covariance Analysis

17 0.47860745 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

18 0.45183671 280 nips-2011-Testing a Bayesian Measure of Representativeness Using a Large Image Database

19 0.44202662 99 nips-2011-From Stochastic Nonlinear Integrate-and-Fire to Generalized Linear Models

20 0.42658487 292 nips-2011-Two is better than one: distinct roles for familiarity and recollection in retrieving palimpsest memories


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.015), (4, 0.064), (15, 0.206), (20, 0.022), (26, 0.014), (31, 0.13), (33, 0.026), (43, 0.069), (45, 0.083), (57, 0.073), (65, 0.033), (74, 0.042), (83, 0.089), (84, 0.03), (99, 0.042)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.83629698 219 nips-2011-Predicting response time and error rates in visual search


2 0.76553673 81 nips-2011-Efficient anomaly detection using bipartite k-NN graphs

Author: Kumar Sricharan, Alfred O. Hero

Abstract: Learning minimum volume sets of an underlying nominal distribution is a very effective approach to anomaly detection. Several approaches to learning minimum volume sets have been proposed in the literature, including the K-point nearest neighbor graph (K-kNNG) algorithm based on the geometric entropy minimization (GEM) principle [4]. The K-kNNG detector, while possessing several desirable characteristics, suffers from high computational complexity, and in [4] a simpler heuristic approximation, the leave-one-out kNNG (L1O-kNNG), was proposed. In this paper, we propose a novel bipartite k-nearest neighbor graph (BP-kNNG) anomaly detection scheme for estimating minimum volume sets. Our bipartite estimator retains all the desirable theoretical properties of the K-kNNG, while being computationally simpler than the K-kNNG and the surrogate L1O-kNNG detectors. We show that BP-kNNG is asymptotically consistent in recovering the p-value of each test point. Experimental results are given that illustrate the superior performance of BP-kNNG as compared to the L1O-kNNG and other state-of-the-art anomaly detection schemes.

3 0.71701944 75 nips-2011-Dynamical segmentation of single trials from population neural data

Author: Biljana Petreska, Byron M. Yu, John P. Cunningham, Gopal Santhanam, Stephen I. Ryu, Krishna V. Shenoy, Maneesh Sahani

Abstract: Simultaneous recordings of many neurons embedded within a recurrently-connected cortical network may provide concurrent views into the dynamical processes of that network, and thus its computational function. In principle, these dynamics might be identified by purely unsupervised, statistical means. Here, we show that a Hidden Switching Linear Dynamical Systems (HSLDS) model, in which multiple linear dynamical laws approximate a nonlinear and potentially non-stationary dynamical process, is able to distinguish different dynamical regimes within single-trial motor cortical activity associated with the preparation and initiation of hand movements. The regimes are identified without reference to behavioural or experimental epochs, but nonetheless transitions between them correlate strongly with external events whose timing may vary from trial to trial. The HSLDS model also performs better than recent comparable models in predicting the firing rate of an isolated neuron based on the firing rates of others, suggesting that it captures more of the "shared variance" of the data. Thus, the method is able to trace the dynamical processes underlying the coordinated evolution of network activity in a way that appears to reflect its computational role.

4 0.70385051 135 nips-2011-Information Rates and Optimal Decoding in Large Neural Populations

Author: Kamiar R. Rad, Liam Paninski

Abstract: Many fundamental questions in theoretical neuroscience involve optimal decoding and the computation of Shannon information rates in populations of spiking neurons. In this paper, we apply methods from the asymptotic theory of statistical inference to obtain a clearer analytical understanding of these quantities. We find that for large neural populations carrying a finite total amount of information, the full spiking population response is asymptotically as informative as a single observation from a Gaussian process whose mean and covariance can be characterized explicitly in terms of network and single neuron properties. The Gaussian form of this asymptotic sufficient statistic allows us in certain cases to perform optimal Bayesian decoding by simple linear transformations, and to obtain closed-form expressions of the Shannon information carried by the network. One technical advantage of the theory is that it may be applied easily even to non-Poisson point process network models; for example, we find that under some conditions, neural populations with strong history-dependent (non-Poisson) effects carry exactly the same information as do simpler equivalent populations of non-interacting Poisson neurons with matched firing rates. We argue that our findings help to clarify some results from the recent literature on neural decoding and neuroprosthetic design.

5 0.6974991 159 nips-2011-Learning with the weighted trace-norm under arbitrary sampling distributions

Author: Rina Foygel, Ohad Shamir, Nati Srebro, Ruslan Salakhutdinov

Abstract: We provide rigorous guarantees on learning with the weighted trace-norm under arbitrary sampling distributions. We show that the standard weighted-trace norm might fail when the sampling distribution is not a product distribution (i.e. when row and column indexes are not selected independently), present a corrected variant for which we establish strong learning guarantees, and demonstrate that it works better in practice. We provide guarantees when weighting by either the true or empirical sampling distribution, and suggest that even if the true distribution is known (or is uniform), weighting by the empirical distribution may be beneficial.

6 0.68907362 133 nips-2011-Inferring spike-timing-dependent plasticity from spike train data

7 0.68862748 249 nips-2011-Sequence learning with hidden units in spiking neural networks

8 0.68844235 86 nips-2011-Empirical models of spiking in neural populations

9 0.68488127 292 nips-2011-Two is better than one: distinct roles for familiarity and recollection in retrieving palimpsest memories

10 0.68048823 37 nips-2011-Analytical Results for the Error in Filtering of Gaussian Processes

11 0.6792872 273 nips-2011-Structural equations and divisive normalization for energy-dependent component analysis

12 0.67690223 102 nips-2011-Generalised Coupled Tensor Factorisation

13 0.67670906 82 nips-2011-Efficient coding of natural images with a population of noisy Linear-Nonlinear neurons

14 0.67244744 140 nips-2011-Kernel Embeddings of Latent Tree Graphical Models

15 0.67123884 301 nips-2011-Variational Gaussian Process Dynamical Systems

16 0.67084455 183 nips-2011-Neural Reconstruction with Approximate Message Passing (NeuRAMP)

17 0.6685307 229 nips-2011-Query-Aware MCMC

18 0.66784352 31 nips-2011-An Application of Tree-Structured Expectation Propagation for Channel Decoding

19 0.66764408 180 nips-2011-Multiple Instance Filtering

20 0.66742927 57 nips-2011-Comparative Analysis of Viterbi Training and Maximum Likelihood Estimation for HMMs