nips nips2002 nips2002-79 knowledge-graph by maker-knowledge-mining

79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions

Source: pdf

Author: Maneesh Sahani, Jennifer F. Linden

Abstract: An essential step in understanding the function of sensory nervous systems is to characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. One increasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses of one or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal ﬁring. The estimation techniques usually employed, such as Wiener ﬁltering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters that control the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 An essential step in understanding the function of sensory nervous systems is to characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. [sent-10, score-0.14]

2 One increasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses of one or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal ﬁring. [sent-11, score-0.484]

3 The estimation techniques usually employed, such as Wiener ﬁltering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. [sent-12, score-0.27]

4 We show that by learning hyperparameters that control the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input. [sent-14, score-0.223]

5 1 Introduction A common experimental approach to the measurement of the stimulus-response function (SRF) of sensory neurons, particularly in the visual and auditory modalities, is “reverse correlation” and its related non-linear extensions [1]. [sent-15, score-0.124]

6 The neural response to a continuous, rapidly varying stimulus is measured and used in an attempt to reconstruct the functional mapping from stimulus to response. [sent-16, score-0.29]

7 Such linear ﬁlter estimates are often called STRFs for spatio-temporal (in the visual case) or spectro-temporal (in the auditory case) receptive ﬁelds. [sent-18, score-0.171]

8 In general, the SRF may also be parameterized on the basis of known or guessed non-linear properties of the neurons, or may be expanded in terms of the Volterra or Wiener integral power series. [sent-19, score-0.218]

9 In practice, the stimulus is often a discrete-time process. [sent-21, score-0.222]

10 In the auditory experiments that will be considered below, it is set by the rate of the component tone pulses in a random ! [sent-23, score-0.243]

11 On time-scales ﬁner than that set by this discretization rate, the stimulus is strongly autocorrelated. [sent-25, score-0.252]

12 To estimate the first-order kernel up to a given maximum time lag, we construct a set of input lag-vectors. [sent-30, score-0.09]

13 If a single stimulus frame is itself a vector (representing, say, pixels in an image or power in different frequency bands) then the lag vectors are formed by concatenating stimulus frames together into longer vectors. [sent-31, score-0.693]

14 The Wiener filter is then obtained by least-squares linear regression from the lag vectors to the corresponding observed activities. [sent-32, score-0.471]
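The lag-vector construction and least-squares fit described in sentences 12 to 14 can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `build_lag_matrix` and `wiener_filter` are hypothetical, and zero-padding the frames before the start of the stimulus is an assumption made here for convenience.

```python
import numpy as np

def build_lag_matrix(stimulus, n_lags):
    """Concatenate the current and the previous n_lags - 1 frames into lag vectors.

    stimulus: array of shape (T, K), one K-dimensional frame per time bin.
    Returns an array of shape (T, n_lags * K); row t holds frames
    t, t-1, ..., t-n_lags+1. Frames before t = 0 are zero (an assumption).
    """
    T, K = stimulus.shape
    padded = np.vstack([np.zeros((n_lags - 1, K)), stimulus])
    blocks = [padded[n_lags - 1 - lag : n_lags - 1 - lag + T] for lag in range(n_lags)]
    return np.hstack(blocks)

def wiener_filter(stimulus, response, n_lags):
    """Least-squares regression from lag vectors to responses.

    Returns the weights arranged as an (n_lags, K) lag-by-channel matrix.
    """
    X = build_lag_matrix(stimulus, n_lags)
    w, *_ = np.linalg.lstsq(X, response, rcond=None)
    return w.reshape(n_lags, stimulus.shape[1])
```

Reshaping the weight vector into a lag-by-channel matrix gives the time-frequency (or time-space) layout in which STRFs are usually displayed.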

15 Higher-order kernels can also be found by linear regression, using augmented versions of the stimulus lag vectors. [sent-34, score-0.325]

16 For example, the second-order kernel is obtained by regression using input vectors formed by all quadratic combinations of the elements of the lag vectors (or, equivalently, by support-vector-like kernel regression using a homogeneous second-order polynomial kernel). [sent-35, score-0.326]
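As a small illustration of the augmented inputs used for the second-order kernel, the sketch below forms all distinct quadratic combinations of the elements of each input vector. The helper name `quadratic_features` is hypothetical, and keeping only the products with i <= j (rather than all ordered pairs) is a choice made here to avoid duplicate columns.

```python
import numpy as np

def quadratic_features(X):
    """Return all distinct pairwise products x_i * x_j (i <= j) for each row of X.

    X: array of shape (T, D), one input (lag) vector per row.
    Output shape: (T, D * (D + 1) // 2).
    """
    i, j = np.triu_indices(X.shape[1])
    return X[:, i] * X[:, j]
```

Ordinary linear regression on these features then estimates a second-order kernel, at the cost of a number of parameters that grows quadratically with the lag-vector length.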

17 It should be clear, however, that the basic techniques can be extended to higher orders at the expense of additional computational load, provided only that a sensible deﬁnition of smoothness in these higher-order kernels is available. [sent-37, score-0.161]

18 The least-squares solution to a regression problem is identical to the maximum likelihood (ML) value of the weight vector for the probabilistic regression model with Gaussian output noise of constant variance. [sent-38, score-0.326]

19 As is common with ML learning, weight vectors obtained in this way are often overfit to the training data, and so give poor estimates of the true underlying stimulus-response function. [sent-40, score-0.158]

20 If the stimulus is uncorrelated, the ML-estimated weight along some input dimension is proportional to the observed correlation between that dimension of the stimulus and the output response. [sent-42, score-0.517]

21 Furthermore, if the true relationship between stimulus and response is non-linear, limited sampling of the input space may also lead to observed correlations that would have been absent given unlimited data. [sent-44, score-0.29]

22 Many of these approaches are equivalent to the maximum a posteriori (MAP) estimation of parameters under a suitable prior distribution. [sent-46, score-0.131]

23 Here, we investigate an approach in which these prior distributions are optimized with reference to the data; as such, they cease to be “prior” in a strict sense, and instead become part of a hierarchical probabilistic model. [sent-47, score-0.127]

24 A distribution on the regression parameters is ﬁrst speciﬁed up to the unknown values of some hyperparameters. [sent-48, score-0.107]

25 Finally, the estimate of the parameters is given by the MAP weight vector under the optimized “prior”. [sent-50, score-0.109]

26 Such evidence optimization schemes have previously been used in the context of linear, kernel and Gaussian-process regression. [sent-51, score-0.211]

27 We show that, with realistic data volumes, such techniques provide considerably better estimates of the stimulus-response function than do the unregularized (ML) Wiener estimates. [sent-52, score-0.121]

28 We assessed the generalization ability of parameters chosen by maximum likelihood and by various evidence optimization schemes on a set of responses collected from the auditory cortex of rodents. [sent-54, score-0.414]

29 As will be seen, evidence optimization yielded estimates that generalized far better than those obtained by the more elementary ML techniques, and so provided a more accurate picture of the underlying stimulus-response function. [sent-55, score-0.259]

30 Recordings often reﬂected the activity of a number of neurons; single neurons were identiﬁed by Bayesian spike-sorting techniques [2, 3] whenever possible. [sent-57, score-0.093]

31 The stimulus consisted of 20 ms tone pulses (ramped up and down with a 5 ms cosine gate) presented at random center frequencies, maximal intensities, and times, such that pulses at more than one frequency might be played simultaneously. [sent-58, score-0.743]

32 This stimulus resembled that used in a previous study [4], except in the variation of pulse intensity. [sent-59, score-0.28]

33 The times, frequencies and sound intensities of all tone pulses were chosen independently within the discretizations of those variables (20 ms bins in time, 1/12 octave bins covering either 2–32 or 25–100 kHz in frequency, and 5 dB SPL bins covering 25–70 dB SPL in level). [sent-60, score-0.626]

34 At any time point, the stimulus averaged two tone pulses per octave, with an expected loudness of approximately 73 dB SPL for the 2–32 kHz stimulus and 70 dB SPL for the 25–100 kHz stimulus. [sent-61, score-0.594]

35 Each pulse was ramped up and down with a 5 ms cosine gate. [sent-62, score-0.239]

36 At each recording site, the 2–32 kHz stimulus was repeated for 20 trials, and the 25–100 kHz stimulus for 10 trials. [sent-64, score-0.55]

37 Neural responses from all 10 or 20 trials were histogrammed in 20 ms bins aligned with stimulus pulse durations. [sent-65, score-0.528]

38 Thus, in the regression framework, the instantaneous input vector comprised the sound amplitudes at each possible frequency at a given time, and the output was the number of spikes per trial collected into the corresponding time bin. [sent-66, score-0.252]

39 The repetition of the same stimulus made it possible to partition the recorded response power into a stimulus-related (signal) component and a noise component. [sent-67, score-0.515]
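One way such a partition can be computed is sketched below. The estimator is an assumption introduced here, not a formula recovered from this excerpt: it follows from modelling each trial as a fixed signal plus independent zero-mean noise, so that the power of the trial-averaged response is the signal power plus 1/N of the noise power. The name `signal_and_noise_power` is hypothetical, and "power" is taken to mean variance over time bins.

```python
import numpy as np

def signal_and_noise_power(trials):
    """Estimate stimulus-related (signal) and noise power from repeated trials.

    trials: array of shape (N, T), binned responses to N repeats of one stimulus.
    Assumes response = signal + independent zero-mean noise on every trial.
    """
    n_trials = trials.shape[0]
    power_of_mean = np.var(trials.mean(axis=0))    # signal + noise / N
    mean_of_power = np.var(trials, axis=1).mean()  # signal + noise
    signal = (n_trials * power_of_mean - mean_of_power) / (n_trials - 1)
    noise = mean_of_power - signal
    return signal, noise
```

With identical trials the estimated noise power is exactly zero, and with very noisy data the signal estimate can come out negative, which is why a significance criterion on the signal power is needed before a recording is used.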

40 Only those 92 recordings in which the signal power was significantly greater than zero were used in this study. [sent-70, score-0.365]

41 The total duration of the stimulus was divided 10 times into a training data segment (9/10 of the total) and a test data segment (1/10), such that all 10 test segments were disjoint. [sent-72, score-0.222]

42 Performance was assessed by the predictive power, that is the test data variance minus average squared prediction error. [sent-73, score-0.204]

43 The 10 estimates of the predictive power were averaged, and normalized by the estimated signal power to give a number less than 1. [sent-74, score-0.828]

44 Note that the predictive power could be negative in cases where the mean was a better description of the test data than was the model prediction. [sent-75, score-0.342]
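The predictive-power measure in sentences 42 to 44 can be written down directly. The sketch below is illustrative only; the function names are hypothetical and the signal power is assumed to have been estimated separately.

```python
import numpy as np

def predictive_power(y_test, y_pred):
    """Test-data variance minus the average squared prediction error."""
    return np.var(y_test) - np.mean((y_test - y_pred) ** 2)

def normalized_predictive_power(fold_powers, signal_power):
    """Average the per-fold predictive powers and normalize by the signal power."""
    return np.mean(fold_powers) / signal_power
```

A perfect prediction returns the full test variance, predicting the test mean returns zero, and anything worse than the mean returns a negative value, which matches the remark in sentence 44.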

45 In graphs of the predictive power as a function of noise level, the estimate of the noise power is also shown after normalization by the estimated signal power. [sent-76, score-0.797]

46 3 Evidence optimization for linear regression As is common in regression problems, it is convenient to collect all the stimulus vectors and observed responses into matrices. [sent-77, score-0.619]

47 Thus, we describe the input by a matrix whose columns are the input lag-vectors, one per time bin. [sent-78, score-0.121]

48 Similarly, we collect the outputs into a row vector with one element per time bin. [sent-79, score-0.098]

49 We now choose the prior distribution on the weights to be normal with zero mean (having no prior reason to favour either positive or negative weights) and a covariance matrix whose form is set by hyperparameters. [sent-84, score-0.322]

50 We seek to optimize this evidence with respect to the hyperparameters in the prior covariance matrix, and the noise variance. [sent-97, score-0.251]

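51 If the covariance matrix $C$ contains a parameter $\theta$, then the derivative of the log-evidence $\mathcal{E}$ with respect to $\theta$ is given by $\frac{\partial \mathcal{E}}{\partial \theta} = \frac{1}{2} \mathrm{Tr}\left[ \left( C - \Sigma - \mu \mu^{\mathsf{T}} \right) \frac{\partial C^{-1}}{\partial \theta} \right]$, where $\mu$ and $\Sigma$ are the posterior mean and covariance of the weights. [sent-99, score-0.088]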
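For a zero-mean Gaussian prior and Gaussian output noise, the posterior, the evidence and its gradient all have closed forms. The sketch below is an illustration under assumed notation (inputs as the columns of `X`, prior covariance `C`, noise variance `noise_var`); none of the function names come from the source. The gradient uses the standard identity 0.5 * Tr[(C - Sigma - mu mu^T) dC^{-1}/dtheta], where mu and Sigma are the posterior mean and covariance.

```python
import numpy as np

def posterior(X, y, C, noise_var):
    """Posterior mean and covariance of the weights.

    X: (D, T) matrix with one input vector per column; y: (T,) responses;
    C: (D, D) prior covariance; noise_var: output noise variance.
    """
    Sigma = np.linalg.inv(X @ X.T / noise_var + np.linalg.inv(C))
    mu = Sigma @ X @ y / noise_var
    return mu, Sigma

def log_evidence(X, y, C, noise_var):
    """log p(y | X, C, noise_var), using y ~ N(0, noise_var * I + X^T C X)."""
    T = y.size
    K = noise_var * np.eye(T) + X.T @ C @ X
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(K, y))

def evidence_gradient(C, mu, Sigma, dCinv_dtheta):
    """Derivative of the log-evidence with respect to a parameter of C."""
    return 0.5 * np.trace((C - Sigma - np.outer(mu, mu)) @ dCinv_dtheta)
```

Comparing `evidence_gradient` with a finite difference of `log_evidence` is a cheap sanity check before handing the gradient to an optimizer.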

52 4 Automatic relevance determination (ARD) The most common evidence optimization scheme for regression is known as automatic relevance determination (ARD). [sent-105, score-0.351]

53 The prior covariance on the weights is taken to be diagonal. [sent-107, score-0.197]

54 That is, the weights are taken to be independent with potentially different prior precisions. [sent-108, score-0.179]

55 A pronounced general feature of the maxima discovered by this approach is that many of the optimal precisions are inﬁnite (that is, the variances are zero). [sent-112, score-0.094]

56 Since the prior distribution is centered on zero, this forces the corresponding weight to vanish. [sent-113, score-0.133]

57 In practice, as the iterated value of a precision crosses some pre-determined threshold, the corresponding input dimension is eliminated from the regression problem. [sent-114, score-0.138]

58 The results of evidence optimization suggest that such inputs are irrelevant to predicting the output; hence the name given to this technique. [sent-115, score-0.152]
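A rough sketch of an ARD loop with pruning is given below. The re-estimation rules used here (precision = gamma / mu^2 with gamma = 1 - precision * Sigma_ii, and a matching noise update) are the fixed-point updates commonly associated with MacKay's evidence framework; they are assumed for illustration, since the update equations themselves do not survive in this excerpt. The function name, the pruning threshold and the iteration count are all hypothetical choices.

```python
import numpy as np

def ard_regression(X, y, n_iter=100, prune_at=1e6):
    """ARD for linear regression with pruning of irrelevant inputs.

    X: (D, T) matrix with one input vector per column; y: (T,) responses.
    Returns the MAP weights (zeros for pruned inputs), the prior
    precisions and the estimated noise variance.
    """
    D, T = X.shape
    alpha = np.ones(D)          # one prior precision per weight
    noise_var = np.var(y)
    active = np.arange(D)       # input dimensions still in the model
    for _ in range(n_iter):
        if active.size == 0:
            break
        Xa = X[active]
        Sigma = np.linalg.inv(Xa @ Xa.T / noise_var + np.diag(alpha[active]))
        mu = Sigma @ Xa @ y / noise_var
        gamma = 1.0 - alpha[active] * np.diag(Sigma)   # how well data determine each weight
        alpha[active] = gamma / (mu ** 2 + 1e-12)
        resid = y - mu @ Xa
        noise_var = resid @ resid / max(T - gamma.sum(), 1e-12)
        active = active[alpha[active] < prune_at]      # drop inputs whose precision diverged
    w = np.zeros(D)
    if active.size:
        Xa = X[active]
        Sigma = np.linalg.inv(Xa @ Xa.T / noise_var + np.diag(alpha[active]))
        w[active] = Sigma @ Xa @ y / noise_var
    return w, alpha, noise_var
```

Not every irrelevant input is necessarily removed: an input whose chance correlation with the output is large relative to the noise can keep a finite precision and a small non-zero weight.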

59 The resulting MAP estimates obtained under the optimized ARD prior thus tend to be sparse, with only a small number of non-zero weights often appearing as isolated spots in the STRF. [sent-116, score-0.278]

60 The estimated STRFs for one example recording using ML and ARD are shown in the two left-most panels of ﬁgure 1 (the other panels show smoothed estimates which will be described below), with the estimated weight vectors rearranged into time-frequency matrices. [sent-117, score-0.528]

61 The sparsity of the ARD solution is evident in the reduction of apparent estimation noise at higher frequencies and longer time lags. [sent-118, score-0.179]

62 Assessed by cross-validation, as described above, the ARD estimate accurately predicted 26% of the signal power in test data, whereas the ML estimate (or Wiener kernel) predicted only 12%. [sent-120, score-0.333]

64 Figure 2: Comparison of ARD and ML predictions. Axes: normalized ML predictive power, normalized ARD predictive power, normalized noise power, normalized prediction difference (ARD − ML), number of recordings. [sent-136, score-0.122]

65 This improvement in predictive quality was evident in every one of the 92 recordings with significant signal power, indicating that the optimized prior does improve estimation accuracy. [sent-137, score-1.151]

66 The left-most panel of ﬁgure 2 compares the normalized cross-validation predictive power of the two STRF estimates. [sent-138, score-0.504]

67 The other two panels show the difference in predictive powers as function of noise (in the center) and as a histogram (right). [sent-139, score-0.317]

68 The advantage of the evidence-optimization approach is clearly most pronounced at higher noise levels. [sent-140, score-0.122]

69 5 Automatic smoothness determination (ASD) In many regression problems, such as those for which ARD was developed, the different input dimensions are often unrelated; indeed they may be measured in different units. [sent-141, score-0.265]

70 In such contexts, an independent prior on the weights, as in ARD, is reasonable. [sent-142, score-0.091]

71 Furthermore, we might expect weights that are nearby in either time or frequency (or space) to be similar in value; that is, the STRF is likely to be smooth on the scale at which we model it. [sent-144, score-0.098]

72 Here we introduce a new evidence optimization scheme, in which the prior covariance matrix is used to favour smoothing of the STRF weights. [sent-145, score-0.428]

73 Instead, we introduce two hyperparameters that set the scale of smoothness in the spectral (or spatial) and temporal dimensions respectively, and then, for each recording, optimize the evidence to determine their appropriate values. [sent-147, score-0.262]

74 The (i, j)th element of each of these gives the squared distance between the i-th and j-th weights in terms of center frequency (or space) and time respectively. [sent-149, score-0.133]

75 In this scheme, the two smoothness hyperparameters set the correlation distances for the weights along the spectral (spatial) and temporal dimensions, while an additional hyperparameter sets their overall scale. [sent-151, score-0.158]
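A smoothness prior of this kind can be sketched as a covariance matrix that decays with squared distance in frequency and in time. The squared-exponential form below, with a log-scale parameter `rho`, is an assumption made for illustration; the exact parameterization used in the source is not recoverable from this excerpt, and all names are hypothetical.

```python
import numpy as np

def asd_covariance(freq_idx, lag_idx, delta_f, delta_t, rho):
    """Smoothness prior covariance over STRF weights (assumed functional form).

    freq_idx, lag_idx: (D,) arrays with the frequency-bin and time-lag index
    of each weight. delta_f and delta_t are correlation distances in bins;
    rho sets the overall scale, so the prior variance of a weight is exp(-rho).
    """
    Df = (freq_idx[:, None] - freq_idx[None, :]) ** 2   # squared frequency distances
    Dt = (lag_idx[:, None] - lag_idx[None, :]) ** 2     # squared time distances
    return np.exp(-rho - 0.5 * Df / delta_f ** 2 - 0.5 * Dt / delta_t ** 2)
```

Because each hyperparameter enters smoothly, the derivative of the inverse covariance with respect to each of them can be passed to a gradient-based evidence optimizer.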

76 The third panel of figure 1 shows the ASD-optimized MAP estimate of the STRF for the same example recording discussed previously. [sent-158, score-0.175]

77 57 (1/12 octave) bins in frequency; the effect of this smoothing of the STRF estimate is evident. [sent-161, score-0.157]

78 In the population of 92 recordings (ﬁgure 3, upper panels) MAP estimates based on the ASD-optimized prior again outperformed ML (Wiener kernel) estimates substantially on every single recording considered, particularly on those with poorer signal-to-noise ratios. [sent-164, score-0.475]

79 They also tended to predict more accurately than the ARD-based estimates (ﬁgure 3, lower panels). [sent-165, score-0.106]

80 6 ARD in an ASD-deﬁned basis The two evidence optimization frameworks presented above appear inconsistent. [sent-167, score-0.223]

81 ARD yields a sparse, independent prior, and often leads to isolated non-zero weights in the estimated STRF. [sent-168, score-0.114]

82 Figure 3 (caption not recovered): upper panels compare ASD with ML, lower panels compare ASD with ARD. Axes: normalized ML predictive power, normalized ARD predictive power, normalized ASD predictive power, normalized noise power, normalized predictive power difference (ML − ASD and ARD − ASD), number of recordings. [sent-178, score-1.434]

86 Nonetheless, both frameworks appear to improve the ability of estimated models to generalize to novel data. [sent-191, score-0.107]

87 We choose the transformation matrix to be the (positive branch) matrix square root of the optimal prior matrix (see (10)) obtained from ASD. [sent-195, score-0.147]

88 If now we introduce and optimize a diagonal prior covariance of the ARD form in this transformed problem, we are indirectly optimizing a covariance matrix of the form in the original basis. [sent-199, score-0.27]

89 Intuitively, the sparseness driven by ARD is applied to basis vectors drawn from the rows of the transformation matrix , rather than to individual weights. [sent-200, score-0.125]

90 If this basis reﬂects the smoothness prior obtained from ASD then the resulting prior will combine the smoothness and sparseness of two approaches. [sent-201, score-0.403]
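The change of basis can be illustrated as follows, assuming a symmetric positive semi-definite ASD covariance `C` and inputs stored as the columns of `X`. The helper names are hypothetical. With `R` the symmetric square root of `C`, regressing on `R @ X` with a diagonal prior of precisions `alpha` corresponds to a prior covariance `R.T @ diag(1 / alpha) @ R` on the original weights.

```python
import numpy as np

def sym_sqrt(C):
    """Positive-branch symmetric square root of a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(C)
    vals = np.clip(vals, 0.0, None)        # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def to_asd_basis(X, C):
    """Transform the inputs so that a unit prior in the new basis matches C."""
    R = sym_sqrt(C)
    return R @ X, R

def from_asd_basis(w_transformed, R):
    """Map weights estimated in the transformed problem back to the original basis."""
    return R.T @ w_transformed

def implied_covariance(R, alpha):
    """Prior covariance on the original weights implied by precisions alpha."""
    return R.T @ np.diag(1.0 / alpha) @ R
```

With all precisions equal to one the implied covariance is the ASD covariance itself, so the combined scheme starts from the smooth prior and lets ARD switch off whole smooth components.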

91 By decomposing the prior covariance, it is possible to rewrite the joint density of (3). [sent-202, score-0.151]

92 Figure 4: Comparison of ARD in the ASD basis and simple ASD. Axes: normalized ASD/RD predictive power, normalized prediction difference (ASD/RD − ASD), number of recordings. [sent-219, score-0.59]

93 [...] in this basis will be formed by a superposition of Gaussian components, each of which individually matches the ASD prior on its covariance. [sent-221, score-0.309]

94 The results of this procedure (labelled ASD/RD) on our example recording are shown in the rightmost panel of ﬁgure 1. [sent-222, score-0.144]

95 The combined prior shows a similar degree of smoothing to the ASD-optimized prior alone; in addition, like the ARD prior, it suppresses the apparent background estimation noise at higher frequencies and longer time lags. [sent-223, score-0.378]

96 This improvement over estimates derived from ASD alone is borne out in the whole population (ﬁgure 4), although the gain is smaller than in the previous cases. [sent-225, score-0.109]

97 7 Conclusions We have demonstrated a succession of evidence-optimization techniques which appear to improve the accuracy of STRF estimates from noisy data. [sent-226, score-0.121]

98 The mean improvement in prediction of the ASD/RD method over the Wiener kernel is 40% of the stimulus-related signal power. [sent-227, score-0.147]

99 Considering that the best linear predictor would on average capture no more than 40% of the signal power in these data even in the absence of noise (Sahani and Linden, “How Linear are Auditory Cortical Responses? [sent-228, score-0.313]

100 These results apply to the case of linear models; our current work is directed toward extensions to non-linear SRFs within an augmented linear regression framework. [sent-230, score-0.107]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('ard', 0.457), ('asd', 0.419), ('stimulus', 0.222), ('strf', 0.187), ('power', 0.186), ('ml', 0.176), ('wiener', 0.16), ('predictive', 0.156), ('trqi', 0.131), ('normalized', 0.124), ('recordings', 0.122), ('srf', 0.114), ('regression', 0.107), ('recording', 0.106), ('ms', 0.104), ('evidence', 0.102), ('auditory', 0.093), ('panels', 0.091), ('prior', 0.091), ('khz', 0.089), ('spl', 0.083), ('linden', 0.083), ('bins', 0.081), ('smoothness', 0.081), ('pulses', 0.08), ('hyperparameters', 0.079), ('estimates', 0.078), ('sahani', 0.077), ('noise', 0.07), ('tone', 0.07), ('lag', 0.066), ('responses', 0.063), ('covariance', 0.06), ('kernel', 0.059), ('pulse', 0.058), ('gure', 0.058), ('signal', 0.057), ('octave', 0.055), ('favour', 0.052), ('marmarelis', 0.052), ('volterra', 0.052), ('pronounced', 0.052), ('frequency', 0.052), ('neurons', 0.05), ('optimization', 0.05), ('assessed', 0.048), ('mackay', 0.048), ('weights', 0.046), ('determination', 0.046), ('fir', 0.046), ('ramped', 0.046), ('rrr', 0.046), ('smoothing', 0.045), ('tr', 0.043), ('techniques', 0.043), ('weight', 0.042), ('chord', 0.042), ('precisions', 0.042), ('strfs', 0.042), ('frequencies', 0.041), ('estimated', 0.041), ('estimation', 0.04), ('maneesh', 0.039), ('frameworks', 0.039), ('xw', 0.039), ('vectors', 0.038), ('panel', 0.038), ('response', 0.037), ('kernels', 0.037), ('ner', 0.037), ('optimized', 0.036), ('element', 0.035), ('db', 0.034), ('hyperparameter', 0.033), ('substitution', 0.033), ('intensities', 0.033), ('lter', 0.033), ('collect', 0.032), ('formed', 0.032), ('basis', 0.032), ('sensory', 0.031), ('rapidly', 0.031), ('estimate', 0.031), ('improvement', 0.031), ('cosine', 0.031), ('tipping', 0.031), ('input', 0.031), ('transformed', 0.031), ('th', 0.031), ('collected', 0.031), ('discretization', 0.03), ('width', 0.029), ('yielded', 0.029), ('evident', 0.028), ('fourier', 0.028), ('accurately', 0.028), ('matrix', 0.028), ('ability', 0.027), ('sparseness', 0.027), ('isolated', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999982 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions

Author: Maneesh Sahani, Jennifer F. Linden

2 0.50041664 103 nips-2002-How Linear are Auditory Cortical Responses?

Author: Maneesh Sahani, Jennifer F. Linden

Abstract: By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive ﬁeld (STRF), which amounts to a model of the stimulusresponse function (SRF) that is linear in the spectrogram of the stimulus. How well can such a model account for neural responses at the very ﬁrst stages of auditory cortical processing? To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses.

3 0.26482272 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex

Author: Christian K. Machens, Michael Wehr, Anthony M. Zador

Abstract: How do cortical neurons represent the acoustic environment? This question is often addressed by probing with simple stimuli such as clicks or tone pips. Such stimuli have the advantage of yielding easily interpreted answers, but have the disadvantage that they may fail to uncover complex or higher-order neuronal response properties. Here we adopt an alternative approach, probing neuronal responses with complex acoustic stimuli, including animal vocalizations and music. We have used in vivo whole cell methods in the rat auditory cortex to record subthreshold membrane potential ﬂuctuations elicited by these stimuli. Whole cell recording reveals the total synaptic input to a neuron from all the other neurons in the circuit, instead of just its output—a sparse binary spike train—as in conventional single unit physiological recordings. Whole cell recording thus provides a much richer source of information about the neuron’s response. Many neurons responded robustly and reliably to the complex stimuli in our ensemble. Here we analyze the linear component—the spectrotemporal receptive ﬁeld (STRF)—of the transformation from the sound (as represented by its time-varying spectrogram) to the neuron’s membrane potential. We ﬁnd that the STRF has a rich dynamical structure, including excitatory regions positioned in general accord with the prediction of the simple tuning curve. We also ﬁnd that in many cases, much of the neuron’s response, although deterministically related to the stimulus, cannot be predicted by the linear component, indicating the presence of as-yet-uncharacterized nonlinear response properties.

4 0.13549541 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise

Author: Alon Fishbach, Bradford J. May

Abstract: Psychophysical data suggest that temporal modulations of stimulus amplitude envelopes play a prominent role in the perceptual segregation of concurrent sounds. In particular, the detection of an unmodulated signal can be significantly improved by adding amplitude modulation to the spectral envelope of a competing masking noise. This perceptual phenomenon is known as “Comodulation Masking Release” (CMR). Despite the obvious influence of temporal structure on the perception of complex auditory scenes, the physiological mechanisms that contribute to CMR and auditory streaming are not well known. A recent physiological study by Nelken and colleagues has demonstrated an enhanced cortical representation of auditory signals in modulated noise. Our study evaluates these CMR-like response patterns from the perspective of a hypothetical auditory edge-detection neuron. It is shown that this simple neural model for the detection of amplitude transients can reproduce not only the physiological data of Nelken et al., but also, in light of previous results, a variety of physiological and psychoacoustical phenomena that are related to the perceptual segregation of concurrent sounds. 1 Introduction The temporal structure of a complex sound exerts strong influences on auditory physiology (e.g. [10, 16]) and perception (e.g. [9, 19, 20]). In particular, studies of auditory scene analysis have demonstrated the importance of the temporal structure of amplitude envelopes in the perceptual segregation of concurrent sounds [2, 7]. Common amplitude transitions across frequency serve as salient cues for grouping sound energy into unified perceptual objects. Conversely, asynchronous amplitude transitions enhance the separation of competing acoustic events [3, 4]. These general principles are manifested in perceptual phenomena as diverse as comodulation masking release (CMR) [13], modulation detection interference [22] and synchronous onset grouping [8].
Despite the obvious importance of timing information in psychoacoustic studies of auditory masking, the way in which the CNS represents the temporal structure of an amplitude envelope is not well understood. Certainly many physiological studies have demonstrated neural sensitivities to envelope transitions, but this sensitivity is only beginning to be related to the variety of perceptual experiences that are evoked by signals in noise. Nelken et al. [15] have suggested a correspondence between neural responses to time-varying amplitude envelopes and psychoacoustic masking phenomena. In their study of neurons in primary auditory cortex (A1), adding temporal modulation to background noise lowered the detection thresholds of unmodulated tones. This enhanced signal detection is similar to the perceptual phenomenon that is known as comodulation masking release [13]. Fishbach et al. [11] have recently proposed a neural model for the detection of “auditory edges” (i.e., amplitude transients) that can account for numerous physiological [14, 17, 18] and psychoacoustical [3, 21] phenomena. The encompassing utility of this edge-detection model suggests a common mechanism that may link the auditory processing and perception of auditory signals in a complex auditory scene. Here, it is shown that the auditory edge detection model can accurately reproduce the cortical CMR-like responses previously described by Nelken and colleagues. 2 The Model The model is described in detail elsewhere [11]. In short, the basic operation of the model is the calculation of the first-order time derivative of the log-compressed envelope of the stimulus. A computational model [23] is used to convert the acoustic waveform to a physiologically plausible auditory nerve representation (Fig 1a). The simulated neural response has a medium spontaneous rate and a characteristic frequency that is set to the frequency of the target tone.
To allow computation of the time derivative of the stimulus envelope, we hypothesize the existence of a temporal delay dimension, along which the stimulus is progressively delayed. The intermediate delay layer (Fig 1b) is constructed from an array of neurons with ascending membrane time constants (τ); each neuron is modeled by a conventional integrate-and-fire model (I&F, [12]). A higher membrane time constant induces a greater delay in the neuron’s response [1]. The output of the delay layer converges to a single output neuron (Fig. 1c) via a set of connections with various efficacies that reflect a receptive field of a gaussian derivative. This combination of excitatory and inhibitory connections carries out the time-derivative computation. Implementation details and parameters are given in [11]. The model has 2 adjustable and 6 fixed parameters; the former were used to fit the responses of the model to single-unit responses to a variety of stimuli [11]. The results reported here are not sensitive to these parameters. [Figure 1 panels: (a) AN model; (b) delay layer; (c) edge-detector neuron.] Figure 1: Schematic diagram of the model and a block diagram of the basic operation of each model component (shaded area). The stimulus is converted to a neural representation (a) that approximates the average firing rate of a medium spontaneous-rate AN fiber [23]. The operation of this stage can be roughly described as the log-compressed rms output of a bandpass filter. The neural representation is fed to a series of neurons with ascending membrane time constant (b). The kernel functions that are used to simulate these neurons are plotted for a few neurons along with the time constants used. The outputs of the delay-layer neurons converge to a single I&F neuron (c) using a set of connections with weights that reflect the shape of a gaussian derivative. Solid arrows represent excitatory connections and white arrows represent inhibitory connections.
The absolute efficacy is represented by the width of the arrows. 3 Results Nelken et al. [15] report that amplitude modulation can substantially modify the noise-driven discharge rates of A1 neurons in Halothane-anesthetized cats. Many cortical neurons show only a transient onset response to unmodulated noise but fire in synchrony (“lock”) to the envelope of modulated noise. A significant reduction in envelope-locked discharge rates is observed if an unmodulated tone is added to modulated noise. As summarized in Fig. 2, this suppression of envelope locking can reveal the presence of an auditory signal at sound pressure levels that are not detectable in unmodulated noise. It has been suggested that this pattern of neural responding may represent a physiological equivalent of CMR. Reproduction of CMR-like cortical activity can be illustrated by a simplified case in which the analytical amplitude envelope of the stimulus is used as the input to the edge-detector model. In keeping with the actual physiological approach of Nelken et al., the noise envelope is shaped by a trapezoid modulator for these simulations. Each cycle of modulation, $E_N(t)$, is given by: $$E_N(t) = \begin{cases} \frac{P}{D}\, t & 0 \le t < D \\ P & D \le t < 3D \\ P - \frac{P}{D}(t - 3D) & 3D \le t < 4D \\ 0 & 4D \le t < 8D \end{cases}$$ where P is the peak pressure level and D is set to 12.5 ms. [Figure 2 panels: (a) unmodulated noise; (b) modulated noise. Axes: tone level (dB SPL) against time (ms), with responses in spikes/sec.] Figure 2: Responses of an A1 unit to a combination of noise and tone at many tone levels, replotted from Nelken et al. [15]. (a) Unmodulated noise and (b) modulated noise. The noise envelope is illustrated by the thick line above each figure. Each row shows the response of the neuron to the noise plus the tone at the level specified on the ordinate. The dashed line in (b) indicates the detection threshold level for the tone. The detection threshold (as defined and calculated by Nelken et al.) in the unmodulated noise was not reached.
Since the basic operation of the model is the calculation of the rectified time-derivative of the log-compressed envelope of the stimulus, the expected noise-driven rate of the model can be approximated by:

$$M_N(t) = \max\left(0,\ \frac{d}{dt}\, A \ln\left(1 + \frac{E_N(t)}{P_0}\right)\right)$$

where $A = 20/\ln(10)$ and $P_0 = 2 \times 10^{-5}$ Pa. The expected firing rate in response to the noise plus an unmodulated signal (tone) can be similarly approximated by:

$$M_{N+S}(t) = \max\left(0,\ \frac{d}{dt}\, A \ln\left(1 + \frac{E_N(t) + P_S}{P_0}\right)\right)$$

where $P_S$ is the peak pressure level of the tone. Both $M_N(t)$ and $M_{N+S}(t)$ are identically zero outside the interval $[0, D)$, because the envelope is constant or falling there. Within this interval it holds that:

$$M_N(t) = \frac{A P / D}{P_0 + \frac{P}{D}\,t}, \qquad M_{N+S}(t) = \frac{A P / D}{P_0 + P_S + \frac{P}{D}\,t}, \qquad 0 \le t < D$$

Because $P_S > 0$ only enlarges the denominator, $M_{N+S} < M_N$ throughout the interval $[0, D)$ of each modulation cycle. That is, the addition of a tone reduces the responses of the model to the rising part of the modulated envelope. Higher tone levels ($P_S$) cause greater reduction in the model’s firing rate.

[Figure 3 panels: (a)-(d); axes: Time (ms) vs. Level (dB SPL) and Level derivative (dB SPL/ms)]

Figure 3: An illustration of the basic operation of the model on various amplitude envelopes. The simplified operation of the model includes log compression of the amplitude envelope (a and c) and rectified time-derivative of the log-compressed envelope (b and d). (a) A 30 dB SPL tone is added to a modulated envelope (peak level of 70 dB SPL) 300 ms after the beginning of the stimulus (as indicated by the horizontal line). The addition of the tone causes a great reduction in the time derivative of the log-compressed envelope (b). When the envelope of the noise is unmodulated (c), the time-derivative of the log-compressed envelope (d) shows a tiny spike when the tone is added (marked by the arrow).

Fig. 3 demonstrates the effect of a low-level tone on the time-derivative of the log-compressed envelope of a noise. When the envelope is modulated (Fig. 3a) the addition of the tone greatly reduces the derivative of the rising part of the modulation (Fig. 3b).
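The suppression on the rising ramp can be checked numerically. The sketch below evaluates the rectified derivative of the log-compressed envelope in closed form on 0 ≤ t < D, assuming a linear ramp of slope P/D and a constant tone pressure; the function names (`spl_to_pa`, `ramp_rate`) are hypothetical and the result is in dB per ms, so it illustrates the approximation rather than the full spiking model.

```python
import math

A = 20.0 / math.log(10.0)  # makes A*ln(x) equal to 20*log10(x), i.e. dB
P0 = 2e-5                  # reference pressure in Pa
D_MS = 12.5                # ramp duration D in ms


def spl_to_pa(level_db):
    """Convert a level in dB SPL to a pressure in Pa."""
    return P0 * 10.0 ** (level_db / 20.0)


def ramp_rate(t_ms, peak_pa, tone_pa=0.0):
    """Approximate model rate (dB/ms) within one modulation cycle.

    On the rising ramp the envelope is (peak_pa / D_MS) * t_ms, so
    d/dt A*ln(1 + (E_N + P_S)/P0) = A*slope / (P0 + P_S + slope*t).
    Outside the ramp the rectified derivative is zero.
    """
    if not 0.0 <= t_ms < D_MS:
        return 0.0
    slope = peak_pa / D_MS
    return A * slope / (P0 + tone_pa + slope * t_ms)


if __name__ == "__main__":
    peak = spl_to_pa(70.0)  # noise peak level used in Fig. 3
    for tone_db in (None, 30.0, 50.0):
        tone = 0.0 if tone_db is None else spl_to_pa(tone_db)
        rates = [ramp_rate(t, peak, tone) for t in (0.5, 1.0, 5.0)]
        print(tone_db, [round(r, 2) for r in rates])
```

The reduction is largest early in the ramp, where the noise envelope is still small relative to the tone.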
In the absence of modulations (Fig. 3c), the tone presentation produces a negligible effect on the level derivative (Fig. 3d). Model simulations of neural responses to the stimuli used by Nelken et al. are plotted in Fig. 4. As illustrated schematically in Fig. 3d, the presence of the tone does not cause any significant change in the responses of the model to the unmodulated noise (Fig. 4a). In the modulated noise, however, tones of relatively low levels reduce the responses of the model to the rising part of the envelope modulations.

[Figure 4 panels: (a) Unmodulated noise, (b) Modulated noise; axes: Time (ms) vs. Tone level (dB SPL); response scale in Spikes/sec]

Figure 4: Simulated responses of the model to a combination of a tone and unmodulated noise (a) or modulated noise (b). All conventions are as in Fig. 2.

4 Discussion

This report uses an auditory edge-detection model to simulate the actual physiological consequences of amplitude modulation on neural sensitivity in cortical area A1. The basic computational operation of the model is the calculation of the smoothed time-derivative of the log-compressed stimulus envelope. The ability of the model to reproduce cortical response patterns in detail across a variety of stimulus conditions suggests similar time-sensitive mechanisms may contribute to the physiological correlates of CMR. These findings augment our previous observations that the simple edge-detection model can successfully predict a wide range of physiological and perceptual phenomena [11]. Former applications of the model to perceptual phenomena have been mainly related to auditory scene analysis, or more specifically the ability of the auditory system to distinguish multiple sound sources. In these cases, a sharp amplitude transition at stimulus onset (“auditory edge”) was critical for sound segregation.
Here, it is shown that the detection of acoustic signals may also be enhanced through the suppression of ongoing responses to the concurrent modulations of competing background sounds. Interestingly, these temporal fluctuations appear to be a common property of natural soundscapes [15]. The model provides testable predictions regarding how signal detection may be influenced by the temporal shape of amplitude modulation. Carlyon et al. [6] measured CMR in human listeners using three types of noise modulation: square wave, sine wave and multiplied noise. From the perspective of the edge-detection model, these psychoacoustic results are intriguing because the different modulator types represent manipulations of the time derivative of masker envelopes. Square-wave modulation had the most sharply edged time derivative and produced the greatest masking release. Fig. 5 plots the responses of the model to a pure-tone signal in square-wave and sine-wave modulated noise. As in the psychoacoustical data of Carlyon et al., the simulated detection threshold was lower in the context of square-wave modulation. Our modeling results suggest that the sharply edged square wave evoked higher levels of noise-driven activity and therefore created a sensitive background for the suppressing effects of the unmodulated tone.

[Figure 5 panels: (a), (b); axes: Time (ms) vs. Tone level (dB SPL); response scale in Spikes/sec]

Figure 5: Simulated responses of the model to a combination of a tone at various levels and a sine-wave modulated noise (a) or a square-wave modulated noise (b). Each row shows the response of the model to the noise plus the tone at the level specified on the ordinate. The shape of the noise modulator is illustrated above each figure. The 100 ms tone starts 250 ms after the noise onset.
Note that the tone detection threshold (marked by the dashed line) is 10 dB lower for the square-wave modulator than for the sine-wave modulator, in accordance with the psychoacoustical data of Carlyon et al. [6].

Although the physiological basis of our model was derived from studies of neural responses in the cat auditory system, the key psychoacoustical observations of Carlyon et al. have been replicated in recent behavioral studies of cats (Budelis et al. [5]). These data support the generalization of human perceptual processing to other species and enhance the possible correspondence between the neuronal CMR-like effect and the psychoacoustical masking phenomena. Clearly, the auditory system relies on information other than the time derivative of the stimulus envelope for the detection of auditory signals in background noise. Further physiological and psychoacoustic assessments of CMR-like masking effects are needed not only to refine the predictive abilities of the edge-detection model but also to reveal the additional sources of acoustic information that influence signal detection in constantly changing natural environments.

Acknowledgments

This work was supported in part by NIDCD grant R01 DC004841.

References

[1] Agmon-Snir H., Segev I. (1993) “Signal delay and input synchronization in passive dendritic structure”, J. Neurophysiol. 70, 2066-2085.
[2] Bregman A.S. (1990) “Auditory scene analysis: The perceptual organization of sound”, MIT Press, Cambridge, MA.
[3] Bregman A.S., Ahad P.A., Kim J., Melnerich L. (1994) “Resetting the pitch-analysis system. 1. Effects of rise times of tones in noise backgrounds or of harmonics in a complex tone”, Percept. Psychophys. 56 (2), 155-162.
[4] Bregman A.S., Ahad P.A., Kim J. (1994) “Resetting the pitch-analysis system. 2. Role of sudden onsets and offsets in the perception of individual components in a cluster of overlapping tones”, J. Acoust. Soc. Am. 96 (5), 2694-2703.
[5] Budelis J., Fishbach A., May B.J. (2002) “Behavioral assessments of comodulation masking release in cats”, Abst. Assoc. for Res. in Otolaryngol. 25.
[6] Carlyon R.P., Buus S., Florentine M. (1989) “Comodulation masking release for three types of modulator as a function of modulation rate”, Hear. Res. 42, 37-46.
[7] Darwin C.J. (1997) “Auditory grouping”, Trends in Cog. Sci. 1 (9), 327-333.
[8] Darwin C.J., Ciocca V. (1992) “Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component”, J. Acoust. Soc. Am. 91, 3381-3390.
[9] Drullman R., Festen H.M., Plomp R. (1994) “Effect of temporal envelope smearing on speech reception”, J. Acoust. Soc. Am. 95 (2), 1053-1064.
[10] Eggermont J.J. (1994) “Temporal modulation transfer functions for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity”, Hear. Res. 74, 51-66.
[11] Fishbach A., Nelken I., Yeshurun Y. (2001) “Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients”, J. Neurophysiol. 85, 2303-2323.
[12] Gerstner W. (1999) “Spiking neurons”, in Pulsed Neural Networks, edited by W. Maass, C. M. Bishop (MIT Press, Cambridge, MA).
[13] Hall J.W., Haggard M.P., Fernandes M.A. (1984) “Detection in noise by spectrotemporal pattern analysis”, J. Acoust. Soc. Am. 76, 50-56.
[14] Heil P. (1997) “Auditory onset responses revisited. II. Response strength”, J. Neurophysiol. 77, 2642-2660.
[15] Nelken I., Rotman Y., Bar-Yosef O. (1999) “Responses of auditory cortex neurons to structural features of natural sounds”, Nature 397, 154-157.
[16] Phillips D.P. (1988) “Effect of Tone-Pulse Rise Time on Rate-Level Functions of Cat Auditory Cortex Neurons: Excitatory and Inhibitory Processes Shaping Responses to Tone Onset”, J. Neurophysiol. 59, 1524-1539.
[17] Phillips D.P., Burkard R. (1999) “Response magnitude and timing of auditory response initiation in the inferior colliculus of the awake chinchilla”, J. Acoust. Soc. Am. 105, 2731-2737.
[18] Phillips D.P., Semple M.N., Kitzes L.M. (1995) “Factors shaping the tone level sensitivity of single neurons in posterior field of cat auditory cortex”, J. Neurophysiol. 73, 674-686.
[19] Rosen S. (1992) “Temporal information in speech: acoustic, auditory and linguistic aspects”, Phil. Trans. R. Soc. Lond. B 336, 367-373.
[20] Shannon R.V., Zeng F.G., Kamath V., Wygonski J., Ekelid M. (1995) “Speech recognition with primarily temporal cues”, Science 270, 303-304.
[21] Turner C.W., Relkin E.M., Doucet J. (1994) “Psychophysical and physiological forward masking studies: probe duration and rise-time effects”, J. Acoust. Soc. Am. 96 (2), 795-800.
[22] Yost W.A., Sheft S. (1994) “Modulation detection interference – across-frequency processing and auditory grouping”, Hear. Res. 79, 48-58.
[23] Zhang X., Heinz M.G., Bruce I.C., Carney L.H. (2001) “A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression”, J. Acoust. Soc. Am. 109 (2), 648-670.

5 0.13027097 43 nips-2002-Binary Coding in Auditory Cortex

Author: Michael R. Deweese, Anthony M. Zador

Abstract: Cortical neurons have been reported to use both rate and temporal codes. Here we describe a novel mode in which each neuron generates exactly 0 or 1 action potentials, but not more, in response to a stimulus. We used cell-attached recording, which ensured single-unit isolation, to record responses in rat auditory cortex to brief tone pips. Surprisingly, the majority of neurons exhibited binary behavior with few multi-spike responses; several dramatic examples consisted of exactly one spike on 100% of trials, with no trial-to-trial variability in spike count. Many neurons were tuned to stimulus frequency. Since individual trials yielded at most one spike for most neurons, the information about stimulus frequency was encoded in the population, and would not have been accessible to later stages of processing that only had access to the activity of a single unit. These binary units allow a more efficient population code than is possible with conventional rate coding units, and are consistent with a model of cortical processing in which synchronous packets of spikes propagate stably from one neuronal population to the next. 1 Binary coding in auditory cortex We recorded responses of neurons in the auditory cortex of anesthetized rats to pure-tone pips of different frequencies [1, 2]. Each pip was presented repeatedly, allowing us to assess the variability of the neural response to multiple presentations of each stimulus. We first recorded multi-unit activity with conventional tungsten electrodes (Fig. 1a). The number of spikes in response to each pip fluctuated markedly from one trial to the next (Fig. 1e), as though governed by a random mechanism such as that generating the ticks of a Geiger counter. 
Highly variable responses such as these, which are at least as variable as a Poisson process, are the norm in the cortex [3-7], and have contributed to the widely held view that cortical spike trains are so noisy that only the average firing rate can be used to encode stimuli. Because we were recording the activity of an unknown number of neurons, we could not be sure whether the strong trial-to-trial fluctuations reflected the underlying variability of the single units.

[Figure 1 panels: a Multi-unit; b Single-unit recording method (raw cell-attached voltage, high-pass filtered, threshold, identified spikes); c, d Single-unit; e Response variance/mean (spikes/trial) vs. mean response (spikes/trial), with Poisson and binary reference lines; time axes in msec]

Figure 1: Multi-unit spiking activity was highly variable, but single units obeyed binomial statistics. a Multi-unit spike rasters from a conventional tungsten electrode recording showed high trial-to-trial variability in response to ten repetitions of the same 50 msec pure tone stimulus (bottom). Darker hash marks indicate spike times within the response period, which were used in the variability analysis. b Spikes recorded in cell-attached mode were easily identified from the raw voltage trace (top) by applying a high-pass filter (bottom) and thresholding (dark gray line). Spike times (black squares) were assigned to the peaks of suprathreshold segments. c Spike rasters from a cell-attached recording of single-unit responses to 25 repetitions of the same tone consisted of exactly one well-timed spike per trial (latency standard deviation = 1.0 msec), unlike the multi-unit responses (Fig. 1a). Under the Poisson assumption, this would have been highly unlikely ($P \sim 10^{-11}$). d The same neuron as in Fig. 1c responds with lower probability to repeated presentations of a different tone, but there are still no multi-spike responses. e We quantified response variability for each tone by dividing the variance in spike count by the mean spike count across all trials for that tone. Response variability for multi-unit tungsten recording (open triangles) was high for each of the 29 tones (out of 32) that elicited at least one spike on one trial. All but one point lie above one (horizontal gray line), which is the value produced by a Poisson process with any constant or time-varying event rate. Single-unit responses recorded in cell-attached mode were far less variable (filled circles). Ninety-one percent (10/11) of the tones that elicited at least one spike from this neuron produced no multi-spike responses in 25 trials; the corresponding points fall on the diagonal line between (0,1) and (1,0), which provides a strict lower bound on the variability for any response set with a mean between 0 and 1. No point lies above one.

We therefore used an alternative technique, cell-attached recording with a patch pipette [8, 9], in order to ensure single-unit isolation (Fig. 1b). This recording mode minimizes both of the main sources of error in spike detection: failure to detect a spike in the unit under observation (false negatives), and contamination by spikes from nearby neurons (false positives). It also differs from conventional extracellular recording methods in its selection bias: with cell-attached recording, neurons are selected solely on the basis of the experimenter’s ability to form a seal, rather than on the basis of neuronal activity and responsiveness to stimuli as in conventional methods. Surprisingly, single-unit responses were far more orderly than suggested by the multi-unit recordings; responses typically consisted of either 0 or 1 spikes per trial, and not more (Fig. 1c-e). In the most dramatic examples, each presentation of the same tone pip elicited exactly one spike (Fig. 1c).
In most cases, however, some presentations failed to elicit a spike (Fig. 1d). Although low-variability responses have recently been observed in the cortex [10, 11] and elsewhere [12, 13], the binary behavior described here has not previously been reported for cortical neurons.

The majority of the neurons (59%) in our study for which statistical significance could be assessed (at the p<0.001 significance level; see Fig. 2, caption) showed noisy binary behavior: “binary” because neurons produced either 0 or 1 spikes, and “noisy” because some stimuli elicited both single spikes and failures. In a substantial fraction of neurons, however, the responses showed more variability. We found no correlation between neuronal variability and cortical layer (inferred from the depth of the recording electrode), cortical area (inside vs. outside of area A1) or depth of anesthesia. Moreover, the binary mode of spiking was not due to the brevity (25 msec) of the stimuli; responses that were binary for short tones were comparably binary when longer (100 msec) tones were used (Fig. 2b).

[Figure 2 panels: a Response variance/mean (spikes/trial) vs. mean response (spikes/trial), N = 3055 response sets, with Poisson and binary reference lines and a legend (Not assessable, Not significant, Significant (p<0.001)); b spike rasters for 28 kHz tones of 100 msec and 25 msec, time axis in msec]

Figure 2: Half of the neuronal population exhibited binary firing behavior. a Of the 3055 sets of responses to 25 msec tones, 2588 (gray points) could not be assessed for significance at the p<0.001 level, 225 (open circles) were not significantly binary, and 242 were significantly binary (black points; see Identification methods for group statistics below). All points were jittered slightly so that overlying points could be seen in the figure. 2165 response sets contained no multi-spike responses; the corresponding points fell on the line from (0,1) to (1,0). b The binary nature of single-unit responses was insensitive to tone duration, even for frequencies that elicited the largest responses. Twenty additional spike rasters from the same neuron (and tone frequency) as in Fig. 1c contain no multi-spike responses whether in response to 100 msec tones (above) or 25 msec tones (below). Across the population, binary responses were as prevalent for 100 msec tones as for 25 msec tones (see Identification methods for group statistics).

In many neurons, binary responses showed high temporal precision, with latencies sometimes exhibiting standard deviations as low as 1 msec (Fig. 3; see also Fig. 1c), comparable to previous observations in the auditory cortex [14], and only slightly more precise than in monkey visual area MT [5]. High temporal precision was positively correlated with high response probability (Fig. 3).

[Figure 3 panels: a Jitter (msec) vs. mean response (spikes/trial), N = 32 tones; b Jitter (msec) vs. mean response (spikes/trial), N = (44 cells)x(32 tones)]

Figure 3: Trial-to-trial variability in latency of response to repeated presentations of the same tone decreased with increasing response probability. a Scatter plot of standard deviation of latency vs. mean response for 25 presentations each of 32 tones, for a different neuron than in Figs. 1 and 2 (gray line is best linear fit). Rasters from 25 repeated presentations of a low-response tone (upper left inset, which corresponds to the left-most data point) display much more variable latencies than rasters from a high-response tone (lower right inset; corresponds to the right-most data point). b The negative correlation between latency variability and response size was present on average across the population of 44 neurons described in Identification methods for group statistics (linear fit, gray).
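The variability measure used in Figs. 1e and 2a, and the Poisson probability quoted in the caption of Fig. 1c, can be illustrated with a short Python sketch. It assumes the population variance (dividing by the number of trials), which is the convention that places 0-or-1 response sets exactly on the line from (0,1) to (1,0); whether the authors used this or the sample variance is not stated, and the function names are hypothetical.

```python
import math


def variance_to_mean(counts):
    """Variance of the spike count divided by its mean, across trials.

    Uses the population variance (divide by n); this is an assumption.
    """
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("ratio is undefined for an all-zero response set")
    variance = sum((c - mean) ** 2 for c in counts) / n
    return variance / mean


def binary_bound(mean):
    """Variance/mean of a 0-or-1 response set with the given mean (0 < mean <= 1)."""
    return 1.0 - mean


def prob_one_spike_every_trial(mean, n_trials):
    """Probability that a Poisson count with this mean equals 1 on every trial."""
    return (mean * math.exp(-mean)) ** n_trials


if __name__ == "__main__":
    reliable = [1] * 25             # one spike on every trial, as in Fig. 1c
    noisy = [1] * 10 + [0] * 15     # noisy binary responses, as in Fig. 1d
    print(variance_to_mean(reliable))
    print(variance_to_mean(noisy), binary_bound(sum(noisy) / len(noisy)))
    print(prob_one_spike_every_trial(1.0, 25))
```

For a Poisson process the ratio is 1 regardless of rate, so any point on the diagonal line lies below the Poisson expectation.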
The low trial-to-trial variability ruled out the possibility that the firing statistics could be accounted for by a simple rate-modulated Poisson process (Fig. 4a1,a2). In other systems, low variability has sometimes been modeled as a Poisson process followed by a post-spike refractory period [10, 12]. In our system, however, the range in latencies of evoked binary responses was often much greater than the refractory period, which could not have been longer than the 2 msec inter-spike intervals observed during epochs of spontaneous spiking, indicating that binary spiking did not result from any intrinsic property of the spike generating mechanism (Fig. 4a3). Moreover, a single stimulus-evoked spike could suppress subsequent spikes for as long as hundreds of milliseconds (e.g. Figs. 1d,4d), supporting the idea that binary spiking arises through a circuit-level, rather than a single-neuron, mechanism. Indeed, the fact that this suppression is observed even in the cortex of awake animals [15] suggests that binary spiking is not a special property of the anesthetized state. It seems surprising that binary spiking in the cortex has not previously been remarked upon. In the auditory cortex the explanation may be in part technical: Because firing rates in the auditory cortex tend to be low, multi-unit recording is often used to maximize the total amount of data collected. Moreover, our use of cell-attached recording minimizes the usual bias toward responsive or active neurons. Such explanations are not, however, likely to account for the failure to observe binary spiking in the visual cortex, where spike count statistics have been scrutinized more closely [3-7]. One possibility is that this reflects a fundamental difference between the auditory and visual systems. 
[Figure 4 panels: a1 recorded rasters, a2 Poisson simulation, a3 Poisson with refractory period, with PSTH and time axis in msec; b PSTHs (spikes/s) vs. time (msec); c Ratio of pool sizes vs. mean spike count per neuron; d Response probability vs. tone frequency (kHz), N = 32 tones]

Figure 4: a The lack of multi-spike responses elicited by the neuron shown in Fig. 3a was not due to an absolute refractory period, since the range of latencies for many tones, like that shown here, was much greater than any reasonable estimate for the neuron’s refractory period. (a1) Experimentally recorded responses. (a2) Using the smoothed post-stimulus time histogram (PSTH; bottom) from the set of responses in Fig. 4a, we generated rasters under the assumption of Poisson firing. In this representative example, four double-spike responses (arrows at left) were produced in 25 trials. (a3) We then generated rasters assuming that the neuron fired according to a Poisson process subject to a hard refractory period of 2 msec. Even with a refractory period, this representative example includes one triple- and three double-spike responses. The minimum inter-spike interval during spontaneous firing events was less than two msec for five of our neurons, so 2 msec is a conservative upper bound for the refractory period. b Spontaneous activity is reduced following high-probability responses. The PSTH (top; 0.25 msec bins) of the combined responses from the 25% (8/32) of tones that elicited the largest responses from the same neuron as in Figs. 3a and 4a illustrates a preclusion of spontaneous and evoked activity for over 200 msec following stimulation. The PSTHs from progressively less responsive groups of tones show progressively less preclusion following stimulation. c Fewer noisy binary neurons need to be pooled to achieve the same “signal-to-noise ratio” (SNR; see ref. [24]) as a collection of Poisson neurons. The ratio of the number of Poisson to binary neurons required to achieve the same SNR is plotted against the mean number of spikes elicited per neuron following stimulation; here we have defined the SNR to be the ratio of the mean spike count to the standard deviation of the spike count. d Spike probability tuning curve for the same neuron as in Figs. 1c-e and 2b, fit to a Gaussian in tone frequency.

An alternative interpretation, and one that we favor, is that the difference rests not in the sensory modality, but instead in the difference between the stimuli used. In this view, the binary responses may not be limited to the auditory cortex; neurons in visual and other sensory cortices might exhibit similar responses to the appropriate stimuli. For example, the tone pips we used might be the auditory analog of a brief flash of light, rather than the oriented moving edges or gratings usually used to probe the primary visual cortex. Conversely, auditory stimuli analogous to edges or gratings [16, 17] may be more likely to elicit conventional, rate-modulated Poisson responses in the auditory cortex. Indeed, there may be a continuum between binary and Poisson modes. Thus, even in conventional rate-modulated responses, the first spike is often privileged in that it carries most of the information in the spike train [5, 14, 18]. The first spike may be particularly important as a means of rapidly signaling stimulus transients. Binary responses suggest a mode that complements conventional rate coding. In the simplest rate-coding model, a stimulus parameter (such as the frequency of a tone) governs only the rate at which a neuron generates spikes, but not the detailed positions of the spikes; the actual spike train itself is an instantiation of a random process (such as a Poisson process). By contrast, in the binomial model, the stimulus parameter (frequency) is encoded as the probability of firing (Fig. 4d). Binary coding has implications for cortical computation.
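The pooling comparison summarized in Fig. 4c can be sketched under a simple assumption that the source does not spell out: the pooled neurons are independent and their spike counts are summed. With the SNR defined as mean count over standard deviation, a pool of N Poisson neurons with mean m has SNR sqrt(N m), while a pool of N binary neurons with firing probability p has SNR sqrt(N p / (1 - p)). The function names below are hypothetical.

```python
import math


def pooled_snr_poisson(mean, n_neurons):
    """SNR (mean / std) of the summed count of independent Poisson neurons."""
    return math.sqrt(n_neurons * mean)


def pooled_snr_binary(p, n_neurons):
    """SNR of the summed count of independent binary neurons firing with probability p < 1."""
    return math.sqrt(n_neurons * p / (1.0 - p))


def pool_size_ratio(p):
    """Poisson-to-binary pool size needed for equal SNR at mean count p per neuron."""
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8, 0.95):
        print(f"mean count {p:.2f}: ratio of pool sizes {pool_size_ratio(p):.1f}")
```

Under these assumptions the advantage of binary units grows without bound as the firing probability approaches one, because the count variance p(1 - p) vanishes there.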
In the rate coding model, stimulus encoding is “ergodic”: a stimulus parameter can be read out either by observing the activity of one neuron for a long time, or a population for a short time. By contrast, in the binary model the stimulus value can be decoded only by observing a neuronal population, so that there is no benefit to integrating over long time periods (cf. ref. [19]). One advantage of binary encoding is that it allows the population to signal quickly; the most compact message a neuron can send is one spike [20]. Binary coding is also more efficient in the context of population coding, as quantified by the signal-to-noise ratio (Fig. 4c). The precise organization of both spike number and time we have observed suggests that cortical activity consists, at least under some conditions, of packets of spikes synchronized across populations of neurons. Theoretical work [21-23] has shown how such packets can propagate stably from one population to the next, but only if neurons within each population fire at most one spike per packet; otherwise, the number of spikes per packet, and hence the width of each packet, grows at each propagation step. Interestingly, one prediction of stable propagation models is that spike probability should be related to timing precision, a prediction borne out by our observations (Fig. 3). The role of these packets in computation remains an open question.

2 Identification methods for group statistics

We recorded responses to 32 different 25 msec tones from each of 175 neurons from the auditory cortices of 16 Sprague-Dawley rats; each tone was repeated between 5 and 75 times (mean = 19). Thus our ensemble consisted of 32x175=5600 response sets, with between 5 and 75 samples in each set. Of these, 3055 response sets contained at least one spike on at least one trial. For each response set, we tested the hypothesis that the observed variability was significantly lower than expected from the null hypothesis of a Poisson process.
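One way such a test could be set up is sketched below. This is an illustrative assumption, not the procedure used in the study, which the text does not specify: under a Poisson null with the observed mean count, it computes the probability that no trial in the set contains more than one spike, and calls a response set assessable when a purely 0-or-1 outcome would fall below the significance criterion. All names are hypothetical.

```python
import math


def prob_no_multispike_poisson(mean, n_trials):
    """P(every one of n_trials Poisson counts is 0 or 1) for the given mean count."""
    per_trial = (1.0 + mean) * math.exp(-mean)  # P(0 spikes) + P(1 spike)
    return per_trial ** n_trials


def is_assessable(mean, n_trials, alpha=0.001):
    """True if a response set with no multi-spike trials would be significant at alpha."""
    return prob_no_multispike_poisson(mean, n_trials) < alpha


if __name__ == "__main__":
    for mean, n in ((1.0, 25), (0.1, 25), (1.0, 5)):
        p = prob_no_multispike_poisson(mean, n)
        print(f"mean = {mean}, trials = {n}: p = {p:.3g}, assessable = {is_assessable(mean, n)}")
```

In this sketch both a low firing probability and a small number of repeats keep the null probability high, which mirrors the dependence on sample size and firing probability.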
The ability to assess significance depended on two parameters: the sample size (5-75) and the firing probability. Intuitively, the dependence on firing probability arises because at low firing rates most responses produce only trials with 0 or 1 spikes under both the Poisson and binary models; only at high firing rates do the two models make different predictions, since in that case the Poisson model includes many trials with 2 or even 3 spikes while the binary model generates only solitary spikes (see Fig. 4a1,a2). Using a stringent significance criterion of p<0.001, 467 response sets had a sufficient number of repeats to assess significance, given the observed firing probability. Of these, half (242/467=52%) were significantly less variable than expected by chance, roughly five-hundred-fold more than the 467/1000=0.467 response sets expected to appear binary by chance under the 0.001 significance criterion. Seventy-two neurons had at least one response set for which significance could be assessed, and of these, 49 neurons (49/72=68%) had at least one significantly sub-Poisson response set. Of this population of 49 neurons, five achieved low variability through repeatable bursty behavior (e.g., every spike count was either 0 or 3, but not 1 or 2) and were excluded from further analysis. The remaining 44 neurons formed the basis for the group statistics analyses shown in Figs. 2a and 3b. Nine of these neurons were subjected to an additional protocol consisting of at least 10 presentations each of 100 msec tones and 25 msec tones of all 32 frequencies. Of the 100 msec stimulation response sets, 44 were found to be significantly sub-Poisson at the p<0.05 level, in good agreement with the 43 found to be significant among the responses to 25 msec tones.

3 Bibliography

1. Kilgard, M.P. and M.M. Merzenich, Cortical map reorganization enabled by nucleus basalis activity. Science, 1998. 279(5357): p. 1714-8.
2. Sally, S.L. and J.B. Kelly, Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol, 1988. 59(5): p. 1627-38.
3. Softky, W.R. and C. Koch, The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J Neurosci, 1993. 13(1): p. 334-50.
4. Stevens, C.F. and A.M. Zador, Input synchrony and the irregular firing of cortical neurons. Nat Neurosci, 1998. 1(3): p. 210-7.
5. Buracas, G.T., A.M. Zador, M.R. DeWeese, and T.D. Albright, Efficient discrimination of temporal patterns by motion-sensitive neurons in primate visual cortex. Neuron, 1998. 20(5): p. 959-69.
6. Shadlen, M.N. and W.T. Newsome, The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J Neurosci, 1998. 18(10): p. 3870-96.
7. Tolhurst, D.J., J.A. Movshon, and A.F. Dean, The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Res, 1983. 23(8): p. 775-85.
8. Otmakhov, N., A.M. Shirke, and R. Malinow, Measuring the impact of probabilistic transmission on neuronal output. Neuron, 1993. 10(6): p. 1101-11.
9. Friedrich, R.W. and G. Laurent, Dynamic optimization of odor representations by slow temporal patterning of mitral cell activity. Science, 2001. 291(5505): p. 889-94.
10. Kara, P., P. Reinagel, and R.C. Reid, Low response variability in simultaneously recorded retinal, thalamic, and cortical neurons. Neuron, 2000. 27(3): p. 635-46.
11. Gur, M., A. Beylin, and D.M. Snodderly, Response variability of neurons in primary visual cortex (V1) of alert monkeys. J Neurosci, 1997. 17(8): p. 2914-20.
12. Berry, M.J., D.K. Warland, and M. Meister, The structure and precision of retinal spike trains. Proc Natl Acad Sci U S A, 1997. 94(10): p. 5411-6.
13. de Ruyter van Steveninck, R.R., G.D. Lewen, S.P. Strong, R. Koberle, and W. Bialek, Reproducibility and variability in neural spike trains. Science, 1997. 275(5307): p. 1805-8.
14. Heil, P., Auditory cortical onset responses revisited. I. First-spike timing. J Neurophysiol, 1997. 77(5): p. 2616-41.
15. Lu, T., L. Liang, and X. Wang, Temporal and rate representations of time-varying signals in the auditory cortex of awake primates. Nat Neurosci, 2001. 4(11): p. 1131-8.
16. Kowalski, N., D.A. Depireux, and S.A. Shamma, Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol, 1996. 76(5): p. 3503-23.
17. deCharms, R.C., D.T. Blake, and M.M. Merzenich, Optimizing sound features for cortical neurons. Science, 1998. 280(5368): p. 1439-43.
18. Panzeri, S., R.S. Petersen, S.R. Schultz, M. Lebedev, and M.E. Diamond, The role of spike timing in the coding of stimulus location in rat somatosensory cortex. Neuron, 2001. 29(3): p. 769-77.
19. Britten, K.H., M.N. Shadlen, W.T. Newsome, and J.A. Movshon, The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci, 1992. 12(12): p. 4745-65.
20. Delorme, A. and S.J. Thorpe, Face identification using one spike per neuron: resistance to image degradations. Neural Netw, 2001. 14(6-7): p. 795-803.
21. Diesmann, M., M.O. Gewaltig, and A. Aertsen, Stable propagation of synchronous spiking in cortical neural networks. Nature, 1999. 402(6761): p. 529-33.
22. Marsalek, P., C. Koch, and J. Maunsell, On the relationship between synaptic input and spike output jitter in individual neurons. Proc Natl Acad Sci U S A, 1997. 94(2): p. 735-40.
23. Kistler, W.M. and W. Gerstner, Stable propagation of activity pulses in populations of spiking neurons. Neural Comp., 2002. 14: p. 987-997.
24. Zohary, E., M.N. Shadlen, and W.T. Newsome, Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 1994. 370(6485): p. 140-3.
25. Abbott, L.F. and P. Dayan, The effect of correlated variability on the accuracy of a population code. Neural Comput, 1999. 11(1): p. 91-101.

6 0.12978134 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals

7 0.12324698 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex

8 0.11571577 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

9 0.10662464 171 nips-2002-Reconstructing Stimulus-Driven Neural Networks from Spike Times

10 0.099099167 187 nips-2002-Spikernels: Embedding Spiking Neurons in Inner-Product Spaces

11 0.095894255 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons

12 0.092399292 26 nips-2002-An Estimation-Theoretic Framework for the Presentation of Multiple Stimuli

13 0.090259805 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement

14 0.086376853 24 nips-2002-Adaptive Scaling for Feature Selection in SVMs

15 0.081559956 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

16 0.081040993 18 nips-2002-Adaptation and Unsupervised Learning

17 0.074065238 110 nips-2002-Incremental Gaussian Processes

18 0.07057374 142 nips-2002-Maximum Likelihood and the Information Bottleneck

19 0.069015071 95 nips-2002-Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting

20 0.069012582 67 nips-2002-Discriminative Binaural Sound Localization

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.248), (1, 0.196), (2, 0.109), (3, -0.018), (4, -0.085), (5, -0.286), (6, -0.218), (7, 0.086), (8, 0.008), (9, -0.012), (10, 0.038), (11, 0.027), (12, 0.012), (13, 0.017), (14, 0.11), (15, -0.008), (16, 0.032), (17, -0.049), (18, 0.33), (19, -0.035), (20, 0.079), (21, 0.021), (22, -0.004), (23, 0.058), (24, 0.054), (25, 0.112), (26, 0.152), (27, 0.012), (28, 0.083), (29, -0.083), (30, 0.061), (31, -0.103), (32, 0.167), (33, -0.052), (34, -0.122), (35, -0.071), (36, 0.014), (37, 0.119), (38, -0.023), (39, 0.088), (40, -0.041), (41, 0.073), (42, 0.026), (43, -0.081), (44, -0.015), (45, 0.025), (46, -0.139), (47, 0.059), (48, -0.035), (49, 0.024)]
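
The page does not state how the simValue scores in the list below are derived from these topic weights. A common choice for comparing LSI vectors is cosine similarity, and the following is a minimal sketch under that assumption. The function name `cosine_similarity` and the second vector are hypothetical and for illustration only. Plain cosine similarity of a vector with itself is 1, while the same-paper score listed below is lower, so the page's own pipeline presumably differs in some detail.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as (topicId, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db.get(t, 0.0) for t, w in da.items())
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty or all-zero vectors
    return dot / (norm_a * norm_b)


# First three LSI weights listed above for this paper.
this_paper = [(0, -0.248), (1, 0.196), (2, 0.109)]
# Hypothetical weights for another paper (not taken from the page).
other_paper = [(0, -0.230), (1, 0.150), (2, 0.140)]

print(cosine_similarity(this_paper, other_paper))
```

Storing each vector as (topicId, weight) pairs mirrors the format used on this page and lets vectors with different sets of non-zero topics be compared directly.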

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95958078 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions

Author: Maneesh Sahani, Jennifer F. Linden

2 0.9290207 103 nips-2002-How Linear are Auditory Cortical Responses?

Author: Maneesh Sahani, Jennifer F. Linden

3 0.7259528 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise

Author: Alon Fishbach, Bradford J. May

4 0.67574143 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex

Author: Christian K. Machens, Michael Wehr, Anthony M. Zador

5 0.41878563 43 nips-2002-Binary Coding in Auditory Cortex

Author: Michael R. Deweese, Anthony M. Zador

6 0.41144595 18 nips-2002-Adaptation and Unsupervised Learning

7 0.37688959 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement

8 0.31370723 95 nips-2002-Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting

9 0.31301296 187 nips-2002-Spikernels: Embedding Spiking Neurons in Inner-Product Spaces

10 0.3116177 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals

11 0.29420969 142 nips-2002-Maximum Likelihood and the Information Bottleneck

12 0.28490192 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

13 0.28417805 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex

14 0.27611005 110 nips-2002-Incremental Gaussian Processes

15 0.27064717 127 nips-2002-Learning Sparse Topographic Representations with Products of Student-t Distributions

16 0.26917773 67 nips-2002-Discriminative Binaural Sound Localization

17 0.26617596 81 nips-2002-Expected and Unexpected Uncertainty: ACh and NE in the Neocortex

18 0.25916794 6 nips-2002-A Formulation for Minimax Probability Machine Regression

19 0.24949446 124 nips-2002-Learning Graphical Models with Mercer Kernels

20 0.22771326 75 nips-2002-Dynamical Causal Learning

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(11, 0.025), (23, 0.037), (42, 0.067), (54, 0.118), (55, 0.036), (57, 0.015), (64, 0.011), (67, 0.027), (68, 0.047), (69, 0.195), (74, 0.054), (92, 0.02), (98, 0.278)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.90391916 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions

Author: Maneesh Sahani, Jennifer F. Linden

2 0.83852696 86 nips-2002-Fast Sparse Gaussian Process Methods: The Informative Vector Machine

Author: Ralf Herbrich, Neil D. Lawrence, Matthias Seeger

Abstract: We present a framework for sparse Gaussian process (GP) methods which uses forward selection with criteria based on information-theoretic principles, previously suggested for active learning. Our goal is not only to learn d-sparse predictors (which can be evaluated in O(d) rather than O(n), d ≪ n, n the number of training points), but also to perform training under strong restrictions on time and memory requirements. The scaling of our method is at most O(n · d²), and in large real-world classification experiments we show that it can match prediction performance of the popular support vector machine (SVM), yet can be significantly faster in training. In contrast to the SVM, our approximation produces estimates of predictive probabilities ('error bars'), allows for Bayesian model selection and is less complex in implementation.

3 0.83737731 92 nips-2002-FloatBoost Learning for Classification

Author: Stan Z. Li, Zhenqiu Zhang, Heung-yeung Shum, Hongjiang Zhang

Abstract: AdaBoost [3] minimizes an upper error bound which is an exponential function of the margin on the training set [14]. However, the ultimate goal in applications of pattern classification is always minimum error rate. On the other hand, AdaBoost needs an effective procedure for learning weak classifiers, which by itself is difficult especially for high dimensional data. In this paper, we present a novel procedure, called FloatBoost, for learning a better boosted classifier. FloatBoost uses a backtrack mechanism after each iteration of AdaBoost to remove weak classifiers which cause higher error rates. The resulting float-boosted classifier consists of fewer weak classifiers yet achieves lower error rates than AdaBoost in both training and test. We also propose a statistical model for learning weak classifiers, based on a stagewise approximation of the posterior using an overcomplete set of scalar features. Experimental comparisons of FloatBoost and AdaBoost are provided through a difficult classification problem, face detection, where the goal is to learn from training examples a highly nonlinear classifier to differentiate between face and nonface patterns in a high dimensional space. The results clearly demonstrate the promises made by FloatBoost over AdaBoost.

4 0.83675176 103 nips-2002-How Linear are Auditory Cortical Responses?

Author: Maneesh Sahani, Jennifer F. Linden

5 0.83245766 59 nips-2002-Constraint Classification for Multiclass Classification and Ranking

Author: Sariel Har-Peled, Dan Roth, Dav Zimak

Abstract: The constraint classification framework captures many flavors of multiclass classification including winner-take-all multiclass classification, multilabel classification and ranking. We present a meta-algorithm for learning in this framework that learns via a single linear classifier in high dimension. We discuss distribution independent as well as margin-based generalization bounds and present empirical and theoretical evidence showing that constraint classification benefits over existing methods of multiclass classification.

6 0.82743061 56 nips-2002-Concentration Inequalities for the Missing Mass and for Histogram Rule Error

7 0.82362211 129 nips-2002-Learning in Spiking Neural Assemblies

8 0.80483598 50 nips-2002-Circuit Model of Short-Term Synaptic Dynamics

9 0.80317897 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex

10 0.80000055 46 nips-2002-Boosting Density Estimation

11 0.79383063 102 nips-2002-Hidden Markov Model of Cortical Synaptic Plasticity: Derivation of the Learning Rule

12 0.78888392 43 nips-2002-Binary Coding in Auditory Cortex

13 0.78848565 110 nips-2002-Incremental Gaussian Processes

14 0.78513038 41 nips-2002-Bayesian Monte Carlo

15 0.77560961 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior

16 0.77076417 199 nips-2002-Timing and Partial Observability in the Dopamine System

17 0.7700057 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits

18 0.76421559 81 nips-2002-Expected and Unexpected Uncertainty: ACh and NE in the Neocortex

19 0.76182675 180 nips-2002-Selectivity and Metaplasticity in a Unified Calcium-Dependent Model

20 0.7557916 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise