nips nips2002 nips2002-103 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus. How well can such a model account for neural responses at the very first stages of auditory cortical processing? To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses.
Reference: text
sentIndex sentText sentNum sentScore
1 By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. [sent-11, score-0.436]
2 Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus. [sent-12, score-0.666]
3 How well can such a model account for neural responses at the very first stages of auditory cortical processing? [sent-13, score-0.578]
4 To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. [sent-14, score-0.852]
5 We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses. [sent-15, score-0.66]
6 1 Introduction A number of recent studies have suggested that spectrotemporal receptive field (STRF) models [1, 2], which are linear in the stimulus spectrogram, can describe the spiking responses of auditory cortical neurons quite well [3, 4]. [sent-16, score-0.866]
7 At the same time, other authors have pointed out significant non-linearities in auditory cortical responses [5, 6], or have emphasized both linear and non-linear response components [7, 8]. [sent-17, score-0.759]
8 Some of the differences in these results may well arise from differences in the stimulus ensembles used to evoke neuronal responses. [sent-18, score-0.308]
9 However, even for a single type of stimulus, it is extremely difficult to put a number to the proportion of the response that is linear or non-linear, and so to judge the relative contributions of the two components to the stimulus-evoked activity. [sent-19, score-0.26]
10 The difficulty arises because repeated presentations of identical stimulus sequences evoke highly variable responses from neurons at intermediate stages of perceptual systems, even in anaesthetized animals. [sent-20, score-0.562]
11 While this variability may reflect meaningful changes in the internal state of the animal or may be completely random, from the point of view of modelling the relationship between stimulus and neural response it must be treated as noise. [sent-21, score-0.456]
12 As previous authors have noted [9, 10], this noise complicates the evaluation of the performance of a particular class of stimulus-response function (SRF) model (for example, the class of STRF models) in two ways. [sent-22, score-0.308]
13 Perfect prediction of a noisy response is impossible, even in principle, and since the true underlying relationship between stimulus and neural response is unknown, it is unclear what degree of partial prediction could possibly be expected. [sent-24, score-0.796]
14 This is the ratio of the reduction in variance achieved by the regression model (the total variance of the outputs minus the variance of the residuals) to the total variance of the outputs. [sent-27, score-0.374]
15 Moreover, the reduction of variance on the training data, which appears in the numerator of this ratio, includes some “explanation” of noise due to overfitting. [sent-29, score-0.244]
16 Here, we develop analytic techniques that overcome the systematic noise-related biases in the usual variance measures, and thus obtain, for a population of neurons, a quantitative estimate of the fraction of stimulus-related response captured by a given class of models. [sent-35, score-0.642]
17 This statistical framework may be applicable to analysis of response functions for many types of neural data, ranging from intracellular recordings to imaging measurements. [sent-36, score-0.438]
18 We apply it to extracellular recordings from rodent auditory cortex, quantifying the degree to which STRF models can account for neuronal responses to dynamic random chord stimuli. [sent-37, score-0.842]
19 We find that on average less than half of the reliable stimulus-related power in these responses can be captured by spectrogram-linear STRF models. [sent-38, score-0.562]
20 2 Signal power The analysis assumes that the data consist of spike trains or other neural measurements continuously recorded during presentation of a long, complex, rapidly varying stimulus. [sent-39, score-0.508]
21 In the auditory experiment considered here, the discretization was set by the duration of regularly clocked sound pulses of fixed length; in a visual experiment, the discretization might be the frame rate of a movie. [sent-41, score-0.503]
22 The neural response can then be measured with the same level of precision, counting action potentials (or integrating measurements) to estimate a response rate for each time bin, to obtain a response vector $r$. [sent-42, score-0.724]
23 We propose to measure model performance in terms of the fraction of response power predicted successfully, where “power” is used in the sense of average squared deviation from the mean: $P(r) = \langle (r_t - \langle r_t \rangle)^2 \rangle$ ($\langle \cdot \rangle$ denoting averages over time). [sent-43, score-0.723]
24 As argued above, only some part of the total response power is predictable, even in principle; fortunately, this signal power can be estimated by combining repeated responses to the same stimulus sequence. [sent-48, score-1.505]
25 Suppose we have $N$ responses $r^{(n)} = \mu + \eta^{(n)}$, where $\mu$ is the common, stimulus-dependent component (signal) in the response and $\eta^{(n)}$ is the (zero-mean) noise component of the response in the $n$th trial. [sent-50, score-0.735]
26 The expected power in each response is given by $P(r^{(n)}) \overset{E}{=} P(\mu) + P(\eta^{(n)})$ (where the symbol $\overset{E}{=}$ means “equal in expectation”). [sent-51, score-0.594]
27 This simple relationship depends only on the noise component having been defined to have zero mean, and holds even if the variance or other property of the noise depends on the signal strength. [sent-52, score-0.403]
28 We now construct two trial-averaged quantities, similar to the sum-of-squares terms used in the analysis of variance (ANOVA) [12]: the power of the average response, and the average power per response. [sent-53, score-0.802]
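29 Using $\overline{\,\cdot\,}$ to indicate averages over the $N$ trials, these are $P(\overline{r})$, with $\overline{r} = \tfrac{1}{N}\sum_{n} r^{(n)}$, and $\overline{P(r)} = \tfrac{1}{N}\sum_{n} P(r^{(n)})$. Assuming the noise in each trial is independent (although the noise in different time bins within a trial need not be), we have: $P(\overline{r}) \overset{E}{=} P(\mu) + \tfrac{1}{N}\overline{P(\eta)}$ and $\overline{P(r)} \overset{E}{=} P(\mu) + \overline{P(\eta)}$. [sent-54, score-0.31]
30 This estimator is unbiased, provided only that the noise distribution has defined first and second moments and is independent between trials, as can be verified by explicitly calculating its expected value. [sent-57, score-0.204]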
31 However, since each of the power terms in (1) is a mean over at least as many numbers as there are time bins, the central limit theorem suggests that the estimator will be approximately normally distributed for recordings that are considerably longer than the time-scale of noise correlation (in the experiment considered here, ). [sent-59, score-0.679]
32 Thus, the variance of the estimator depends only on the first and second moments of the response distribution; substitution of data-derived estimates of these moments into (2) yields a standard error bar for the estimator. [sent-62, score-0.395]
33 In this way we have obtained an estimate (with corresponding uncertainty) of the maximum possible signal power that any model could accurately predict, without having assumed any particular distribution or time-independence of the noise. [sent-63, score-0.534]
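The signal-power estimate described in this section can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: it assumes the repeated responses are stored as an array of shape (trials, time bins), and the function names `power` and `signal_and_noise_power` are hypothetical. The estimator shown is the one obtained by solving the two expectation relations (power of the trial-averaged response, and trial-averaged power) for the signal power, so it needs at least two trials.

```python
import numpy as np


def power(x):
    """Average squared deviation from the mean, taken along the time axis."""
    x = np.asarray(x, dtype=float)
    return np.mean((x - x.mean(axis=-1, keepdims=True)) ** 2, axis=-1)


def signal_and_noise_power(responses):
    """Estimate signal and noise power from repeated responses.

    responses: array of shape (n_trials, n_bins), one row per repeat of the
    same stimulus (an assumed layout, not specified by the source).
    """
    responses = np.asarray(responses, dtype=float)
    n_trials = responses.shape[0]
    if n_trials < 2:
        raise ValueError("at least two trials are needed")

    power_of_mean = power(responses.mean(axis=0))  # power of the average response
    mean_power = power(responses).mean()           # average power per response

    # E[power_of_mean] = P(signal) + P(noise) / N
    # E[mean_power]    = P(signal) + P(noise)
    # Solving the pair for P(signal) gives the expression below.
    signal = (n_trials * power_of_mean - mean_power) / (n_trials - 1)
    noise = mean_power - signal
    return float(signal), float(noise)
```

The estimate can come out negative for a noisy recording with little stimulus-locked response, which is what one expects from an unbiased estimator of a quantity that is close to zero.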
34 3 Extrapolating Model Performance To compare the performance of an estimated SRF model to this maximal value, we must determine the amount of response power successfully predicted by the model. [sent-64, score-0.721]
35 This is not necessarily the power of the predicted response, since the prediction may be inaccurate. [sent-65, score-0.444]
36 Instead, the residual power in the difference between a measured response $r$ and the predicted response $\hat{r}$ to the same stimulus, $P(r - \hat{r})$, is taken as an estimate of the error power. [sent-66, score-0.902]
37 (The measured response used for this evaluation, and the stimulus which elicited it, may or may not also have been used to identify the parameters of the SRF model being evaluated; see explanation of training and test predictive powers below.) [sent-67, score-0.886]
38 The difference between the power in the observed response and the error power gives the predictive power of the model; it is this value that can be compared to the estimated signal power. [sent-68, score-1.565]
39 To be able to describe more than one neuron, an SRF model class must contain parameters that can be adapted to each case. [sent-69, score-0.58]
40 Ideally, the power of the model class to describe a population of neurons would be judged using parameters that produced models closest to the true SRFs (the ideal models), but we do not have a priori knowledge of those parameters. [sent-70, score-0.734]
41 One way to choose SRF model parameters is to minimize the mean squared error (MSE) between the neural response in the training data and the model prediction for the same stimulus; for example, the Wiener kernel minimizes the MSE for a model based on a finite impulse response filter of fixed length. [sent-72, score-0.674]
42 This MSE is identical to the error power that would be obtained when the training data themselves are used as the reference measured response . [sent-73, score-0.641]
43 Thus, by minimizing the MSE, we maximize the predictive power evaluated against the training data. [sent-74, score-0.65]
44 The resulting maximum value, hereafter the training predictive power, will overestimate the predictive ability of the ideal model, since the minimum-MSE parameters will be overfit to the training data. [sent-75, score-0.639]
45 (Overfitting is inevitable, because model estimates based on finite data will always capture some stimulus-independent response variability. [sent-76, score-0.358]
46 ) More precisely, the expected value of the training predictive power is an upper bound on the true predictive power of the model class; we therefore refer to the training predictive power itself as an upper estimate of the SRF model performance. [sent-77, score-2.117]
47 For any one recording, the predictive power of the ideal SRF model of a particular class can only be bracketed between these upper and lower estimates (that is, between the training and cross-validation predictive powers). [sent-80, score-1.178]
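The bracketing between training and cross-validation predictive powers can be illustrated with a small sketch. Several things here are assumptions made for illustration only: the model is fit by ordinary least squares (the source mentions an ARD-based fit, which is not reproduced here), the design matrix holding the stimulus features is hypothetical, the folds are contiguous blocks of time bins, and the per-fold predictive powers are simply averaged.

```python
import numpy as np


def power(x):
    """Average squared deviation from the mean."""
    x = np.asarray(x, dtype=float)
    return float(np.mean((x - x.mean()) ** 2))


def predictive_power(measured, predicted):
    """Power in the measured response minus the power of the residual."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return power(measured) - power(measured - predicted)


def _with_offset(design):
    # Append a constant column so the linear model can fit a baseline rate.
    return np.column_stack([design, np.ones(len(design))])


def fit_linear(design, response):
    """Least-squares weights (minimum MSE on the data passed in)."""
    weights, *_ = np.linalg.lstsq(_with_offset(design), response, rcond=None)
    return weights


def predict_linear(design, weights):
    return _with_offset(design) @ weights


def training_and_cv_power(design, response, n_folds=10):
    """Return (training predictive power, mean cross-validated predictive power)."""
    design = np.asarray(design, dtype=float)
    response = np.asarray(response, dtype=float)

    # Upper estimate: fit and evaluate on the same data.
    weights = fit_linear(design, response)
    train_pp = predictive_power(response, predict_linear(design, weights))

    # Lower estimate: fit on all folds but one, evaluate on the held-out fold.
    cv_pps = []
    for test_idx in np.array_split(np.arange(len(response)), n_folds):
        train_mask = np.ones(len(response), dtype=bool)
        train_mask[test_idx] = False
        w_k = fit_linear(design[train_mask], response[train_mask])
        cv_pps.append(
            predictive_power(response[test_idx], predict_linear(design[test_idx], w_k))
        )
    return train_pp, float(np.mean(cv_pps))
```

With noisy data the first value tends to exceed the second, because the fitted weights absorb some of the noise in the data they were trained on.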
48 As the noise in the recording grows, the model parameters will overfit more and more to the noise, and hence both estimates will grow looser. [sent-81, score-0.333]
49 As such, the estimates may not usefully constrain the predictive power on a particular recording. [sent-83, score-0.69]
50 However, assuming that the predictive power of a single model class is similar for a population of similar neurons, the noise dependence can be exploited to tighten the estimates when applied to the population as a whole, by extrapolating within the population to the zero noise point. [sent-84, score-1.584]
51 This extrapolation allows us to answer the sort of question posed at the outset: how well, in an absolute sense, can a particular SRF model class account for the responses of a population of neurons? [sent-85, score-0.451]
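The population-level extrapolation to the zero-noise point can be sketched as a regression across recordings. The details below are assumptions, since the source text does not give them: predictive and noise powers are normalized by each recording's estimated signal power, the fit is a straight line, and the function name `extrapolate_to_zero_noise` is hypothetical. The intercept of the fitted line is read off as the extrapolated value at zero noise.

```python
import numpy as np


def extrapolate_to_zero_noise(predictive_power, noise_power, signal_power):
    """Fit normalized predictive power against normalized noise power.

    All three arguments hold one value per recording. Returns the pair
    (intercept, slope); the intercept is the value extrapolated to zero noise.
    """
    signal_power = np.asarray(signal_power, dtype=float)
    x = np.asarray(noise_power, dtype=float) / signal_power       # normalized noise power
    y = np.asarray(predictive_power, dtype=float) / signal_power  # normalized predictive power

    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(intercept), float(slope)
```

Applying the same function once to the training predictive powers and once to the cross-validation predictive powers would give one extrapolated value for the upper estimate and one for the lower.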
52 4 Experimental Methods Extracellular neural responses were collected from the primary auditory cortex of rodents during presentation of dynamic random chord stimuli. [sent-86, score-0.643]
53 Animals (6 CBA/CaJ mice and 4 Long-Evans rats) were anaesthetized with either ketamine/medetomidine or sodium pentobarbital, and a skull fragment over auditory cortex was removed; all surgical and experimental procedures conformed to protocols approved by the UCSF Committee on Animal Research. [sent-87, score-0.479]
54 Neural responses (205 recordings collected from 68 recording sites) were recorded in the thalamo-recipient layers of the left auditory cortex while the stimulus (see below) was presented to the right ear. [sent-89, score-0.77]
55 Figure 1: Signal power in neural responses. (Axes: signal power in spikes²/bin; number of recordings.) [sent-97, score-0.548]
57 The dynamic random chord stimulus used in the auditory experiments was similar to that used in a previous study [15], except that the intensity of component tone pulses was variable. [sent-101, score-0.774]
58 The times, frequencies and sound intensities of the pulses were chosen randomly and independently from 20 ms bins in time, 1/12 octave bins covering either 2–32 or 25–100 kHz in frequency, and 5 dB SPL bins covering 25–70 dB SPL in level. [sent-103, score-0.355]
59 At any time point, the stimulus averaged two tone pulses per octave, with an expected loudness of approximately 73 dB SPL for the 2–32 kHz stimulus and 70 dB SPL for the 25–100 kHz stimulus. [sent-104, score-0.62]
60 At each recording site, the 2–32 kHz stimulus was repeated 20 times, and the 25–100 kHz stimulus was repeated 10 times. [sent-106, score-0.61]
61 In this case, an additional parameter reflecting the average noise in the response was also estimated. [sent-110, score-0.357]
62 Models incorporating static output non-linearities were fit by kernel regression between the output of the linear model (fit by ARD) and the training data. [sent-111, score-0.213]
63 Cross-validation for lower estimates on model predictive power used 10 disjoint splits into 9/10 training data and 1/10 test data. [sent-114, score-0.817]
64 Extrapolation of the predictive powers in the population, shown in Figs. [sent-115, score-0.338]
65 5 Results We used the techniques described above to ask how accurate a description of auditory cortex responses could be provided by the STRF. [sent-119, score-0.551]
66 Recordings were binned to match the discretization rate of the stimulus and the signal power estimated using equation (1). [sent-120, score-0.818]
67 Fig. 1 shows the distribution of signal powers obtained, as a scatter plot against the estimated noise power and as a histogram. [sent-122, score-0.826]
68 1, had signal power greater than one standard error above zero. [sent-125, score-0.443]
69 The training predictive power of this model provided the upper estimate for the predictive power of the model class. [sent-128, score-1.428]
70 The cross-validation predictive power of these estimates served as the lower estimates of the model class performance. [sent-132, score-0.903]
71 Fig. 2 shows the upper and lower estimates for the predictive power of the class of linear STRF models in our population of rodent auditory cortex recordings, as a function of the estimated noise level in each recording. [sent-134, score-1.6]
72 The divergence of the estimates at higher noise levels, described above, is evident. [sent-135, score-0.218]
73 At low noise levels the estimates do not converge perfectly, the extrapolated values being for the upper estimate and for the lower (intervals are standard errors). [sent-136, score-0.398]
74 Simulated data were produced by generating Poisson spike trains with mean rates as predicted by the ARD-estimated models for real cortical recordings, and rectifying so that negative predictions were treated as zero. [sent-140, score-0.261]
75 Simulated spike trains were then binned and analyzed in the same manner as real spike trains. [sent-141, score-0.213]
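The simulation step can be sketched as follows. This is an illustrative simplification, not the authors' procedure: it assumes the model prediction is already expressed as an expected spike count per time bin, and it draws binned Poisson counts directly instead of generating spike times and binning them afterwards. The function name `simulate_spike_counts` is hypothetical.

```python
import numpy as np


def simulate_spike_counts(predicted_rate, n_trials, rng=None):
    """Draw Poisson spike counts per bin from a rectified rate prediction.

    predicted_rate: expected spikes per bin from a fitted model; values may be
    negative and are rectified to zero before sampling.
    rng: seed or numpy Generator, for reproducibility.
    """
    rng = np.random.default_rng(rng)
    rate = np.clip(np.asarray(predicted_rate, dtype=float), 0.0, None)
    # One row per simulated trial, one column per time bin.
    return rng.poisson(lam=rate, size=(n_trials, rate.size))
```

The resulting array has the same (trials, time bins) layout as repeated real recordings, so it can be passed through the same analysis.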
76 Thus, the analysis correctly reports that virtually all of the response power in these simulations is linearly predictable from the stimulus spectrogram, attesting to the reliability of the extrapolated estimates for the real data in Fig. [sent-145, score-1.048]
77 Figure 2: Evaluation of STRF predictive power in auditory cortex. (Axes: normalized linearly predictable power; normalized noise power.) [sent-149, score-1.394]
79 Some portion of the scatter of the points about the population average lines in Fig. 2 reflects genuine variability in the population, and so the extrapolated scatter at zero noise is also of interest. [sent-159, score-0.271]
81 Intervals containing at least 50% of the population distribution for the cortical data are for the upper estimate and for the lower estimate (assuming normal scatter). [sent-161, score-0.423]
82 These will be overestimates of the spread in the underlying population distribution because of additional scatter from estimation noise. [sent-162, score-0.271]
83 The variability of STRF predictive power in the population appears unimodal, and the hypothesis that the distributions of the deviations from the regression lines are zero-mean normal in both cases cannot be rejected (Kolmogorov-Smirnov test, ). [sent-163, score-0.836]
84 Thus the treatment of these recordings as coming from a single homogeneous population is reasonable. [sent-164, score-0.348]
85 3, there is a small amount of downward bias and population scatter due to the varying amounts of rectification in the simulations; however, most of the observed scatter is due to estimation error resulting from the incorporation of Poisson noise. [sent-166, score-0.374]
86 The resulting cross-validation predictive powers were compared to those of the spectrogram-linear model (data not shown). [sent-169, score-0.383]
87 The addition of a static output nonlinearity contributed very little to the predictive power of the STRF model class. [sent-170, score-0.704]
88 Although the difference in model performance was significant ( , Wilcoxon signed rank test), the mean normalized predictive power increase with the addition of a static output non-linearity was very small (0. [sent-171, score-0.746]
89 6 Conclusions We have demonstrated a novel way to evaluate the fraction of response power in a population of neurons that can be captured by a particular class of SRF models. [sent-173, score-0.964]
90 The confounding effects of noise on evaluation of model performance and estimation of model parameters are overcome by two key analytic steps. [sent-174, score-0.261]
91 First, multiple measurements of neural responses to the same stimulus are used to obtain an unbiased estimate of the fraction of the response variance that is predictable in principle, against which the predictive power of a model may be judged. [sent-175, score-1.593]
92 Data with these features are commonly encountered in sensory neuroscience, where the sensory stimulus can be reliably repeated. [sent-178, score-0.304]
93 Applying this technique to analysis of the primary auditory cortex, we find that spectrogram-linear response components can account for only 18% to 40% (on average) of the power in extracellular responses to dynamic random chord stimuli. [sent-180, score-1.329]
94 Further, elaborated models that append a static output non-linearity to the linear filter are barely more effective at predicting responses to novel stimuli than is the linear model class alone. [sent-181, score-0.299]
95 Previous studies of auditory cortex have reached widely varying conclusions regarding the degree of linearity of neural responses. [sent-182, score-0.514]
96 Such discrepancies may indicate that response properties are critically dependent on the statistics of the stimulus ensemble [6, 5, 10], or that cortical response linearity differs between species. [sent-183, score-0.852]
97 Alternatively, as previous measures of linearity have been biased by noise, the divergent estimates might also have arisen from variation in the level of noise power across studies. [sent-184, score-0.667]
98 Our approach represents the first evaluation of auditory cortex response predictability that is free of this potential noise confound. [sent-185, score-0.796]
99 The high degree of response non-linearity we observe may well be a characteristic of all auditory cortical responses, given the many known non-linearities in the peripheral and central auditory systems [17]. [sent-186, score-0.933]
100 Alternatively, it might be unique to auditory cortex responses to noisy sounds like dynamic random chord stimuli, or else may be general to all stimulus ensembles and all sensory cortices. [sent-187, score-0.954]
wordName wordTfidf (topN-words)
[('power', 0.368), ('auditory', 0.292), ('srf', 0.282), ('strf', 0.239), ('predictive', 0.235), ('stimulus', 0.23), ('response', 0.226), ('recordings', 0.18), ('population', 0.168), ('responses', 0.152), ('noise', 0.131), ('cortex', 0.107), ('scatter', 0.103), ('powers', 0.103), ('pulses', 0.099), ('hear', 0.093), ('chord', 0.092), ('cortical', 0.089), ('estimates', 0.087), ('predictable', 0.086), ('linearity', 0.081), ('signal', 0.075), ('spl', 0.074), ('recording', 0.07), ('johannesma', 0.069), ('nelken', 0.069), ('neurons', 0.066), ('khz', 0.066), ('variance', 0.066), ('regression', 0.065), ('mse', 0.065), ('tone', 0.061), ('binned', 0.06), ('extrapolated', 0.06), ('static', 0.056), ('ard', 0.055), ('linden', 0.055), ('spike', 0.053), ('spectrogram', 0.051), ('sahani', 0.051), ('aertsen', 0.051), ('unbiased', 0.051), ('bins', 0.048), ('res', 0.048), ('fraction', 0.048), ('training', 0.047), ('bin', 0.047), ('trains', 0.047), ('rodent', 0.046), ('spectrogramlinear', 0.046), ('class', 0.046), ('extracellular', 0.046), ('ms', 0.046), ('estimate', 0.046), ('estimated', 0.046), ('model', 0.045), ('ensembles', 0.044), ('normalized', 0.042), ('captured', 0.042), ('ideal', 0.041), ('moments', 0.041), ('determination', 0.041), ('repeated', 0.04), ('eggermont', 0.04), ('shamma', 0.04), ('anaesthetized', 0.04), ('mice', 0.04), ('anova', 0.04), ('extrapolation', 0.04), ('prut', 0.04), ('evaluation', 0.04), ('measurements', 0.04), ('prediction', 0.04), ('discretization', 0.039), ('upper', 0.039), ('rotman', 0.037), ('spectrotemporal', 0.037), ('extrapolating', 0.037), ('abeles', 0.037), ('vaadia', 0.037), ('ucsf', 0.037), ('sensory', 0.037), ('predicted', 0.036), ('predictions', 0.036), ('lower', 0.035), ('ear', 0.034), ('maneesh', 0.034), ('rats', 0.034), ('evoke', 0.034), ('overestimate', 0.034), ('sound', 0.034), ('contributions', 0.034), ('degree', 0.034), ('simulated', 0.032), ('octave', 0.032), ('intracellular', 0.032), ('estimator', 0.032), ('intervals', 0.031), 
('recti', 0.031), ('db', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999893 103 nips-2002-How Linear are Auditory Cortical Responses?
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus. How well can such a model account for neural responses at the very first stages of auditory cortical processing? To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses.
2 0.50041664 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: An essential step in understanding the function of sensory nervous systems is to characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. One increasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses of one or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal firing. The estimation techniques usually employed, such as Wiener filtering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters that control the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.
3 0.44564316 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex
Author: Christian K. Machens, Michael Wehr, Anthony M. Zador
Abstract: How do cortical neurons represent the acoustic environment? This question is often addressed by probing with simple stimuli such as clicks or tone pips. Such stimuli have the advantage of yielding easily interpreted answers, but have the disadvantage that they may fail to uncover complex or higher-order neuronal response properties. Here we adopt an alternative approach, probing neuronal responses with complex acoustic stimuli, including animal vocalizations and music. We have used in vivo whole cell methods in the rat auditory cortex to record subthreshold membrane potential fluctuations elicited by these stimuli. Whole cell recording reveals the total synaptic input to a neuron from all the other neurons in the circuit, instead of just its output—a sparse binary spike train—as in conventional single unit physiological recordings. Whole cell recording thus provides a much richer source of information about the neuron’s response. Many neurons responded robustly and reliably to the complex stimuli in our ensemble. Here we analyze the linear component—the spectrotemporal receptive field (STRF)—of the transformation from the sound (as represented by its time-varying spectrogram) to the neuron’s membrane potential. We find that the STRF has a rich dynamical structure, including excitatory regions positioned in general accord with the prediction of the simple tuning curve. We also find that in many cases, much of the neuron’s response, although deterministically related to the stimulus, cannot be predicted by the linear component, indicating the presence of as-yet-uncharacterized nonlinear response properties.
4 0.28342795 43 nips-2002-Binary Coding in Auditory Cortex
Author: Michael R. Deweese, Anthony M. Zador
Abstract: Cortical neurons have been reported to use both rate and temporal codes. Here we describe a novel mode in which each neuron generates exactly 0 or 1 action potentials, but not more, in response to a stimulus. We used cell-attached recording, which ensured single-unit isolation, to record responses in rat auditory cortex to brief tone pips. Surprisingly, the majority of neurons exhibited binary behavior with few multi-spike responses; several dramatic examples consisted of exactly one spike on 100% of trials, with no trial-to-trial variability in spike count. Many neurons were tuned to stimulus frequency. Since individual trials yielded at most one spike for most neurons, the information about stimulus frequency was encoded in the population, and would not have been accessible to later stages of processing that only had access to the activity of a single unit. These binary units allow a more efficient population code than is possible with conventional rate coding units, and are consistent with a model of cortical processing in which synchronous packets of spikes propagate stably from one neuronal population to the next. 1 Binary coding in auditory cortex We recorded responses of neurons in the auditory cortex of anesthetized rats to pure-tone pips of different frequencies [1, 2]. Each pip was presented repeatedly, allowing us to assess the variability of the neural response to multiple presentations of each stimulus. We first recorded multi-unit activity with conventional tungsten electrodes (Fig. 1a). The number of spikes in response to each pip fluctuated markedly from one trial to the next (Fig. 1e), as though governed by a random mechanism such as that generating the ticks of a Geiger counter. 
Figure 1: Multi-unit spiking activity was highly variable, but single units obeyed binomial statistics. (a) Multi-unit spike rasters from a conventional tungsten electrode recording showed high trial-to-trial variability in response to ten repetitions of the same 50 msec pure tone stimulus (bottom). Darker hash marks indicate spike times within the response period, which were used in the variability analysis. (b) Spikes recorded in cell-attached mode were easily identified from the raw voltage trace (top) by applying a high-pass filter (bottom) and thresholding (dark gray line). Spike times (black squares) were assigned to the peaks of suprathreshold segments. (c) Spike rasters from a cell-attached recording of single-unit responses to 25 repetitions of the same tone consisted of exactly one well-timed spike per trial (latency standard deviation = 1.0 msec), unlike the multi-unit responses (Fig. 1a). Under the Poisson assumption, this would have been highly unlikely (P ~ 10^-11). (d) The same neuron as in Fig. 1c responds with lower probability to repeated presentations of a different tone, but there are still no multi-spike responses. (e) We quantified response variability for each tone by dividing the variance in spike count by the mean spike count across all trials for that tone. Response variability for multi-unit tungsten recording (open triangles) was high for each of the 29 tones (out of 32) that elicited at least one spike on one trial. All but one point lie above one (horizontal gray line), which is the value produced by a Poisson process with any constant or time varying event rate. Single unit responses recorded in cell-attached mode were far less variable (filled circles). Ninety one percent (10/11) of the tones that elicited at least one spike from this neuron produced no multi-spike responses in 25 trials; the corresponding points fall on the diagonal line between (0,1) and (1,0), which provides a strict lower bound on the variability for any response set with a mean between 0 and 1. No point lies above one.
Highly variable responses such as these, which are at least as variable as a Poisson process, are the norm in the cortex [3-7], and have contributed to the widely held view that cortical spike trains are so noisy that only the average firing rate can be used to encode stimuli. Because we were recording the activity of an unknown number of neurons, we could not be sure whether the strong trial-to-trial fluctuations reflected the underlying variability of the single units. We therefore used an alternative technique, cell-attached recording with a patch pipette [8, 9], in order to ensure single unit isolation (Fig. 1b). This recording mode minimizes both of the main sources of error in spike detection: failure to detect a spike in the unit under observation (false negatives), and contamination by spikes from nearby neurons (false positives). It also differs from conventional extracellular recording methods in its selection bias: With cell-attached recording neurons are selected solely on the basis of the experimenter’s ability to form a seal, rather than on the basis of neuronal activity and responsiveness to stimuli as in conventional methods. Surprisingly, single unit responses were far more orderly than suggested by the multi-unit recordings; responses typically consisted of either 0 or 1 spikes per trial, and not more (Fig. 1c-e). In the most dramatic examples, each presentation of the same tone pip elicited exactly one spike (Fig. 1c). 
In most cases, however, some presentations failed to elicit a spike (Fig. 1d). Although low-variability responses have recently been observed in the cortex [10, 11] and elsewhere [12, 13], the binary behavior described here has not previously been reported for cortical neurons.

The majority of the neurons (59%) in our study for which statistical significance could be assessed (at the p<0.001 significance level; see Fig. 2, caption) showed noisy binary behavior: “binary” because neurons produced either 0 or 1 spikes, and “noisy” because some stimuli elicited both single spikes and failures. In a substantial fraction of neurons, however, the responses showed more variability. We found no correlation between neuronal variability and cortical layer (inferred from the depth of the recording electrode), cortical area (inside vs. outside of area A1) or depth of anesthesia. Moreover, the binary mode of spiking was not due to the brevity (25 msec) of the stimuli; responses that were binary for short tones were comparably binary when longer (100 msec) tones were used (Fig. 2b).

[Figure 2. Panel a: response variance/mean (spikes/trial) versus mean response (spikes/trial), N = 3055 response sets, with points marked as not assessable, not significant, or significant (p<0.001). Panel b: rasters for 28 kHz tones of 100 msec and 25 msec duration versus time (msec).]
Figure 2: Half of the neuronal population exhibited binary firing behavior. a Of the 3055 sets of responses to 25 msec tones, 2588 (gray points) could not be assessed for significance at the p<0.001 level, 225 (open circles) were not significantly binary, and 242 were significantly binary (black points; see Identification methods for group statistics below). All points were jittered slightly so that overlying points could be seen in the figure. 2165 response sets contained no multi-spike responses; the corresponding points fell on the line from [0,1] to [1,0]. b The binary nature of single unit responses was insensitive to tone duration, even for frequencies that elicited the largest responses. Twenty additional spike rasters from the same neuron (and tone frequency) as in Fig. 1c contain no multi-spike responses whether in response to 100 msec tones (above) or 25 msec tones (below). Across the population, binary responses were as prevalent for 100 msec tones as for 25 msec tones (see Identification methods for group statistics).

In many neurons, binary responses showed high temporal precision, with latencies sometimes exhibiting standard deviations as low as 1 msec (Fig. 3; see also Fig. 1c), comparable to previous observations in the auditory cortex [14], and only slightly more precise than in monkey visual area MT [5]. High temporal precision was positively correlated with high response probability (Fig. 3).

[Figure 3. Panels a (N = 32 tones) and b (N = (44 cells)x(32 tones)): jitter (msec) versus mean response (spikes/trial).]
Figure 3: Trial-to-trial variability in latency of response to repeated presentations of the same tone decreased with increasing response probability. a Scatter plot of standard deviation of latency vs. mean response for 25 presentations each of 32 tones for a different neuron than in Figs. 1 and 2 (gray line is best linear fit). Rasters from 25 repeated presentations of a low response tone (upper left inset, which corresponds to left-most data point) display much more variable latencies than rasters from a high response tone (lower right inset; corresponds to right-most data point). b The negative correlation between latency variability and response size was present on average across the population of 44 neurons described in Identification methods for group statistics (linear fit, gray).
The low trial-to-trial variability ruled out the possibility that the firing statistics could be accounted for by a simple rate-modulated Poisson process (Fig. 4a1,a2). In other systems, low variability has sometimes been modeled as a Poisson process followed by a post-spike refractory period [10, 12]. In our system, however, the range in latencies of evoked binary responses was often much greater than the refractory period, which could not have been longer than the 2 msec inter-spike intervals observed during epochs of spontaneous spiking, indicating that binary spiking did not result from any intrinsic property of the spike generating mechanism (Fig. 4a3). Moreover, a single stimulus-evoked spike could suppress subsequent spikes for as long as hundreds of milliseconds (e.g. Figs. 1d,4d), supporting the idea that binary spiking arises through a circuit-level, rather than a single-neuron, mechanism. Indeed, the fact that this suppression is observed even in the cortex of awake animals [15] suggests that binary spiking is not a special property of the anesthetized state. It seems surprising that binary spiking in the cortex has not previously been remarked upon. In the auditory cortex the explanation may be in part technical: Because firing rates in the auditory cortex tend to be low, multi-unit recording is often used to maximize the total amount of data collected. Moreover, our use of cell-attached recording minimizes the usual bias toward responsive or active neurons. Such explanations are not, however, likely to account for the failure to observe binary spiking in the visual cortex, where spike count statistics have been scrutinized more closely [3-7]. One possibility is that this reflects a fundamental difference between the auditory and visual systems. 
An alternative interpretation, and one that we favor, is that the difference rests not in the sensory modality, but instead in the difference between the stimuli used. In this view, the binary responses may not be limited to the auditory cortex; neurons in visual and other sensory cortices might exhibit similar responses to the appropriate stimuli. For example, the tone pips we used might be the auditory analog of a brief flash of light, rather than the oriented moving edges or gratings usually used to probe the primary visual cortex. Conversely, auditory stimuli analogous to edges or gratings [16, 17] may be more likely to elicit conventional, rate-modulated Poisson responses in the auditory cortex. Indeed, there may be a continuum between binary and Poisson modes. Thus, even in conventional rate-modulated responses, the first spike is often privileged in that it carries most of the information in the spike train [5, 14, 18]. The first spike may be particularly important as a means of rapidly signaling stimulus transients.

[Figure 4. Panels: a1, a2 (Poisson simulation) and a3 (Poisson with refractory period), rasters and PSTH versus time (msec); b, response probability versus time (msec); c, ratio of pool sizes versus mean spike count per neuron; d, response probability versus tone frequency (kHz), N = 32 tones.]
Figure 4: a The lack of multi-spike responses elicited by the neuron shown in Fig. 3a was not due to an absolute refractory period, since the range of latencies for many tones, like that shown here, was much greater than any reasonable estimate for the neuron’s refractory period. (a1) Experimentally recorded responses. (a2) Using the smoothed post stimulus time histogram (PSTH; bottom) from the set of responses in Fig. 4a, we generated rasters under the assumption of Poisson firing. In this representative example, four double-spike responses (arrows at left) were produced in 25 trials. (a3) We then generated rasters assuming that the neuron fired according to a Poisson process subject to a hard refractory period of 2 msec. Even with a refractory period, this representative example includes one triple- and three double-spike responses. The minimum interspike-interval during spontaneous firing events was less than two msec for five of our neurons, so 2 msec is a conservative upper bound for the refractory period. b Spontaneous activity is reduced following high-probability responses. The PSTH (top; 0.25 msec bins) of the combined responses from the 25% (8/32) of tones that elicited the largest responses from the same neuron as in Figs. 3a and 4a illustrates a preclusion of spontaneous and evoked activity for over 200 msec following stimulation. The PSTHs from progressively less responsive groups of tones show progressively less preclusion following stimulation. c Fewer noisy binary neurons need to be pooled to achieve the same “signal-to-noise ratio” (SNR; see ref. [24]) as a collection of Poisson neurons. The ratio of the number of Poisson to binary neurons required to achieve the same SNR is plotted against the mean number of spikes elicited per neuron following stimulation; here we have defined the SNR to be the ratio of the mean spike count to the standard deviation of the spike count. d Spike probability tuning curve for the same neuron as in Figs. 1c-e and 2b fit to a Gaussian in tone frequency.

Binary responses suggest a mode that complements conventional rate coding. In the simplest rate-coding model, a stimulus parameter (such as the frequency of a tone) governs only the rate at which a neuron generates spikes, but not the detailed positions of the spikes; the actual spike train itself is an instantiation of a random process (such as a Poisson process). By contrast, in the binomial model, the stimulus parameter (frequency) is encoded as the probability of firing (Fig. 4d). Binary coding has implications for cortical computation.
In the rate coding model, stimulus encoding is “ergodic”: a stimulus parameter can be read out either by observing the activity of one neuron for a long time, or a population for a short time. By contrast, in the binary model the stimulus value can be decoded only by observing a neuronal population, so that there is no benefit to integrating over long time periods (cf. ref. [19]). One advantage of binary encoding is that it allows the population to signal quickly; the most compact message a neuron can send is one spike [20]. Binary coding is also more efficient in the context of population coding, as quantified by the signal-to-noise ratio (Fig. 4c). The precise organization of both spike number and time we have observed suggests that cortical activity consists, at least under some conditions, of packets of spikes synchronized across populations of neurons. Theoretical work [21-23] has shown how such packets can propagate stably from one population to the next, but only if neurons within each population fire at most one spike per packet; otherwise, the number of spikes per packet (and hence the width of each packet) grows at each propagation step. Interestingly, one prediction of stable propagation models is that spike probability should be related to timing precision, a prediction borne out by our observations (Fig. 3). The role of these packets in computation remains an open question.

2 Identification methods for group statistics

We recorded responses to 32 different 25 msec tones from each of 175 neurons from the auditory cortices of 16 Sprague-Dawley rats; each tone was repeated between 5 and 75 times (mean = 19). Thus our ensemble consisted of 32x175=5600 response sets, with between 5 and 75 samples in each set. Of these, 3055 response sets contained at least one spike on at least one trial. For each response set, we tested the hypothesis that the observed variability was significantly lower than expected from the null hypothesis of a Poisson process.
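The quantities behind this comparison can be sketched in a few lines of Python. This is a simplified illustration and not the authors' test, which the text does not spell out; all function names are hypothetical. It assumes the population (not sample) variance for the variance/mean ratio, and independent neurons for the pool-size ratio of Fig. 4c, in which case equating mean/standard deviation for a Poisson pool and a binary pool gives a ratio of 1/(1 - m) for a mean count m per neuron. For 25 trials at a mean of one spike per trial, the Poisson probability of exactly one spike on every trial is e^-25, which is on the order of the 10^-11 quoted in the Fig. 1 caption (the source does not say how that value was obtained).

```python
import math


def fano_factor(counts):
    """Variance-to-mean ratio of spike counts across trials (population variance)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("variance/mean is undefined when no trial contains a spike")
    variance = sum((c - mean) ** 2 for c in counts) / n
    return variance / mean


def binary_lower_bound(mean):
    """Smallest variance/mean achievable by integer counts with 0 < mean <= 1."""
    return 1.0 - mean


def p_no_multispike_poisson(rate, n_trials):
    """Probability that all n_trials Poisson counts (mean `rate`) are 0 or 1."""
    return (math.exp(-rate) * (1.0 + rate)) ** n_trials


def p_exactly_one_spike_poisson(rate, n_trials):
    """Probability that every one of n_trials Poisson counts equals exactly 1."""
    return (rate * math.exp(-rate)) ** n_trials


def pool_size_ratio(mean):
    """Poisson-to-binary pool size ratio for equal SNR (independent neurons, mean < 1)."""
    return 1.0 / (1.0 - mean)


# A noisy binary response set sits exactly on the lower bound.
print(fano_factor([0, 1, 0, 1]), binary_lower_bound(0.5))
# At a low rate the Poisson model rarely produces multi-spike trials, at a high rate it usually does.
print(p_no_multispike_poisson(0.1, 25), p_no_multispike_poisson(1.0, 25))
print(p_exactly_one_spike_poisson(1.0, 25))
print(pool_size_ratio(0.5))
```

The second printed line illustrates why significance can only be assessed at higher firing probabilities: with few spikes per trial, a Poisson process also produces almost exclusively 0 or 1 spikes, so the two models cannot be told apart.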
The ability to assess significance depended on two parameters: the sample size (5-75) and the firing probability. Intuitively, the dependence on firing probability arises because at low firing rates most responses produce only trials with 0 or 1 spikes under both the Poisson and binary models; only at high firing rates do the two models make different predictions, since in that case the Poisson model includes many trials with 2 or even 3 spikes while the binary model generates only solitary spikes (see Fig. 4a1,a2). Using a stringent significance criterion of p<0.001, 467 response sets had a sufficient number of repeats to assess significance, given the observed firing probability. Of these, half (242/467=52%) were significantly less variable than expected by chance, five hundred-fold higher than the 467/1000=0.467 response sets expected, based on the 0.001 significance criterion, to yield a binary response set. Seventy-two neurons had at least one response set for which significance could be assessed, and of these, 49 neurons (49/72=68%) had at least one significantly sub-Poisson response set. Of this population of 49 neurons, five achieved low variability through repeatable bursty behavior (e.g., every spike count was either 0 or 3, but not 1 or 2) and were excluded from further analysis. The remaining 44 neurons formed the basis for the group statistics analyses shown in Figs. 2a and 3b. Nine of these neurons were subjected to an additional protocol consisting of at least 10 presentations each of 100 msec tones and 25 msec tones of all 32 frequencies. Of the 100 msec stimulation response sets, 44 were found to be significantly sub-Poisson at the p<0.05 level, in good agreement with the 43 found to be significant among the responses to 25 msec tones. 3 Bibliography 1. Kilgard, M.P. and M.M. Merzenich, Cortical map reorganization enabled by nucleus basalis activity. Science, 1998. 279(5357): p. 1714-8. 2. Sally, S.L. and J.B. 
Kelly, Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol, 1988. 59(5): p. 1627-38. 3. Softky, W.R. and C. Koch, The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J Neurosci, 1993. 13(1): p. 334-50. 4. Stevens, C.F. and A.M. Zador, Input synchrony and the irregular firing of cortical neurons. Nat Neurosci, 1998. 1(3): p. 210-7. 5. Buracas, G.T., A.M. Zador, M.R. DeWeese, and T.D. Albright, Efficient discrimination of temporal patterns by motion-sensitive neurons in primate visual cortex. Neuron, 1998. 20(5): p. 959-69. 6. Shadlen, M.N. and W.T. Newsome, The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J Neurosci, 1998. 18(10): p. 3870-96. 7. Tolhurst, D.J., J.A. Movshon, and A.F. Dean, The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Res, 1983. 23(8): p. 775-85. 8. Otmakhov, N., A.M. Shirke, and R. Malinow, Measuring the impact of probabilistic transmission on neuronal output. Neuron, 1993. 10(6): p. 1101-11. 9. Friedrich, R.W. and G. Laurent, Dynamic optimization of odor representations by slow temporal patterning of mitral cell activity. Science, 2001. 291(5505): p. 889-94. 10. Kara, P., P. Reinagel, and R.C. Reid, Low response variability in simultaneously recorded retinal, thalamic, and cortical neurons. Neuron, 2000. 27(3): p. 635-46. 11. Gur, M., A. Beylin, and D.M. Snodderly, Response variability of neurons in primary visual cortex (V1) of alert monkeys. J Neurosci, 1997. 17(8): p. 2914-20. 12. Berry, M.J., D.K. Warland, and M. Meister, The structure and precision of retinal spike trains. Proc Natl Acad Sci U S A, 1997. 94(10): p. 5411-6. 13. de Ruyter van Steveninck, R.R., G.D. Lewen, S.P. Strong, R. Koberle, and W. Bialek, Reproducibility and variability in neural spike trains. Science, 1997. 275(5307): p. 1805-8. 14. 
Heil, P., Auditory cortical onset responses revisited. I. First-spike timing. J Neurophysiol, 1997. 77(5): p. 2616-41. 15. Lu, T., L. Liang, and X. Wang, Temporal and rate representations of time-varying signals in the auditory cortex of awake primates. Nat Neurosci, 2001. 4(11): p. 1131-8. 16. Kowalski, N., D.A. Depireux, and S.A. Shamma, Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol, 1996. 76(5): p. 3503-23. 17. deCharms, R.C., D.T. Blake, and M.M. Merzenich, Optimizing sound features for cortical neurons. Science, 1998. 280(5368): p. 1439-43. 18. Panzeri, S., R.S. Petersen, S.R. Schultz, M. Lebedev, and M.E. Diamond, The role of spike timing in the coding of stimulus location in rat somatosensory cortex. Neuron, 2001. 29(3): p. 769-77. 19. Britten, K.H., M.N. Shadlen, W.T. Newsome, and J.A. Movshon, The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci, 1992. 12(12): p. 4745-65. 20. Delorme, A. and S.J. Thorpe, Face identification using one spike per neuron: resistance to image degradations. Neural Netw, 2001. 14(6-7): p. 795-803. 21. Diesmann, M., M.O. Gewaltig, and A. Aertsen, Stable propagation of synchronous spiking in cortical neural networks. Nature, 1999. 402(6761): p. 529-33. 22. Marsalek, P., C. Koch, and J. Maunsell, On the relationship between synaptic input and spike output jitter in individual neurons. Proc Natl Acad Sci U S A, 1997. 94(2): p. 735-40. 23. Kistler, W.M. and W. Gerstner, Stable propagation of activity pulses in populations of spiking neurons. Neural Comp., 2002. 14: p. 987-997. 24. Zohary, E., M.N. Shadlen, and W.T. Newsome, Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 1994. 370(6485): p. 140-3. 25. Abbott, L.F. and P. Dayan, The effect of correlated variability on the accuracy of a population code. Neural Comput, 1999. 11(1): p. 91-101.
5 0.25031182 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise
Author: Alon Fishbach, Bradford J. May
Abstract: Psychophysical data suggest that temporal modulations of stimulus amplitude envelopes play a prominent role in the perceptual segregation of concurrent sounds. In particular, the detection of an unmodulated signal can be significantly improved by adding amplitude modulation to the spectral envelope of a competing masking noise. This perceptual phenomenon is known as “Comodulation Masking Release” (CMR). Despite the obvious influence of temporal structure on the perception of complex auditory scenes, the physiological mechanisms that contribute to CMR and auditory streaming are not well known. A recent physiological study by Nelken and colleagues has demonstrated an enhanced cortical representation of auditory signals in modulated noise. Our study evaluates these CMR-like response patterns from the perspective of a hypothetical auditory edge-detection neuron. It is shown that this simple neural model for the detection of amplitude transients can reproduce not only the physiological data of Nelken et al., but also, in light of previous results, a variety of physiological and psychoacoustical phenomena that are related to the perceptual segregation of concurrent sounds.

1 Introduction

The temporal structure of a complex sound exerts strong influences on auditory physiology (e.g. [10, 16]) and perception (e.g. [9, 19, 20]). In particular, studies of auditory scene analysis have demonstrated the importance of the temporal structure of amplitude envelopes in the perceptual segregation of concurrent sounds [2, 7]. Common amplitude transitions across frequency serve as salient cues for grouping sound energy into unified perceptual objects. Conversely, asynchronous amplitude transitions enhance the separation of competing acoustic events [3, 4]. These general principles are manifested in perceptual phenomena as diverse as comodulation masking release (CMR) [13], modulation detection interference [22] and synchronous onset grouping [8].
Despite the obvious importance of timing information in psychoacoustic studies of auditory masking, the way in which the CNS represents the temporal structure of an amplitude envelope is not well understood. Certainly many physiological studies have demonstrated neural sensitivities to envelope transitions, but this sensitivity is only beginning to be related to the variety of perceptual experiences that are evoked by signals in noise. Nelken et al. [15] have suggested a correspondence between neural responses to time-varying amplitude envelopes and psychoacoustic masking phenomena. In their study of neurons in primary auditory cortex (A1), adding temporal modulation to background noise lowered the detection thresholds of unmodulated tones. This enhanced signal detection is similar to the perceptual phenomenon that is known as comodulation masking release [13]. Fishbach et al. [11] have recently proposed a neural model for the detection of “auditory edges” (i.e., amplitude transients) that can account for numerous physiological [14, 17, 18] and psychoacoustical [3, 21] phenomena. The encompassing utility of this edge-detection model suggests a common mechanism that may link the auditory processing and perception of auditory signals in a complex auditory scene. Here, it is shown that the auditory edge detection model can accurately reproduce the cortical CMR-like responses previously described by Nelken and colleagues.

2 The Model

The model is described in detail elsewhere [11]. In short, the basic operation of the model is the calculation of the first-order time derivative of the log-compressed envelope of the stimulus. A computational model [23] is used to convert the acoustic waveform to a physiologically plausible auditory nerve representation (Fig 1a). The simulated neural response has a medium spontaneous rate and a characteristic frequency that is set to the frequency of the target tone.
To allow computation of the time derivative of the stimulus envelope, we hypothesize the existence of a temporal delay dimension, along which the stimulus is progressively delayed. The intermediate delay layer (Fig 1b) is constructed from an array of neurons with ascending membrane time constants (τ); each neuron is modeled by a conventional integrate-and-fire model (I&F, [12]). A higher membrane time constant induces a greater delay in the neuron’s response [1]. The output of the delay layer converges to a single output neuron (Fig. 1c) via a set of connections with various efficacies that reflect a receptive field of a gaussian derivative. This combination of excitatory and inhibitory connections carries out the time-derivative computation. Implementation details and parameters are given in [11]. The model has 2 adjustable and 6 fixed parameters; the former were used to fit the responses of the model to single unit responses to a variety of stimuli [11]. The results reported here are not sensitive to these parameters.

[Figure 1. Panels: (a) AN model (bandpass, RMS, log); (b) delay layer of I&F neurons (kernels shown for τ = 3, 4 and 6 ms); (c) edge-detector neuron (d/dt).]
Figure 1: Schematic diagram of the model and a block diagram of the basic operation of each model component (shaded area). The stimulus is converted to a neural representation (a) that approximates the average firing rate of a medium spontaneous-rate AN fiber [23]. The operation of this stage can be roughly described as the log-compressed rms output of a bandpass filter. The neural representation is fed to a series of neurons with ascending membrane time constant (b). The kernel functions that are used to simulate these neurons are plotted for a few neurons along with the time constants used. The output of the delay-layer neurons converge to a single I&F neuron (c) using a set of connections with weights that reflect a shape of a gaussian derivative. Solid arrows represent excitatory connections and white arrows represent inhibitory connections. The absolute efficacy is represented by the width of the arrows.

3 Results

Nelken et al. [15] report that amplitude modulation can substantially modify the noise-driven discharge rates of A1 neurons in halothane-anesthetized cats. Many cortical neurons show only a transient onset response to unmodulated noise but fire in synchrony (“lock”) to the envelope of modulated noise. A significant reduction in envelope-locked discharge rates is observed if an unmodulated tone is added to modulated noise. As summarized in Fig. 2, this suppression of envelope locking can reveal the presence of an auditory signal at sound pressure levels that are not detectable in unmodulated noise. It has been suggested that this pattern of neural responding may represent a physiological equivalent of CMR.

Reproduction of CMR-like cortical activity can be illustrated by a simplified case in which the analytical amplitude envelope of the stimulus is used as the input to the edge-detector model. In keeping with the actual physiological approach of Nelken et al., the noise envelope is shaped by a trapezoid modulator for these simulations. Each cycle of modulation, $E_N(t)$, is given by:

$$E_N(t) = \begin{cases} \dfrac{P}{D}\,t & 0 \le t < D \\ P & D \le t < 3D \\ P - \dfrac{P}{D}\,(t - 3D) & 3D \le t < 4D \\ 0 & 4D \le t < 8D \end{cases}$$

where $P$ is the peak pressure level and $D$ is set to 12.5 ms.

[Figure 2. Panels: (a) unmodulated noise; (b) modulated noise. Axes: tone level (dB SPL) versus time (ms); response scale in spikes/sec.]
Figure 2: Responses of an A1 unit to a combination of noise and tone at many tone levels, replotted from Nelken et al. [15]. (a) Unmodulated noise and (b) modulated noise. The noise envelope is illustrated by the thick line above each figure. Each row shows the response of the neuron to the noise plus the tone at the level specified on the ordinate. The dashed line in (b) indicates the detection threshold level for the tone. The detection threshold (as defined and calculated by Nelken et al.) in the unmodulated noise was not reached.
Since the basic operation of the model is the calculation of the rectified time-derivative of the log-compressed envelope of the stimulus, the expected noise-driven rate of the model can be approximated by:

$$M_N(t) = \max\left(0,\ \frac{d}{dt}\left[A \ln\left(1 + \frac{E_N(t)}{P_0}\right)\right]\right)$$

where $A = 20/\ln(10)$ and $P_0 = 2\times10^{-5}$ Pa. The expected firing rate in response to the noise plus an unmodulated signal (tone) can be similarly approximated by:

$$M_{N+S}(t) = \max\left(0,\ \frac{d}{dt}\left[A \ln\left(1 + \frac{E_N(t) + P_S}{P_0}\right)\right]\right)$$

where $P_S$ is the peak pressure level of the tone. Clearly, both $M_N(t)$ and $M_{N+S}(t)$ are identically zero outside the interval $[0, D]$. Within this interval it holds that:

$$M_N(t) = \frac{A\,P/D}{P_0 + (P/D)\,t}, \qquad 0 \le t < D$$

Applying the same differentiation to the noise-plus-tone envelope replaces $P_0$ in the denominator by $P_0 + P_S$, so $M_{N+S} < M_N$ for the interval $[0, D]$ of each modulation cycle. That is, the addition of a tone reduces the responses of the model to the rising part of the modulated envelope. Higher tone levels ($P_S$) cause greater reduction in the model’s firing rate.

[Figure 3. Panels (a) and (c): level (dB SPL) versus time (ms); panels (b) and (d): level derivative (dB SPL/ms) versus time (ms).]
Figure 3: An illustration of the basic operation of the model on various amplitude envelopes. The simplified operation of the model includes log compression of the amplitude envelope (a and c) and rectified time-derivative of the log-compressed envelope (b and d). (a) A 30 dB SPL tone is added to a modulated envelope (peak level of 70 dB SPL) 300 ms after the beginning of the stimulus (as indicated by the horizontal line). The addition of the tone causes a great reduction in the time derivative of the log-compressed envelope (b). When the envelope of the noise is unmodulated (c), the time-derivative of the log-compressed envelope (d) shows a tiny spike when the tone is added (marked by the arrow).

Fig. 3 demonstrates the effect of a low-level tone on the time-derivative of the log-compressed envelope of a noise. When the envelope is modulated (Fig. 3a) the addition of the tone greatly reduces the derivative of the rising part of the modulation (Fig. 3b).
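The simplified operation described here, a trapezoid envelope passed through log compression and a rectified time derivative, can be sketched numerically. This is a minimal illustration under stated assumptions and not the full model of [11]: it works on the analytical envelope only, approximates the derivative by a central finite difference, and uses hypothetical function names. The constants A, P0 and D and the 70 dB SPL noise peak and 30 dB SPL tone are taken from the text; the conversion helper `spl_to_pa` is an assumption added for the example.

```python
import math

A = 20.0 / math.log(10.0)  # scales a natural log to decibels, as in the text
P0 = 2e-5                  # reference pressure in Pa, as in the text
D = 12.5                   # ramp duration in ms, as in the text


def spl_to_pa(level_db):
    """Hypothetical helper: pressure in Pa for a level in dB SPL."""
    return P0 * 10.0 ** (level_db / 20.0)


def trapezoid_envelope(t, peak, d=D):
    """Noise envelope: rise over d, hold for 2d, fall over d, silence for 4d."""
    t = t % (8.0 * d)
    slope = peak / d
    if t < d:
        return slope * t
    if t < 3.0 * d:
        return peak
    if t < 4.0 * d:
        return peak - slope * (t - 3.0 * d)
    return 0.0


def rectified_rate(t, peak, tone=0.0, dt=1e-4):
    """max(0, d/dt of the log-compressed level) via a central finite difference."""
    def level(time):
        return A * math.log(1.0 + (trapezoid_envelope(time, peak) + tone) / P0)
    return max(0.0, (level(t + dt) - level(t - dt)) / (2.0 * dt))


def rising_rate_closed_form(t, peak, tone=0.0, d=D):
    """Closed form on the rising ramp, valid only for 0 <= t < d."""
    slope = peak / d
    return A * slope / (P0 + tone + slope * t)


peak = spl_to_pa(70.0)  # noise peak level from the Fig. 3 example
tone = spl_to_pa(30.0)  # tone level from the Fig. 3 example
for t in (1.0, 5.0, 10.0):
    # Rate without and with the tone during the rising ramp (dB/ms).
    print(t, rectified_rate(t, peak), rectified_rate(t, peak, tone))
```

On the rising ramp the rate with the tone is lower than without it at every sampled time, and the rate is zero on the plateau, the falling ramp and the silent half of the cycle, which is the behavior the derivation predicts.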
In the absence of modulations (Fig. 3c), the tone presentation produces a negligible effect on the level derivative (Fig. 3d). Model simulations of neural responses to the stimuli used by Nelken et al. are plotted in Fig. 4. As illustrated schematically in Fig. 3d, the presence of the tone does not cause any significant change in the responses of the model to the unmodulated noise (Fig. 4a). In the modulated noise, however, tones of relatively low levels reduce the responses of the model to the rising part of the envelope modulations.

[Figure 4. Panels: (a) unmodulated noise; (b) modulated noise. Axes: tone level (dB SPL) versus time (ms); response scale in spikes/sec.]
Figure 4: Simulated responses of the model to a combination of a tone and unmodulated noise (a) and modulated noise (b). All conventions are as in Fig. 2.

4 Discussion

This report uses an auditory edge-detection model to simulate the actual physiological consequences of amplitude modulation on neural sensitivity in cortical area A1. The basic computational operation of the model is the calculation of the smoothed time-derivative of the log-compressed stimulus envelope. The ability of the model to reproduce cortical response patterns in detail across a variety of stimulus conditions suggests similar time-sensitive mechanisms may contribute to the physiological correlates of CMR. These findings augment our previous observations that the simple edge-detection model can successfully predict a wide range of physiological and perceptual phenomena [11]. Former applications of the model to perceptual phenomena have been mainly related to auditory scene analysis, or more specifically the ability of the auditory system to distinguish multiple sound sources. In these cases, a sharp amplitude transition at stimulus onset (“auditory edge”) was critical for sound segregation.
Here, it is shown that the detection of acoustic signals also may be enhanced through the suppression of ongoing responses to the concurrent modulations of competing background sounds. Interestingly, these temporal fluctuations appear to be a common property of natural soundscapes [15]. The model provides testable predictions regarding how signal detection may be influenced by the temporal shape of amplitude modulation. Carlyon et al. [6] measured CMR in human listeners using three types of noise modulation: square-wave, sine-wave and multiplied noise. From the perspective of the edge-detection model, these psychoacoustic results are intriguing because the different modulator types represent manipulations of the time derivative of masker envelopes. Square-wave modulation had the most sharply edged time derivative and produced the greatest masking release. Fig. 5 plots the responses of the model to a pure-tone signal in square-wave and sine-wave modulated noise. As in the psychoacoustical data of Carlyon et al., the simulated detection threshold was lower in the context of square-wave modulation. Our modeling results suggest that the sharply edged square wave evoked higher levels of noise-driven activity and therefore created a sensitive background for the suppressing effects of the unmodulated tone.

[Figure 5. Panels (a) and (b). Axes: tone level (dB SPL) versus time (ms); response scale in spikes/sec.]
Figure 5: Simulated responses of the model to a combination of a tone at various levels and a sine-wave modulated noise (a) or a square-wave modulated noise (b). Each row shows the response of the model to the noise plus the tone at the level specified on the ordinate. The shape of the noise modulator is illustrated above each figure. The 100 ms tone starts 250 ms after the noise onset. Note that the tone detection threshold (marked by the dashed line) is 10 dB lower for the square-wave modulator than for the sine-wave modulator, in accordance with the psychoacoustical data of Carlyon et al. [6].

Although the physiological basis of our model was derived from studies of neural responses in the cat auditory system, the key psychoacoustical observations of Carlyon et al. have been replicated in recent behavioral studies of cats (Budelis et al. [5]). These data support the generalization of human perceptual processing to other species and enhance the possible correspondence between the neuronal CMR-like effect and the psychoacoustical masking phenomena. Clearly, the auditory system relies on information other than the time derivative of the stimulus envelope for the detection of auditory signals in background noise. Further physiological and psychoacoustic assessments of CMR-like masking effects are needed not only to refine the predictive abilities of the edge-detection model but also to reveal the additional sources of acoustic information that influence signal detection in constantly changing natural environments.

Acknowledgments

This work was supported in part by NIDCD grant R01 DC004841.

References

[1] Agmon-Snir H., Segev I. (1993). “Signal delay and input synchronization in passive dendritic structure”, J. Neurophysiol. 70, 2066-2085. [2] Bregman A.S. (1990). “Auditory scene analysis: The perceptual organization of sound”, MIT Press, Cambridge, MA. [3] Bregman A.S., Ahad P.A., Kim J., Melnerich L. (1994) “Resetting the pitch-analysis system. 1. Effects of rise times of tones in noise backgrounds or of harmonics in a complex tone”, Percept. Psychophys. 56 (2), 155-162. [4] Bregman A.S., Ahad P.A., Kim J. (1994) “Resetting the pitch-analysis system. 2. Role of sudden onsets and offsets in the perception of individual components in a cluster of overlapping tones”, J. Acoust. Soc. Am. 96 (5), 2694-2703.
[5] Budelis J., Fishbach A., May B.J. (2002) “Behavioral assessments of comodulation masking release in cats”, Abst. Assoc. for Res. in Otolaryngol. 25.
[6] Carlyon R.P., Buus S., Florentine M. (1989) “Comodulation masking release for three types of modulator as a function of modulation rate”, Hear. Res. 42, 37-46.
[7] Darwin C.J. (1997) “Auditory grouping”, Trends in Cog. Sci. 1 (9), 327-333.
[8] Darwin C.J., Ciocca V. (1992) “Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component”, J. Acoust. Soc. Am. 91, 3381-3390.
[9] Drullman R., Festen H.M., Plomp R. (1994) “Effect of temporal envelope smearing on speech reception”, J. Acoust. Soc. Am. 95 (2), 1053-1064.
[10] Eggermont J.J. (1994) “Temporal modulation transfer functions for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity”, Hear. Res. 74, 51-66.
[11] Fishbach A., Nelken I., Yeshurun Y. (2001) “Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients”, J. Neurophysiol. 85, 2303-2323.
[12] Gerstner W. (1999) “Spiking neurons”, in Pulsed Neural Networks, edited by W. Maass, C.M. Bishop (MIT Press, Cambridge, MA).
[13] Hall J.W., Haggard M.P., Fernandes M.A. (1984) “Detection in noise by spectrotemporal pattern analysis”, J. Acoust. Soc. Am. 76, 50-56.
[14] Heil P. (1997) “Auditory onset responses revisited. II. Response strength”, J. Neurophysiol. 77, 2642-2660.
[15] Nelken I., Rotman Y., Bar-Yosef O. (1999) “Responses of auditory cortex neurons to structural features of natural sounds”, Nature 397, 154-157.
[16] Phillips D.P. (1988) “Effect of Tone-Pulse Rise Time on Rate-Level Functions of Cat Auditory Cortex Neurons: Excitatory and Inhibitory Processes Shaping Responses to Tone Onset”, J. Neurophysiol. 59, 1524-1539.
[17] Phillips D.P., Burkard R. (1999)
“Response magnitude and timing of auditory response initiation in the inferior colliculus of the awake chinchilla”, J. Acoust. Soc. Am. 105, 2731-2737.
[18] Phillips D.P., Semple M.N., Kitzes L.M. (1995) “Factors shaping the tone level sensitivity of single neurons in posterior field of cat auditory cortex”, J. Neurophysiol. 73, 674-686.
[19] Rosen S. (1992) “Temporal information in speech: acoustic, auditory and linguistic aspects”, Phil. Trans. R. Soc. Lond. B 336, 367-373.
[20] Shannon R.V., Zeng F.G., Kamath V., Wygonski J., Ekelid M. (1995) “Speech recognition with primarily temporal cues”, Science 270, 303-304.
[21] Turner C.W., Relkin E.M., Doucet J. (1994) “Psychophysical and physiological forward masking studies: probe duration and rise-time effects”, J. Acoust. Soc. Am. 96 (2), 795-800.
[22] Yost W.A., Sheft S. (1994) “Modulation detection interference – across-frequency processing and auditory grouping”, Hear. Res. 79, 48-58.
[23] Zhang X., Heinz M.G., Bruce I.C., Carney L.H. (2001) “A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression”, J. Acoust. Soc. Am. 109 (2), 648-670.
6 0.19555235 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
7 0.18576333 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
8 0.18562163 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex
9 0.15417539 187 nips-2002-Spikernels: Embedding Spiking Neurons in Inner-Product Spaces
10 0.14946657 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
11 0.14721315 26 nips-2002-An Estimation-Theoretic Framework for the Presentation of Multiple Stimuli
12 0.13237052 171 nips-2002-Reconstructing Stimulus-Driven Neural Networks from Spike Times
13 0.09706375 18 nips-2002-Adaptation and Unsupervised Learning
14 0.094019353 199 nips-2002-Timing and Partial Observability in the Dopamine System
15 0.088667236 76 nips-2002-Dynamical Constraints on Computing with Spike Timing in the Cortex
16 0.087277308 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
17 0.087124042 147 nips-2002-Monaural Speech Separation
18 0.084796749 153 nips-2002-Neural Decoding of Cursor Motion Using a Kalman Filter
19 0.084654309 11 nips-2002-A Model for Real-Time Computation in Generic Neural Microcircuits
20 0.080938891 95 nips-2002-Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting
topicId topicWeight
[(0, -0.276), (1, 0.34), (2, 0.129), (3, -0.049), (4, -0.066), (5, -0.358), (6, -0.275), (7, 0.074), (8, -0.045), (9, -0.017), (10, 0.021), (11, 0.034), (12, 0.01), (13, 0.003), (14, 0.07), (15, -0.01), (16, 0.045), (17, -0.012), (18, 0.334), (19, -0.046), (20, 0.089), (21, 0.023), (22, -0.018), (23, 0.05), (24, 0.055), (25, 0.124), (26, 0.154), (27, -0.009), (28, 0.048), (29, -0.079), (30, 0.047), (31, -0.065), (32, 0.104), (33, -0.015), (34, -0.085), (35, -0.06), (36, 0.043), (37, 0.071), (38, -0.016), (39, 0.067), (40, -0.026), (41, 0.06), (42, 0.006), (43, -0.072), (44, 0.026), (45, 0.044), (46, -0.111), (47, 0.032), (48, -0.029), (49, 0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.9736805 103 nips-2002-How Linear are Auditory Cortical Responses?
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus. How well can such a model account for neural responses at the very first stages of auditory cortical processing? To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses.
2 0.9246493 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: An essential step in understanding the function of sensory nervous systems is to characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. One increasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses of one or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal firing. The estimation techniques usually employed, such as Wiener filtering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters that control the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.
3 0.82852691 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise
Author: Alon Fishbach, Bradford J. May
Abstract: Psychophysical data suggest that temporal modulations of stimulus amplitude envelopes play a prominent role in the perceptual segregation of concurrent sounds. In particular, the detection of an unmodulated signal can be significantly improved by adding amplitude modulation to the spectral envelope of a competing masking noise. This perceptual phenomenon is known as “Comodulation Masking Release” (CMR). Despite the obvious influence of temporal structure on the perception of complex auditory scenes, the physiological mechanisms that contribute to CMR and auditory streaming are not well known. A recent physiological study by Nelken and colleagues has demonstrated an enhanced cortical representation of auditory signals in modulated noise. Our study evaluates these CMR-like response patterns from the perspective of a hypothetical auditory edge-detection neuron. It is shown that this simple neural model for the detection of amplitude transients can reproduce not only the physiological data of Nelken et al., but also, in light of previous results, a variety of physiological and psychoacoustical phenomena that are related to the perceptual segregation of concurrent sounds.
1 Introduction
The temporal structure of a complex sound exerts strong influences on auditory physiology (e.g. [10, 16]) and perception (e.g. [9, 19, 20]). In particular, studies of auditory scene analysis have demonstrated the importance of the temporal structure of amplitude envelopes in the perceptual segregation of concurrent sounds [2, 7]. Common amplitude transitions across frequency serve as salient cues for grouping sound energy into unified perceptual objects. Conversely, asynchronous amplitude transitions enhance the separation of competing acoustic events [3, 4]. These general principles are manifested in perceptual phenomena as diverse as comodulation masking release (CMR) [13], modulation detection interference [22] and synchronous onset grouping [8].
Despite the obvious importance of timing information in psychoacoustic studies of auditory masking, the way in which the CNS represents the temporal structure of an amplitude envelope is not well understood. Certainly many physiological studies have demonstrated neural sensitivities to envelope transitions, but this sensitivity is only beginning to be related to the variety of perceptual experiences that are evoked by signals in noise. Nelken et al. [15] have suggested a correspondence between neural responses to time-varying amplitude envelopes and psychoacoustic masking phenomena. In their study of neurons in primary auditory cortex (A1), adding temporal modulation to background noise lowered the detection thresholds of unmodulated tones. This enhanced signal detection is similar to the perceptual phenomenon that is known as comodulation masking release [13]. Fishbach et al. [11] have recently proposed a neural model for the detection of “auditory edges” (i.e., amplitude transients) that can account for numerous physiological [14, 17, 18] and psychoacoustical [3, 21] phenomena. The encompassing utility of this edge-detection model suggests a common mechanism that may link the auditory processing and perception of auditory signals in a complex auditory scene. Here, it is shown that the auditory edge detection model can accurately reproduce the cortical CMR-like responses previously described by Nelken and colleagues.
2 The Model
The model is described in detail elsewhere [11]. In short, the basic operation of the model is the calculation of the first-order time derivative of the log-compressed envelope of the stimulus. A computational model [23] is used to convert the acoustic waveform to a physiologically plausible auditory nerve representation (Fig. 1a). The simulated neural response has a medium spontaneous rate and a characteristic frequency that is set to the frequency of the target tone.
To allow computation of the time derivative of the stimulus envelope, we hypothesize the existence of a temporal delay dimension, along which the stimulus is progressively delayed. The intermediate delay layer (Fig. 1b) is constructed from an array of neurons with ascending membrane time constants (τ); each neuron is modeled by a conventional integrate-and-fire model (I&F, [12]). A higher membrane time constant induces a greater delay in the neuron’s response [1]. The output of the delay layer converges to a single output neuron (Fig. 1c) via a set of connections with various efficacies that reflect a Gaussian-derivative receptive field. This combination of excitatory and inhibitory connections carries out the time-derivative computation. Implementation details and parameters are given in [11]. The model has 2 adjustable and 6 fixed parameters; the former were used to fit the responses of the model to single-unit responses to a variety of stimuli [11]. The results reported here are not sensitive to these parameters.
Figure 1 (panels: (a) AN model, (b) delay layer, (c) edge-detector neuron): Schematic diagram of the model and a block diagram of the basic operation of each model component (shaded area). The stimulus is converted to a neural representation (a) that approximates the average firing rate of a medium spontaneous-rate AN fiber [23]. The operation of this stage can be roughly described as the log-compressed rms output of a bandpass filter. The neural representation is fed to a series of neurons with ascending membrane time constants (b). The kernel functions that are used to simulate these neurons are plotted for a few neurons along with the time constants used. The outputs of the delay-layer neurons converge to a single I&F neuron (c) using a set of connections with weights that reflect the shape of a Gaussian derivative. Solid arrows represent excitatory connections and white arrows represent inhibitory connections.
The absolute efficacy is represented by the width of the arrows.
3 Results
Nelken et al. [15] report that amplitude modulation can substantially modify the noise-driven discharge rates of A1 neurons in halothane-anesthetized cats. Many cortical neurons show only a transient onset response to unmodulated noise but fire in synchrony (“lock”) to the envelope of modulated noise. A significant reduction in envelope-locked discharge rates is observed if an unmodulated tone is added to modulated noise. As summarized in Fig. 2, this suppression of envelope locking can reveal the presence of an auditory signal at sound pressure levels that are not detectable in unmodulated noise. It has been suggested that this pattern of neural responding may represent a physiological equivalent of CMR. Reproduction of CMR-like cortical activity can be illustrated by a simplified case in which the analytical amplitude envelope of the stimulus is used as the input to the edge-detector model. In keeping with the actual physiological approach of Nelken et al., the noise envelope is shaped by a trapezoid modulator for these simulations. Each cycle of modulation, $E_N(t)$, is given by:
$$E_N(t) = \begin{cases} \dfrac{P}{D}\,t & 0 \le t < D \\ P & D \le t < 3D \\ P - \dfrac{P}{D}\,(t - 3D) & 3D \le t < 4D \\ 0 & 4D \le t < 8D \end{cases}$$
where $P$ is the peak pressure level and $D$ is set to 12.5 ms.
Figure 2 (panels: (a) unmodulated noise, (b) modulated noise; axes: time in ms against tone level in dB SPL; responses in spikes/sec): Responses of an A1 unit to a combination of noise and tone at many tone levels, replotted from Nelken et al. [15]. (a) Unmodulated noise and (b) modulated noise. The noise envelope is illustrated by the thick line above each figure. Each row shows the response of the neuron to the noise plus the tone at the level specified on the ordinate. The dashed line in (b) indicates the detection threshold level for the tone. The detection threshold (as defined and calculated by Nelken et al.) in the unmodulated noise was not reached.
Since the basic operation of the model is the calculation of the rectified time-derivative of the log-compressed envelope of the stimulus, the expected noise-driven rate of the model can be approximated by:
$$M_N(t) = \max\left(0,\ \frac{d}{dt}\, A \ln\left(1 + \frac{E_N(t)}{P_0}\right)\right)$$
where $A = 20/\ln(10)$ and $P_0 = 2 \times 10^{-5}$ Pa. The expected firing rate in response to the noise plus an unmodulated signal (tone) can be similarly approximated by:
$$M_{N+S}(t) = \max\left(0,\ \frac{d}{dt}\, A \ln\left(1 + \frac{E_N(t) + P_S}{P_0}\right)\right)$$
where $P_S$ is the peak pressure level of the tone. Clearly, both $M_N(t)$ and $M_{N+S}(t)$ are identically zero outside the interval $[0, D]$. Within this interval it holds that:
$$M_N(t) = \frac{A\,P/D}{P_0 + (P/D)\,t}, \qquad 0 \le t < D$$
Clearly, $M_{N+S} < M_N$ for the interval $[0, D]$ of each modulation cycle. That is, the addition of a tone reduces the responses of the model to the rising part of the modulated envelope. Higher tone levels ($P_S$) cause greater reduction in the model’s firing rate.
Figure 3 (axes: time in ms against level in dB SPL for panels a and c, and against level derivative in dB SPL/ms for panels b and d): An illustration of the basic operation of the model on various amplitude envelopes. The simplified operation of the model includes log compression of the amplitude envelope (a and c) and rectified time-derivative of the log-compressed envelope (b and d). (a) A 30 dB SPL tone is added to a modulated envelope (peak level of 70 dB SPL) 300 ms after the beginning of the stimulus (as indicated by the horizontal line). The addition of the tone causes a great reduction in the time derivative of the log-compressed envelope (b). When the envelope of the noise is unmodulated (c), the time-derivative of the log-compressed envelope (d) shows a tiny spike when the tone is added (marked by the arrow).
Fig. 3 demonstrates the effect of a low-level tone on the time-derivative of the log-compressed envelope of a noise. When the envelope is modulated (Fig. 3a) the addition of the tone greatly reduces the derivative of the rising part of the modulation (Fig. 3b).
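As a rough illustration of the simplified analysis above, the following Python sketch evaluates the rectified time derivative of the log-compressed envelope over one trapezoid cycle, with and without an added tone. It is not the authors' implementation. The function names, the finite-difference step `dt`, and the conversion from dB SPL to pascals are assumptions made for illustration; only the constants A, P0 and D, the trapezoid shape (rise over D, plateau until 3D, fall until 4D, silence until 8D), and the 70 dB / 30 dB example levels are taken from the text.

```python
import math

A = 20.0 / math.log(10.0)  # scales natural log to dB (from the text)
P0 = 2e-5                  # reference pressure in Pa (from the text)
D = 12.5                   # ramp duration in ms (from the text)


def spl_to_pa(level_db):
    """Convert a level in dB SPL to pressure in Pa (assumed convention)."""
    return P0 * 10.0 ** (level_db / 20.0)


def trapezoid_envelope(t, peak):
    """Envelope of one 8*D modulation cycle, repeated periodically."""
    t = t % (8.0 * D)
    if t < D:
        return peak * t / D                      # linear rise
    if t < 3.0 * D:
        return peak                              # plateau
    if t < 4.0 * D:
        return peak - peak * (t - 3.0 * D) / D   # linear fall
    return 0.0                                   # silent half of the cycle


def expected_rate(t, peak, tone=0.0, dt=1e-3):
    """Rectified forward-difference derivative (dB/ms) of the compressed level."""
    def level(x):
        return A * math.log(1.0 + (trapezoid_envelope(x, peak) + tone) / P0)
    return max(0.0, (level(t + dt) - level(t)) / dt)


def analytic_rate(t, peak, tone=0.0):
    """Closed form on the rising ramp only, 0 <= t < D."""
    slope = peak / D
    return A * slope / (P0 + tone + slope * t)


peak = spl_to_pa(70.0)  # noise peak level used in the Fig. 3 example
tone = spl_to_pa(30.0)  # tone level used in the Fig. 3 example
for t in (1.0, 6.0, 12.0, 25.0):
    print(t, expected_rate(t, peak), expected_rate(t, peak, tone))
```

In this sketch the tone enters only as a constant added to the envelope, so it shrinks the derivative on the rising ramp and leaves the plateau, the falling ramp and the silent interval at zero, which is the qualitative behavior the text describes.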
In the absence of modulations (Fig. 3c), the tone presentation produces a negligible effect on the level derivative (Fig. 3d). Model simulations of neural responses to the stimuli used by Nelken et al. are plotted in Fig. 4. As illustrated schematically in Fig. 3d, the presence of the tone does not cause any significant change in the responses of the model to the unmodulated noise (Fig. 4a). In the modulated noise, however, tones of relatively low levels reduce the responses of the model to the rising part of the envelope modulations.
Figure 4 (panels: (a) unmodulated noise, (b) modulated noise; axes: time in ms against tone level in dB SPL; responses in spikes/sec): Simulated responses of the model to a combination of a tone and unmodulated noise (a) and modulated noise (b). All conventions are as in Fig. 2.
4 Discussion
This report uses an auditory edge-detection model to simulate the actual physiological consequences of amplitude modulation on neural sensitivity in cortical area A1. The basic computational operation of the model is the calculation of the smoothed time-derivative of the log-compressed stimulus envelope. The ability of the model to reproduce cortical response patterns in detail across a variety of stimulus conditions suggests similar time-sensitive mechanisms may contribute to the physiological correlates of CMR. These findings augment our previous observations that the simple edge-detection model can successfully predict a wide range of physiological and perceptual phenomena [11]. Former applications of the model to perceptual phenomena have been mainly related to auditory scene analysis, or more specifically the ability of the auditory system to distinguish multiple sound sources. In these cases, a sharp amplitude transition at stimulus onset (“auditory edge”) was critical for sound segregation.
Here, it is shown that the detection of acoustic signals may also be enhanced through the suppression of ongoing responses to the concurrent modulations of competing background sounds. Interestingly, these temporal fluctuations appear to be a common property of natural soundscapes [15]. The model provides testable predictions regarding how signal detection may be influenced by the temporal shape of amplitude modulation. Carlyon et al. [6] measured CMR in human listeners using three types of noise modulation: square wave, sine wave and multiplied noise. From the perspective of the edge-detection model, these psychoacoustic results are intriguing because the different modulator types represent manipulations of the time derivative of masker envelopes. Square-wave modulation had the most sharply edged time derivative and produced the greatest masking release. Fig. 5 plots the responses of the model to a pure-tone signal in square-wave and sine-wave modulated noise. As in the psychoacoustical data of Carlyon et al., the simulated detection threshold was lower in the context of square-wave modulation. Our modeling results suggest that the sharply edged square wave evoked higher levels of noise-driven activity and therefore created a sensitive background for the suppressing effects of the unmodulated tone.
Figure 5 (axes: time in ms against tone level in dB SPL; responses in spikes/sec): Simulated responses of the model to a combination of a tone at various levels and a sine-wave modulated noise (a) or a square-wave modulated noise (b). Each row shows the response of the model to the noise plus the tone at the level specified on the ordinate. The shape of the noise modulator is illustrated above each figure. The 100 ms tone starts 250 ms after the noise onset.
Note that the tone detection threshold (marked by the dashed line) is 10 dB lower for the square-wave modulator than for the sine-wave modulator, in accordance with the psychoacoustical data of Carlyon et al. [6].
Although the physiological basis of our model was derived from studies of neural responses in the cat auditory system, the key psychoacoustical observations of Carlyon et al. have been replicated in recent behavioral studies of cats (Budelis et al. [5]). These data support the generalization of human perceptual processing to other species and enhance the possible correspondence between the neuronal CMR-like effect and the psychoacoustical masking phenomena. Clearly, the auditory system relies on information other than the time derivative of the stimulus envelope for the detection of auditory signals in background noise. Further physiological and psychoacoustic assessments of CMR-like masking effects are needed not only to refine the predictive abilities of the edge-detection model but also to reveal the additional sources of acoustic information that influence signal detection in constantly changing natural environments.
Acknowledgments
This work was supported in part by NIDCD grant R01 DC004841.
References
[1] Agmon-Snir H., Segev I. (1993) “Signal delay and input synchronization in passive dendritic structure”, J. Neurophysiol. 70, 2066-2085.
[2] Bregman A.S. (1990) “Auditory scene analysis: The perceptual organization of sound”, MIT Press, Cambridge, MA.
[3] Bregman A.S., Ahad P.A., Kim J., Melnerich L. (1994) “Resetting the pitch-analysis system. 1. Effects of rise times of tones in noise backgrounds or of harmonics in a complex tone”, Percept. Psychophys. 56 (2), 155-162.
[4] Bregman A.S., Ahad P.A., Kim J. (1994) “Resetting the pitch-analysis system. 2. Role of sudden onsets and offsets in the perception of individual components in a cluster of overlapping tones”, J. Acoust. Soc. Am. 96 (5), 2694-2703.
[5] Budelis J., Fishbach A., May B.J. (2002) “Behavioral assessments of comodulation masking release in cats”, Abst. Assoc. for Res. in Otolaryngol. 25.
[6] Carlyon R.P., Buus S., Florentine M. (1989) “Comodulation masking release for three types of modulator as a function of modulation rate”, Hear. Res. 42, 37-46.
[7] Darwin C.J. (1997) “Auditory grouping”, Trends in Cog. Sci. 1 (9), 327-333.
[8] Darwin C.J., Ciocca V. (1992) “Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component”, J. Acoust. Soc. Am. 91, 3381-3390.
[9] Drullman R., Festen H.M., Plomp R. (1994) “Effect of temporal envelope smearing on speech reception”, J. Acoust. Soc. Am. 95 (2), 1053-1064.
[10] Eggermont J.J. (1994) “Temporal modulation transfer functions for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity”, Hear. Res. 74, 51-66.
[11] Fishbach A., Nelken I., Yeshurun Y. (2001) “Auditory edge detection: a neural model for physiological and psychoacoustical responses to amplitude transients”, J. Neurophysiol. 85, 2303-2323.
[12] Gerstner W. (1999) “Spiking neurons”, in Pulsed Neural Networks, edited by W. Maass, C.M. Bishop (MIT Press, Cambridge, MA).
[13] Hall J.W., Haggard M.P., Fernandes M.A. (1984) “Detection in noise by spectrotemporal pattern analysis”, J. Acoust. Soc. Am. 76, 50-56.
[14] Heil P. (1997) “Auditory onset responses revisited. II. Response strength”, J. Neurophysiol. 77, 2642-2660.
[15] Nelken I., Rotman Y., Bar-Yosef O. (1999) “Responses of auditory cortex neurons to structural features of natural sounds”, Nature 397, 154-157.
[16] Phillips D.P. (1988) “Effect of Tone-Pulse Rise Time on Rate-Level Functions of Cat Auditory Cortex Neurons: Excitatory and Inhibitory Processes Shaping Responses to Tone Onset”, J. Neurophysiol. 59, 1524-1539.
[17] Phillips D.P., Burkard R. (1999)
“Response magnitude and timing of auditory response initiation in the inferior colliculus of the awake chinchilla”, J. Acoust. Soc. Am. 105, 2731-2737.
[18] Phillips D.P., Semple M.N., Kitzes L.M. (1995) “Factors shaping the tone level sensitivity of single neurons in posterior field of cat auditory cortex”, J. Neurophysiol. 73, 674-686.
[19] Rosen S. (1992) “Temporal information in speech: acoustic, auditory and linguistic aspects”, Phil. Trans. R. Soc. Lond. B 336, 367-373.
[20] Shannon R.V., Zeng F.G., Kamath V., Wygonski J., Ekelid M. (1995) “Speech recognition with primarily temporal cues”, Science 270, 303-304.
[21] Turner C.W., Relkin E.M., Doucet J. (1994) “Psychophysical and physiological forward masking studies: probe duration and rise-time effects”, J. Acoust. Soc. Am. 96 (2), 795-800.
[22] Yost W.A., Sheft S. (1994) “Modulation detection interference – across-frequency processing and auditory grouping”, Hear. Res. 79, 48-58.
[23] Zhang X., Heinz M.G., Bruce I.C., Carney L.H. (2001) “A phenomenological model for the responses of auditory-nerve fibers: I. Nonlinear tuning with compression and suppression”, J. Acoust. Soc. Am. 109 (2), 648-670.
4 0.82088196 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex
Author: Christian K. Machens, Michael Wehr, Anthony M. Zador
Abstract: How do cortical neurons represent the acoustic environment? This question is often addressed by probing with simple stimuli such as clicks or tone pips. Such stimuli have the advantage of yielding easily interpreted answers, but have the disadvantage that they may fail to uncover complex or higher-order neuronal response properties. Here we adopt an alternative approach, probing neuronal responses with complex acoustic stimuli, including animal vocalizations and music. We have used in vivo whole cell methods in the rat auditory cortex to record subthreshold membrane potential fluctuations elicited by these stimuli. Whole cell recording reveals the total synaptic input to a neuron from all the other neurons in the circuit, instead of just its output—a sparse binary spike train—as in conventional single unit physiological recordings. Whole cell recording thus provides a much richer source of information about the neuron’s response. Many neurons responded robustly and reliably to the complex stimuli in our ensemble. Here we analyze the linear component—the spectrotemporal receptive field (STRF)—of the transformation from the sound (as represented by its time-varying spectrogram) to the neuron’s membrane potential. We find that the STRF has a rich dynamical structure, including excitatory regions positioned in general accord with the prediction of the simple tuning curve. We also find that in many cases, much of the neuron’s response, although deterministically related to the stimulus, cannot be predicted by the linear component, indicating the presence of as-yet-uncharacterized nonlinear response properties.
5 0.5939486 43 nips-2002-Binary Coding in Auditory Cortex
Author: Michael R. Deweese, Anthony M. Zador
Abstract: Cortical neurons have been reported to use both rate and temporal codes. Here we describe a novel mode in which each neuron generates exactly 0 or 1 action potentials, but not more, in response to a stimulus. We used cell-attached recording, which ensured single-unit isolation, to record responses in rat auditory cortex to brief tone pips. Surprisingly, the majority of neurons exhibited binary behavior with few multi-spike responses; several dramatic examples consisted of exactly one spike on 100% of trials, with no trial-to-trial variability in spike count. Many neurons were tuned to stimulus frequency. Since individual trials yielded at most one spike for most neurons, the information about stimulus frequency was encoded in the population, and would not have been accessible to later stages of processing that only had access to the activity of a single unit. These binary units allow a more efficient population code than is possible with conventional rate coding units, and are consistent with a model of cortical processing in which synchronous packets of spikes propagate stably from one neuronal population to the next.
1 Binary coding in auditory cortex
We recorded responses of neurons in the auditory cortex of anesthetized rats to pure-tone pips of different frequencies [1, 2]. Each pip was presented repeatedly, allowing us to assess the variability of the neural response to multiple presentations of each stimulus. We first recorded multi-unit activity with conventional tungsten electrodes (Fig. 1a). The number of spikes in response to each pip fluctuated markedly from one trial to the next (Fig. 1e), as though governed by a random mechanism such as that generating the ticks of a Geiger counter.
Highly variable responses such as these, which are at least as variable as a Poisson process, are the norm in the cortex [3-7], and have contributed to the widely held view that cortical spike trains are so noisy that only the average firing rate can be used to encode stimuli. Because we were recording the activity of an unknown number of neurons, we could not be sure whether the strong trial-to-trial fluctuations reflected the underlying variability of the single units. We therefore used an alternative technique, cell-attached recording with a patch pipette [8, 9], in order to ensure single unit isolation (Fig. 1b). This recording mode minimizes both of the main sources of error in spike detection: failure to detect a spike in the unit under observation (false negatives), and contamination by spikes from nearby neurons (false positives). It also differs from conventional extracellular recording methods in its selection bias: with cell-attached recording, neurons are selected solely on the basis of the experimenter’s ability to form a seal, rather than on the basis of neuronal activity and responsiveness to stimuli as in conventional methods.
Figure 1 (panels: (a) multi-unit rasters; (b) single-unit recording method, showing the raw cell-attached voltage, the high-pass filtered trace, the threshold and the identified spikes; (c, d) single-unit rasters, time in msec; (e) response variance/mean against mean response in spikes/trial, with Poisson and binary reference lines): Multi-unit spiking activity was highly variable, but single units obeyed binomial statistics. a Multi-unit spike rasters from a conventional tungsten electrode recording showed high trial-to-trial variability in response to ten repetitions of the same 50 msec pure tone stimulus (bottom). Darker hash marks indicate spike times within the response period, which were used in the variability analysis. b Spikes recorded in cell-attached mode were easily identified from the raw voltage trace (top) by applying a high-pass filter (bottom) and thresholding (dark gray line). Spike times (black squares) were assigned to the peaks of suprathreshold segments. c Spike rasters from a cell-attached recording of single-unit responses to 25 repetitions of the same tone consisted of exactly one well-timed spike per trial (latency standard deviation = 1.0 msec), unlike the multi-unit responses (Fig. 1a). Under the Poisson assumption, this would have been highly unlikely (P ~ 10^-11). d The same neuron as in Fig. 1c responds with lower probability to repeated presentations of a different tone, but there are still no multi-spike responses. e We quantified response variability for each tone by dividing the variance in spike count by the mean spike count across all trials for that tone. Response variability for multi-unit tungsten recording (open triangles) was high for each of the 29 tones (out of 32) that elicited at least one spike on one trial. All but one point lie above one (horizontal gray line), which is the value produced by a Poisson process with any constant or time varying event rate. Single unit responses recorded in cell-attached mode were far less variable (filled circles). Ninety-one percent (10/11) of the tones that elicited at least one spike from this neuron produced no multi-spike responses in 25 trials; the corresponding points fall on the diagonal line between (0,1) and (1,0), which provides a strict lower bound on the variability for any response set with a mean between 0 and 1. No point lies above one.
Surprisingly, single unit responses were far more orderly than suggested by the multi-unit recordings; responses typically consisted of either 0 or 1 spikes per trial, and not more (Fig. 1c-e). In the most dramatic examples, each presentation of the same tone pip elicited exactly one spike (Fig. 1c).
In most cases, however, some presentations failed to elicit a spike (Fig. 1d). Although low-variability responses have recently been observed in the cortex [10, 11] and elsewhere [12, 13], the binary behavior described here has not previously been reported for cortical neurons.

The majority of the neurons (59%) in our study for which statistical significance could be assessed (at the p<0.001 significance level; see Fig. 2, caption) showed noisy binary behavior: “binary” because neurons produced either 0 or 1 spikes, and “noisy” because some stimuli elicited both single spikes and failures. In a substantial fraction of neurons, however, the responses showed more variability. We found no correlation between neuronal variability and cortical layer (inferred from the depth of the recording electrode), cortical area (inside vs. outside of area A1) or depth of anesthesia. Moreover, the binary mode of spiking was not due to the brevity (25 msec) of the stimuli; responses that were binary for short tones were comparably binary when longer (100 msec) tones were used (Fig. 2b).

Figure 2: Half of the neuronal population exhibited binary firing behavior. (a) Of the 3055 sets of responses to 25 msec tones, 2588 (gray points) could not be assessed for significance at the p<0.001 level, 225 (open circles) were not significantly binary, and 242 were significantly binary (black points; see Identification methods for group statistics below). All points were jittered slightly so that overlying points could be seen in the figure. 2165 response sets contained no multi-spike responses; the corresponding points fell on the line from (0,1) to (1,0). (b) The binary nature of single-unit responses was insensitive to tone duration, even for frequencies that elicited the largest responses. Twenty additional spike rasters from the same neuron (and tone frequency) as in Fig. 1c contain no multi-spike responses whether in response to 100 msec tones (above) or 25 msec tones (below). Across the population, binary responses were as prevalent for 100 msec tones as for 25 msec tones (see Identification methods for group statistics). [Panel a axes: response variance/mean (spikes/trial) vs. mean response (spikes/trial), N = 3055 response sets, with Poisson and binary reference lines; panel b: rasters for 28 kHz tones of 100 msec and 25 msec, time in msec.]

In many neurons, binary responses showed high temporal precision, with latencies sometimes exhibiting standard deviations as low as 1 msec (Fig. 3; see also Fig. 1c), comparable to previous observations in the auditory cortex [14], and only slightly more precise than in monkey visual area MT [5]. High temporal precision was positively correlated with high response probability (Fig. 3).

Figure 3: Trial-to-trial variability in latency of response to repeated presentations of the same tone decreased with increasing response probability. (a) Scatter plot of standard deviation of latency vs. mean response for 25 presentations each of 32 tones, for a different neuron from that in Figs. 1 and 2 (gray line is best linear fit). Rasters from 25 repeated presentations of a low-response tone (upper left inset, which corresponds to the left-most data point) display much more variable latencies than rasters from a high-response tone (lower right inset; corresponds to the right-most data point). (b) The negative correlation between latency variability and response size was present on average across the population of 44 neurons described in Identification methods for group statistics (linear fit, gray). [Axes in both panels: jitter (msec) vs. mean response (spikes/trial); panel a: N = 32 tones; panel b: N = (44 cells) x (32 tones).]
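The variability measure used in Figs. 1e and 2a (spike-count variance divided by mean spike count) can be illustrated with a minimal Python sketch. The function names and the example counts are hypothetical, and the use of the population variance (dividing by the number of trials) is an assumption, chosen because it places purely 0-or-1 response sets exactly on the line from (0,1) to (1,0).

```python
from statistics import mean, pvariance


def fano_factor(spike_counts):
    """Variance-to-mean ratio of per-trial spike counts.

    Uses the population variance (an assumption, see lead-in).
    """
    m = mean(spike_counts)
    if m == 0:
        raise ValueError("ratio is undefined when no trial contains a spike")
    return pvariance(spike_counts) / m


def binary_lower_bound(mean_count):
    """Lowest possible variance/mean for a mean count p with 0 < p <= 1.

    The bound is reached when every trial has 0 or 1 spikes:
    variance = p * (1 - p) and mean = p, so the ratio is 1 - p.
    """
    if not 0 < mean_count <= 1:
        raise ValueError("bound applies only to means in (0, 1]")
    return 1.0 - mean_count


# Hypothetical response sets, 25 trials each.
reliable = [1] * 25                  # one spike on every trial
noisy_binary = [1] * 15 + [0] * 10   # spike on 15 of 25 trials, never two
multi_spike = [0, 1, 2, 3, 0] * 5    # includes multi-spike trials

for name, counts in [("reliable", reliable),
                     ("noisy binary", noisy_binary),
                     ("multi-spike", multi_spike)]:
    print(name, mean(counts), fano_factor(counts))
```

A Poisson process gives a ratio of one regardless of rate, so any response set well below one is less variable than Poisson; a set with no multi-spike trials sits on the binary bound.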
The low trial-to-trial variability ruled out the possibility that the firing statistics could be accounted for by a simple rate-modulated Poisson process (Fig. 4a1,a2). In other systems, low variability has sometimes been modeled as a Poisson process followed by a post-spike refractory period [10, 12]. In our system, however, the range in latencies of evoked binary responses was often much greater than the refractory period, which could not have been longer than the 2 msec inter-spike intervals observed during epochs of spontaneous spiking, indicating that binary spiking did not result from any intrinsic property of the spike generating mechanism (Fig. 4a3). Moreover, a single stimulus-evoked spike could suppress subsequent spikes for as long as hundreds of milliseconds (e.g. Figs. 1d,4d), supporting the idea that binary spiking arises through a circuit-level, rather than a single-neuron, mechanism. Indeed, the fact that this suppression is observed even in the cortex of awake animals [15] suggests that binary spiking is not a special property of the anesthetized state. It seems surprising that binary spiking in the cortex has not previously been remarked upon. In the auditory cortex the explanation may be in part technical: Because firing rates in the auditory cortex tend to be low, multi-unit recording is often used to maximize the total amount of data collected. Moreover, our use of cell-attached recording minimizes the usual bias toward responsive or active neurons. Such explanations are not, however, likely to account for the failure to observe binary spiking in the visual cortex, where spike count statistics have been scrutinized more closely [3-7]. One possibility is that this reflects a fundamental difference between the auditory and visual systems. 
An alternative interpretation, and one that we favor, is that the difference rests not in the sensory modality, but instead in the difference between the stimuli used. In this view, the binary responses may not be limited to the auditory cortex; neurons in visual and other sensory cortices might exhibit similar responses to the appropriate stimuli. For example, the tone pips we used might be the auditory analog of a brief flash of light, rather than the oriented moving edges or gratings usually used to probe the primary visual cortex. Conversely, auditory stimuli analogous to edges or gratings [16, 17] may be more likely to elicit conventional, rate-modulated Poisson responses in the auditory cortex. Indeed, there may be a continuum between binary and Poisson modes. Thus, even in conventional rate-modulated responses, the first spike is often privileged in that it carries most of the information in the spike train [5, 14, 18]. The first spike may be particularly important as a means of rapidly signaling stimulus transients.

Figure 4: (a) The lack of multi-spike responses elicited by the neuron shown in Fig. 3a was not due to an absolute refractory period, since the range of latencies for many tones, like that shown here, was much greater than any reasonable estimate for the neuron’s refractory period. (a1) Experimentally recorded responses. (a2) Using the smoothed post-stimulus time histogram (PSTH; bottom) from the set of responses in Fig. 4a, we generated rasters under the assumption of Poisson firing. In this representative example, four double-spike responses (arrows at left) were produced in 25 trials. (a3) We then generated rasters assuming that the neuron fired according to a Poisson process subject to a hard refractory period of 2 msec. Even with a refractory period, this representative example includes one triple- and three double-spike responses. The minimum inter-spike interval during spontaneous firing events was less than 2 msec for five of our neurons, so 2 msec is a conservative upper bound for the refractory period. (b) Spontaneous activity is reduced following high-probability responses. The PSTH (top; 0.25 msec bins) of the combined responses from the 25% (8/32) of tones that elicited the largest responses from the same neuron as in Figs. 3a and 4a illustrates a preclusion of spontaneous and evoked activity for over 200 msec following stimulation. The PSTHs from progressively less responsive groups of tones show progressively less preclusion following stimulation. (c) Fewer noisy binary neurons need to be pooled to achieve the same “signal-to-noise ratio” (SNR; see ref. [24]) as a collection of Poisson neurons. The ratio of the number of Poisson to binary neurons required to achieve the same SNR is plotted against the mean number of spikes elicited per neuron following stimulation; here we have defined the SNR to be the ratio of the mean spike count to the standard deviation of the spike count. (d) Spike probability tuning curve for the same neuron as in Figs. 1c-e and 2b, fit to a Gaussian in tone frequency. [Panel a: rasters and PSTH, time (msec); panel b: response-probability PSTHs, time (msec); panel c: ratio of pool sizes vs. mean spike count per neuron; panel d: response probability vs. tone frequency (kHz), N = 32 tones.]

Binary responses suggest a mode that complements conventional rate coding. In the simplest rate-coding model, a stimulus parameter (such as the frequency of a tone) governs only the rate at which a neuron generates spikes, but not the detailed positions of the spikes; the actual spike train itself is an instantiation of a random process (such as a Poisson process). By contrast, in the binomial model, the stimulus parameter (frequency) is encoded as the probability of firing (Fig. 4d).

Binary coding has implications for cortical computation.
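One such implication, the pooling advantage plotted in Fig. 4c, follows from a short calculation. The derivation below is a sketch under assumptions not stated in the text: the pooled neurons are statistically independent and identical, each with mean spike count $\mu$ (with $0 < \mu < 1$), and the pooled signal is the summed spike count of $N$ neurons. The SNR is the ratio of mean to standard deviation, as defined in the Fig. 4 caption.

```latex
\begin{align}
\mathrm{SNR}_{\mathrm{Poisson}}(N) &= \frac{N\mu}{\sqrt{N\mu}} = \sqrt{N\mu},
  && \text{(Poisson: variance equals mean)} \\
\mathrm{SNR}_{\mathrm{binary}}(N) &= \frac{N\mu}{\sqrt{N\mu(1-\mu)}} = \sqrt{\frac{N\mu}{1-\mu}},
  && \text{(binary: variance is } \mu(1-\mu)\text{)} \\
\mathrm{SNR}_{\mathrm{Poisson}}(N_{\mathrm{P}}) = \mathrm{SNR}_{\mathrm{binary}}(N_{\mathrm{B}})
  \;&\Longrightarrow\; \frac{N_{\mathrm{P}}}{N_{\mathrm{B}}} = \frac{1}{1-\mu}.
\end{align}
```

Under these assumptions the ratio of pool sizes is close to one at low firing probability and grows without bound as $\mu$ approaches one; for example, $\mu = 0.5$ gives a ratio of 2.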
In the rate coding model, stimulus encoding is “ergodic”: a stimulus parameter can be read out either by observing the activity of one neuron for a long time, or a population for a short time. By contrast, in the binary model the stimulus value can be decoded only by observing a neuronal population, so that there is no benefit to integrating over long time periods (cf. ref. [19]). One advantage of binary encoding is that it allows the population to signal quickly; the most compact message a neuron can send is one spike [20]. Binary coding is also more efficient in the context of population coding, as quantified by the signal-to-noise ratio (Fig. 4c).

The precise organization of both spike number and time we have observed suggests that cortical activity consists, at least under some conditions, of packets of spikes synchronized across populations of neurons. Theoretical work [21-23] has shown how such packets can propagate stably from one population to the next, but only if neurons within each population fire at most one spike per packet; otherwise, the number of spikes per packet (and hence the width of each packet) grows at each propagation step. Interestingly, one prediction of stable propagation models is that spike probability should be related to timing precision, a prediction borne out by our observations (Fig. 3). The role of these packets in computation remains an open question.

2 Identification methods for group statistics

We recorded responses to 32 different 25 msec tones from each of 175 neurons from the auditory cortices of 16 Sprague-Dawley rats; each tone was repeated between 5 and 75 times (mean = 19). Thus our ensemble consisted of 32 x 175 = 5600 response sets, with between 5 and 75 samples in each set. Of these, 3055 response sets contained at least one spike on at least one trial. For each response set, we tested the hypothesis that the observed variability was significantly lower than expected from the null hypothesis of a Poisson process.
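The text does not spell out the exact test statistic, so the following Python sketch shows only one simple way to compare a binary response set with a Poisson null. It assumes independent trials and takes the Poisson mean to be the observed mean spike count; all function names are hypothetical.

```python
import math


def prob_all_single_spike(n_trials, rate):
    """P(exactly one spike on every trial) if counts are Poisson with mean `rate`."""
    return (rate * math.exp(-rate)) ** n_trials


def prob_no_multispike(n_trials, rate):
    """P(no trial has more than one spike) if counts are Poisson with mean `rate`."""
    return ((1.0 + rate) * math.exp(-rate)) ** n_trials


def is_assessable(n_trials, rate, alpha=0.001):
    """True if a response set with no multi-spike trials would be
    significantly sub-Poisson at level `alpha` under this simple test."""
    return prob_no_multispike(n_trials, rate) < alpha


# One spike on each of 25 trials (as in Fig. 1c): Poisson probability exp(-25).
print(prob_all_single_spike(25, 1.0))

# With few trials or a low firing probability, even a Poisson process rarely
# produces multi-spike trials, so the two models cannot be told apart.
print(is_assessable(25, 1.0), is_assessable(25, 0.1), is_assessable(5, 1.0))
```

For 25 trials with exactly one spike each, the first probability is exp(-25), on the order of 10^-11, which is consistent with the value quoted in the Fig. 1c caption.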
The ability to assess significance depended on two parameters: the sample size (5-75) and the firing probability. Intuitively, the dependence on firing probability arises because at low firing rates most responses produce only trials with 0 or 1 spikes under both the Poisson and binary models; only at high firing rates do the two models make different predictions, since in that case the Poisson model includes many trials with 2 or even 3 spikes while the binary model generates only solitary spikes (see Fig. 4a1,a2). Using a stringent significance criterion of p<0.001, 467 response sets had a sufficient number of repeats to assess significance, given the observed firing probability. Of these, half (242/467 = 52%) were significantly less variable than expected by chance, five hundred-fold higher than the 467/1000 = 0.467 response sets expected, based on the 0.001 significance criterion, to yield a binary response set. Seventy-two neurons had at least one response set for which significance could be assessed, and of these, 49 neurons (49/72 = 68%) had at least one significantly sub-Poisson response set. Of this population of 49 neurons, five achieved low variability through repeatable bursty behavior (e.g., every spike count was either 0 or 3, but not 1 or 2) and were excluded from further analysis. The remaining 44 neurons formed the basis for the group statistics analyses shown in Figs. 2a and 3b. Nine of these neurons were subjected to an additional protocol consisting of at least 10 presentations each of 100 msec tones and 25 msec tones of all 32 frequencies. Of the 100 msec stimulation response sets, 44 were found to be significantly sub-Poisson at the p<0.05 level, in good agreement with the 43 found to be significant among the responses to 25 msec tones.

3 Bibliography

1. Kilgard, M.P. and M.M. Merzenich, Cortical map reorganization enabled by nucleus basalis activity. Science, 1998. 279(5357): p. 1714-8.
2. Sally, S.L. and J.B. Kelly, Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol, 1988. 59(5): p. 1627-38.
3. Softky, W.R. and C. Koch, The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J Neurosci, 1993. 13(1): p. 334-50.
4. Stevens, C.F. and A.M. Zador, Input synchrony and the irregular firing of cortical neurons. Nat Neurosci, 1998. 1(3): p. 210-7.
5. Buracas, G.T., A.M. Zador, M.R. DeWeese, and T.D. Albright, Efficient discrimination of temporal patterns by motion-sensitive neurons in primate visual cortex. Neuron, 1998. 20(5): p. 959-69.
6. Shadlen, M.N. and W.T. Newsome, The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J Neurosci, 1998. 18(10): p. 3870-96.
7. Tolhurst, D.J., J.A. Movshon, and A.F. Dean, The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Res, 1983. 23(8): p. 775-85.
8. Otmakhov, N., A.M. Shirke, and R. Malinow, Measuring the impact of probabilistic transmission on neuronal output. Neuron, 1993. 10(6): p. 1101-11.
9. Friedrich, R.W. and G. Laurent, Dynamic optimization of odor representations by slow temporal patterning of mitral cell activity. Science, 2001. 291(5505): p. 889-94.
10. Kara, P., P. Reinagel, and R.C. Reid, Low response variability in simultaneously recorded retinal, thalamic, and cortical neurons. Neuron, 2000. 27(3): p. 635-46.
11. Gur, M., A. Beylin, and D.M. Snodderly, Response variability of neurons in primary visual cortex (V1) of alert monkeys. J Neurosci, 1997. 17(8): p. 2914-20.
12. Berry, M.J., D.K. Warland, and M. Meister, The structure and precision of retinal spike trains. Proc Natl Acad Sci U S A, 1997. 94(10): p. 5411-6.
13. de Ruyter van Steveninck, R.R., G.D. Lewen, S.P. Strong, R. Koberle, and W. Bialek, Reproducibility and variability in neural spike trains. Science, 1997. 275(5307): p. 1805-8.
14. Heil, P., Auditory cortical onset responses revisited. I. First-spike timing. J Neurophysiol, 1997. 77(5): p. 2616-41.
15. Lu, T., L. Liang, and X. Wang, Temporal and rate representations of time-varying signals in the auditory cortex of awake primates. Nat Neurosci, 2001. 4(11): p. 1131-8.
16. Kowalski, N., D.A. Depireux, and S.A. Shamma, Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol, 1996. 76(5): p. 3503-23.
17. deCharms, R.C., D.T. Blake, and M.M. Merzenich, Optimizing sound features for cortical neurons. Science, 1998. 280(5368): p. 1439-43.
18. Panzeri, S., R.S. Petersen, S.R. Schultz, M. Lebedev, and M.E. Diamond, The role of spike timing in the coding of stimulus location in rat somatosensory cortex. Neuron, 2001. 29(3): p. 769-77.
19. Britten, K.H., M.N. Shadlen, W.T. Newsome, and J.A. Movshon, The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci, 1992. 12(12): p. 4745-65.
20. Delorme, A. and S.J. Thorpe, Face identification using one spike per neuron: resistance to image degradations. Neural Netw, 2001. 14(6-7): p. 795-803.
21. Diesmann, M., M.O. Gewaltig, and A. Aertsen, Stable propagation of synchronous spiking in cortical neural networks. Nature, 1999. 402(6761): p. 529-33.
22. Marsalek, P., C. Koch, and J. Maunsell, On the relationship between synaptic input and spike output jitter in individual neurons. Proc Natl Acad Sci U S A, 1997. 94(2): p. 735-40.
23. Kistler, W.M. and W. Gerstner, Stable propagation of activity pulses in populations of spiking neurons. Neural Comput, 2002. 14: p. 987-97.
24. Zohary, E., M.N. Shadlen, and W.T. Newsome, Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 1994. 370(6485): p. 140-3.
25. Abbott, L.F. and P. Dayan, The effect of correlated variability on the accuracy of a population code. Neural Comput, 1999. 11(1): p. 91-101.
6 0.47500679 141 nips-2002-Maximally Informative Dimensions: Analyzing Neural Responses to Natural Signals
7 0.4413071 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex
8 0.42698523 18 nips-2002-Adaptation and Unsupervised Learning
9 0.40712267 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
10 0.37606582 26 nips-2002-An Estimation-Theoretic Framework for the Presentation of Multiple Stimuli
11 0.34477127 28 nips-2002-An Information Theoretic Approach to the Functional Classification of Neurons
12 0.34357882 38 nips-2002-Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
13 0.3301419 187 nips-2002-Spikernels: Embedding Spiking Neurons in Inner-Product Spaces
14 0.31912756 81 nips-2002-Expected and Unexpected Uncertainty: ACh and NE in the Neocortex
15 0.30925822 171 nips-2002-Reconstructing Stimulus-Driven Neural Networks from Spike Times
16 0.28427503 95 nips-2002-Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting
17 0.26321322 160 nips-2002-Optoelectronic Implementation of a FitzHugh-Nagumo Neural Model
18 0.26168552 66 nips-2002-Developing Topography and Ocular Dominance Using Two aVLSI Vision Sensors and a Neurotrophic Model of Plasticity
19 0.25941911 199 nips-2002-Timing and Partial Observability in the Dopamine System
20 0.25071272 136 nips-2002-Linear Combinations of Optic Flow Vectors for Estimating Self-Motion - a Real-World Test of a Neural Model
topicId topicWeight
[(11, 0.018), (23, 0.021), (24, 0.02), (42, 0.048), (54, 0.078), (55, 0.038), (57, 0.014), (64, 0.015), (67, 0.014), (68, 0.035), (69, 0.029), (74, 0.051), (92, 0.018), (98, 0.539)]
simIndex simValue paperId paperTitle
1 0.99067974 129 nips-2002-Learning in Spiking Neural Assemblies
Author: David Barber
Abstract: We consider a statistical framework for learning in a class of networks of spiking neurons. Our aim is to show how optimal local learning rules can be readily derived once the neural dynamics and desired functionality of the neural assembly have been specified, in contrast to other models which assume (sub-optimal) learning rules. Within this framework we derive local rules for learning temporal sequences in a model of spiking neurons and demonstrate its superior performance to correlation (Hebbian) based approaches. We further show how to include mechanisms such as synaptic depression and outline how the framework is readily extensible to learning in networks of highly complex spiking neurons. A stochastic quantal vesicle release mechanism is considered and implications on the complexity of learning discussed.
same-paper 2 0.98488289 103 nips-2002-How Linear are Auditory Cortical Responses?
Author: Maneesh Sahani, Jennifer F. Linden
Abstract: By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus. How well can such a model account for neural responses at the very first stages of auditory cortical processing? To answer this question, we develop a novel methodology for evaluating the fraction of stimulus-related response power in a population that can be captured by a given type of SRF model. We use this technique to show that, in the thalamo-recipient layers of primary auditory cortex, STRF models account for no more than 40% of the stimulus-related power in neural responses.
3 0.96847361 86 nips-2002-Fast Sparse Gaussian Process Methods: The Informative Vector Machine
Author: Ralf Herbrich, Neil D. Lawrence, Matthias Seeger
Abstract: We present a framework for sparse Gaussian process (GP) methods which uses forward selection with criteria based on information-theoretic principles, previously suggested for active learning. Our goal is not only to learn d-sparse predictors (which can be evaluated in O(d) rather than O(n), d ≪ n, n the number of training points), but also to perform training under strong restrictions on time and memory requirements. The scaling of our method is at most O(n · d^2), and in large real-world classification experiments we show that it can match prediction performance of the popular support vector machine (SVM), yet can be significantly faster in training. In contrast to the SVM, our approximation produces estimates of predictive probabilities (‘error bars’), allows for Bayesian model selection and is less complex in implementation.
4 0.96554542 92 nips-2002-FloatBoost Learning for Classification
Author: Stan Z. Li, Zhenqiu Zhang, Heung-yeung Shum, Hongjiang Zhang
Abstract: AdaBoost [3] minimizes an upper error bound which is an exponential function of the margin on the training set [14]. However, the ultimate goal in applications of pattern classification is always minimum error rate. On the other hand, AdaBoost needs an effective procedure for learning weak classifiers, which by itself is difficult especially for high dimensional data. In this paper, we present a novel procedure, called FloatBoost, for learning a better boosted classifier. FloatBoost uses a backtrack mechanism after each iteration of AdaBoost to remove weak classifiers which cause higher error rates. The resulting float-boosted classifier consists of fewer weak classifiers yet achieves lower error rates than AdaBoost in both training and test. We also propose a statistical model for learning weak classifiers, based on a stagewise approximation of the posterior using an overcomplete set of scalar features. Experimental comparisons of FloatBoost and AdaBoost are provided through a difficult classification problem, face detection, where the goal is to learn from training examples a highly nonlinear classifier to differentiate between face and nonface patterns in a high dimensional space. The results clearly demonstrate the promises made by FloatBoost over AdaBoost.
5 0.96138823 56 nips-2002-Concentration Inequalities for the Missing Mass and for Histogram Rule Error
Author: Luis E. Ortiz, David A. McAllester
Abstract: This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding’s inequality, the Angluin-Valiant bound, Bernstein’s inequality, Bennett’s inequality, or McDiarmid’s theorem.
6 0.91661257 59 nips-2002-Constraint Classification for Multiclass Classification and Ranking
7 0.87729156 79 nips-2002-Evidence Optimization Techniques for Estimating Stimulus-Response Functions
8 0.84665596 184 nips-2002-Spectro-Temporal Receptive Fields of Subthreshold Responses in Auditory Cortex
9 0.84140038 50 nips-2002-Circuit Model of Short-Term Synaptic Dynamics
10 0.83155441 43 nips-2002-Binary Coding in Auditory Cortex
11 0.81547451 12 nips-2002-A Neural Edge-Detection Model for Enhanced Auditory Sensitivity in Modulated Noise
12 0.80955482 110 nips-2002-Incremental Gaussian Processes
13 0.80411041 102 nips-2002-Hidden Markov Model of Cortical Synaptic Plasticity: Derivation of the Learning Rule
14 0.79002249 116 nips-2002-Interpreting Neural Response Variability as Monte Carlo Sampling of the Posterior
15 0.7822423 180 nips-2002-Selectivity and Metaplasticity in a Unified Calcium-Dependent Model
16 0.78178346 41 nips-2002-Bayesian Monte Carlo
17 0.77550536 199 nips-2002-Timing and Partial Observability in the Dopamine System
18 0.76484883 81 nips-2002-Expected and Unexpected Uncertainty: ACh and NE in the Neocortex
19 0.75327849 7 nips-2002-A Hierarchical Bayesian Markovian Model for Motifs in Biopolymer Sequences
20 0.74888217 148 nips-2002-Morton-Style Factorial Coding of Color in Primary Visual Cortex