nips nips2008 nips2008-172 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Matt Jones, Sachiko Kinoshita, Michael C. Mozer
Abstract: In most cognitive and motor tasks, speed-accuracy tradeoffs are observed: Individuals can respond slowly and accurately, or quickly yet be prone to errors. Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. The model assumes that responding is based on the posterior distribution over which response is correct, conditioned on the accumulated evidence. We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. Trial-by-trial tracking of difficulty thus leads to sequential effects in speed and accuracy. Simulations show the model explains a variety of phenomena in human speeded decision making. We argue this passive statistical mechanism provides a more elegant and parsimonious account than extant theories based on elaborate control structures. 1
Reference: text
sentIndex sentText sentNum sentScore
1 Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. [sent-11, score-0.454]
2 When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. [sent-14, score-0.827]
3 We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. [sent-15, score-0.808]
4 We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. [sent-17, score-0.824]
5 Speed-accuracy tradeoffs are due to the fact that evidence supporting the correct response accumulates gradually over time (Rabbitt & Vyas, 1970; Gold & Shadlen, 2002). [sent-27, score-0.432]
6 Recent theories have cast response initiation in terms of optimality (Bogacz et al.). [sent-30, score-0.376]
7 We argue that this estimate in turn requires knowledge of the task difficulty, or specifically, the rate at which evidence supporting the correct response accumulates over time. [sent-33, score-0.485]
8 If a task is performed repeatedly, task difficulty can be estimated over a series of trials, suggesting that optimal decision processes should show sequential effects, in which performance on one trial depends on the difficulty of recent trials. [sent-34, score-0.366]
9 We describe an experimental paradigm that offers behavioral evidence of sequential effects in response initiation. [sent-35, score-0.525]
10 We summarize key phenomena from this paradigm, and show that these phenomena are predicted by a model of response initiation. [sent-53, score-0.332]
11 Our work achieves two goals: (1) offering a better understanding of and a computational characterization of control processes involved in response initiation, and (2) offering a rational basis for sequential effects in simple stimulus-response tasks. [sent-54, score-0.385]
12 Several studies (e.g., Gold & Shadlen, 2002; Ratcliff, Cherian, & Segraves, 2003) have provided converging evidence for a theory of cortical decision making, known as the diffusion decision model or DDM (see recent review by Ratcliff & McKoon, 2007). [sent-57, score-0.467]
13 A noisy neural integrator accumulates evidence over time; positive evidence supports one response, negative evidence the other. [sent-59, score-0.414]
14 The variables µ and σ are called the drift and diffusion rates. [sent-61, score-0.534]
15 A response is initiated when the accumulated evidence reaches a positive or negative threshold, i. [sent-62, score-0.361]
16 This evidence supports the target response via a positive drift rate, µR∗, whereas the drift rates of the other possible color names, {µi | i ≠ R∗}, are zero. [sent-71, score-1.099]
17 For example, an aqua patch provides no evidence for the response ’blue’, although our model could be extended in this way. [sent-74, score-0.326]
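A minimal sketch of this accumulation process, assuming an Euler discretization with time step dt, the target at index 0, and illustrative values for µtgt and σ (these choices are mine, not the paper's):

```python
import numpy as np

def simulate_accumulators(mu_tgt, sigma=1.0, n_alternatives=4, dt=0.01, t_max=5.0, seed=0):
    """Noisy evidence accumulation for one nAFC trial.

    One accumulator per response alternative: the target (index 0) drifts upward
    at rate mu_tgt, the distractors have zero drift, and every accumulator
    receives independent Gaussian noise with diffusion rate sigma.
    Returns the time axis and the accumulated-evidence trajectories.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(t_max / dt)
    drifts = np.zeros(n_alternatives)
    drifts[0] = mu_tgt                                   # target response R*
    noise = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_steps, n_alternatives))
    x = np.cumsum(drifts * dt + noise, axis=0)           # accumulated evidence over time
    t = np.arange(1, n_steps + 1) * dt
    return t, x

t, x = simulate_accumulators(mu_tgt=1.5)
print(x[-1])  # final evidence totals; the target typically finishes highest
```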
18 Instead, candidate rules are based on the posterior probability that a particular response is correct given the observed evidence up to the current time, P (R∗ = r|X). [sent-80, score-0.395]
19 The simulations reported here use a decision rule that initiates responding when the accuracy of the response is above a threshold, θ: If ∃r such that P (R∗ = r|X) ≥ θ, then initiate response r. [sent-88, score-0.553]
20 Baum and Veeravalli (1994; see also Bogacz & Gurney, 2007) derive P (R∗ = r|X) for the case where all nontargets have the same drift rate, µnontgt , the target has drift rate µtgt , and µnontgt , µtgt , and σ are known. [sent-93, score-0.786]
21 (We introduce the µtgt and µnontgt notation to refer to these drift rates even in the absence of information about R∗.) [sent-94, score-0.412]
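For the known-drift case just described, the posterior over which response is the target reduces to a softmax of the accumulated evidence scaled by µtgt/σ² (terms shared by all hypotheses cancel). The sketch below combines that posterior with the threshold decision rule quoted above (sentence 19); it is an illustration under these assumptions, not the paper's code, and θ = 0.95 is an arbitrary choice.

```python
import numpy as np

def posterior_known_drift(x, mu_tgt, sigma):
    """P(R* = r | X) when the target drift mu_tgt is known, nontarget drifts are
    zero, and all responses are equally likely a priori. Under these assumptions
    the posterior is a softmax of the evidence scaled by mu_tgt / sigma**2."""
    z = mu_tgt * np.asarray(x, dtype=float) / sigma**2
    z -= z.max()                      # for numerical stability
    p = np.exp(z)
    return p / p.sum()

def maybe_respond(x, mu_tgt, sigma, theta=0.95):
    """Threshold rule: if some response has posterior probability >= theta,
    initiate that response; otherwise keep accumulating (return None)."""
    p = posterior_known_drift(x, mu_tgt, sigma)
    r = int(np.argmax(p))
    return (r, float(p[r])) if p[r] >= theta else (None, float(p[r]))

# Example: the target (index 0) has pulled well ahead of three distractors.
print(maybe_respond([3.0, 0.4, -0.1, 0.2], mu_tgt=1.5, sigma=1.0))  # -> (0, ~0.957)
```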
22 The diffusion rate of a random walk, σ 2 , can be determined with arbitrary precision from a single observed trajectory, but the drift rate cannot (see Supplementary Material – available at http://matt. [sent-96, score-0.662]
23 Consider the case where the drift rate of the target is a random variable, µtgt ∼ N(a, b²), and the drift rate of all nontargets, µnontgt, is zero. [sent-105, score-0.85]
24 The middle panel of Figure 1 shows P(R∗|X, a, b) (Equation 3) as a function of processing time for the diffusion trace in the left panel, when the true drift rate is known (a = µtgt and b = 0). [sent-107, score-0.641]
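Under the prior µtgt ∼ N(a, b²) on the target drift (with zero nontarget drift and known σ), integrating the drift out of the known-drift posterior gives a posterior of the following form. This is a reconstruction consistent with the denominator 2σ²(σ² + Tb²) that survives in the source, not a verbatim copy of the paper's Equation 3:

```latex
P(R^* = r \mid X, a, b)
  = \frac{\exp\!\big( [\, 2 a \sigma^2 \Delta x_r(T) + b^2 \Delta x_r(T)^2 \,] \, / \, [\, 2\sigma^2(\sigma^2 + T b^2) \,] \big)}
         {\sum_{r'} \exp\!\big( [\, 2 a \sigma^2 \Delta x_{r'}(T) + b^2 \Delta x_{r'}(T)^2 \,] \, / \, [\, 2\sigma^2(\sigma^2 + T b^2) \,] \big)}
```

Both a larger prior mean a and more accumulated evidence Δx_r(T) sharpen this posterior, which is the sense in which higher drift estimates lead (normatively) to faster responding.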
25 Estimating Drift: To recap, we have argued that optimal response initiation in nAFC tasks requires calculation of the posterior response distribution, which in turn depends on assumptions about the drift rate of the target response. [sent-109, score-1.048]
26 We proposed a decision rule based on a probabilistic framework (Equations 1 and 3) that permits uncertainty in the drift rate, but requires a characterization of the prior distribution of this variable. [sent-110, score-0.49]
27 We turn now to the estimation of the model’s drift distribution parameters, â and b̂. [sent-117, score-0.361]
28 Consider a sequence of trials, k = 1, . . . , K, in which the same decision task is performed with different stimuli, and the drift rate of the target response on trial k is µ(k). [sent-121, score-0.916]
29 Following each trial, the drift rate can also be estimated: µ̂tgt(k) = ∆xR∗(Tk)/Tk, where Tk is the time taken to respond on trial k. [sent-122, score-0.582]
30 If the task environment changes slowly, the drift rates over trials will be autocorrelated, and the drift distribution parameters on trial k can be estimated from past trial history, {µ̂tgt(1), . . . , µ̂tgt(k − 1)}. [sent-123, score-1.177]
31 Equation 4 defines â and b̂ in terms of the quantities {vi(k)}, where k is an index over trials and the {vi(k)} are moment statistics of the drift distribution, updated following each trial using an exponential weighting constant, λ ∈ [0, 1]: vi(k) = λvi(k − 1) + µ̂tgt(k − 1)^i. [sent-129, score-0.518]
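A minimal sketch of this trial-by-trial tracking: the vi(k) update follows the rule quoted above, while the recovery of â (weighted mean) and b̂ (weighted standard deviation) from the moments is my reading of Equation 4 rather than a verbatim copy, and λ and the initial values are illustrative.

```python
import numpy as np

LAMBDA = 0.9   # exponential weighting constant lambda in [0, 1] (illustrative)

def update_moments(v, mu_hat):
    """v = (v0, v1, v2); after each trial, v_i <- lambda * v_i + mu_hat**i,
    as in the update rule quoted above."""
    return [LAMBDA * v[i] + mu_hat ** i for i in range(3)]

def drift_params(v):
    """Recover the drift-distribution parameters from the moments: a_hat as the
    weighted mean and b_hat as the weighted standard deviation. This is a
    plausible reading of Equation 4, not a verbatim copy of it."""
    a_hat = v[1] / v[0]
    b_hat = np.sqrt(max(v[2] / v[0] - a_hat ** 2, 0.0))
    return a_hat, b_hat

# Per-trial estimates mu_hat(k) = dx_{R*}(T_k) / T_k drive the update.
v = [1e-3, 1e-3, 1e-3]                    # small nonzero start to avoid division by zero
for mu_hat in [1.8, 1.6, 0.6, 1.7, 0.5]:  # a mixed run of easy (high) and hard (low) drifts
    v = update_moments(v, mu_hat)
print(drift_params(v))                    # a_hat falls between the easy and hard drift values
```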
32 The Blocking Effect: The optimal decision framework we have proposed naturally leads to the prediction that performance on the current trial is influenced by drift rates observed on recent trials. [sent-132, score-0.659]
33 Because drift rates determine the signal-to-noise ratio of the diffusion process, they reflect the difficulty of the task at hand. [sent-133, score-0.617]
34 Thus, the framework predicts that an optimal decision maker should show sequential effects based on recent trial difficulty. [sent-134, score-0.387]
35 By definition, individuals have faster response times (RTs) and lower error rates to easy items. [sent-139, score-0.401]
36 Consider an experimental paradigm consisting of three blocks: just easy items (pure easy), just hard items (pure hard), and a mixture of both in random order (mixed). [sent-141, score-0.684]
37 When presented in a mixed block, easy items slow down relative to a pure block and hard items speed up. [sent-142, score-1.027]
38 This phenomenon, known as the blocking effect (not to be confused with blocking in associative learning), suggests that the response-initiation processes use information not only from the current stimulus, but also from the stimulus environment in which it is operating. [sent-143, score-0.88]
39 Table 1 shows a typical blocking result for a word-reading task, where word frequency is used to manipulate difficulty. [sent-144, score-0.386]
40 They are obtained when stimulus or response characteristics alternate from trial to trial. [sent-150, score-0.421]
41 Thus, the blocking effect is not associated with a specific stimulus or response pathway, but rather is a general phenomenon of response initiation. [sent-151, score-0.97]
42 In some tasks (e.g., lexical decision, priming), significantly more slowdown than speedup can be observed. [sent-158, score-0.327]
43 The RT difference between easy and hard items does not fully disappear in mixed blocks. [sent-160, score-0.586]
44 That is, the variability in RT on trial k due to trial k − l decreases rapidly with l. [sent-167, score-0.314]
45 Overt responses are necessary for obtaining blocking effects, but overt errors are not. [sent-170, score-0.492]
46 4 Explanations for the Blocking Effect The blocking effect demonstrates that the response time depends not only on information accruing from the current stimulus, but also on recent stimuli in the trial history. [sent-171, score-0.811]
47 Therefore, any explanation of the blocking effect must specify the manner by which response initiation processes are sensitive to the composition of a block. [sent-172, score-0.851]
48 Because of the ubiquity of blocking effects across tasks, domain-specific explanations lack parsimony. (Table 1: RTs and Error Rates for Blocking study of Lupker, Brown, & Columbo, 1997, Expt.) [sent-177, score-0.501]
49 Parsimony is achieved only if the adaptation mechanism is localized to a stage of response initiation common across stimulus-response tasks. [sent-185, score-0.378]
50 Simulations of this model have explained the basic blocking effect, but not the complete set of phenomena we listed previously. [sent-189, score-0.446]
51 Of greater concern is the fact that the model predicts the time taken to utter the response (when the response mode is verbal) decreases with increased speed pressure, which does not appear to be true (Damian, 2003). [sent-190, score-0.466]
52 Because a higher criterion produces slower RTs and lower error rates, this leads to slowdown of easy items and speedup of hard items in a mixed block. [sent-194, score-1.155]
53 Nonetheless, there are four reasons for being skeptical about an account of the blocking effect based on adjustment of an evidence criterion. [sent-195, score-0.556]
54 (2) Taylor and Lupker (2001) illustrate that adaptation of an evidence criterion can—at least in some models— yield incorrect predictions concerning the blocking effect. [sent-197, score-0.589]
55 (3) Strayer and Kramer (1994) attempted to model the blocking effect for a 2AFC task using an adaptive response criterion in the DDM. [sent-198, score-0.741]
56 Their account fit the data but had a critical shortcoming: they needed to allow different criteria for easy and hard items in a mixed block, which makes no sense because setting differential criteria requires knowing the trial type, and the trial type is not known in advance. [sent-199, score-0.9]
57 Our Account: Sequential Estimation of Task Difficulty. Having argued that existing accounts of the blocking effect are inadequate, we return to our analysis of nAFC tasks, and show that it provides a parsimonious account of blocking effects. [sent-203, score-0.828]
58 Our account is premised on the assumption that response initiation processes are in some sense optimal. [sent-204, score-0.344]
59 Regardless of the specific optimality criterion, optimal response initiation requires an estimate of accuracy, specifically, the probability that a response will be correct conditioned on the evidence accumulated thus far, P (R∗ = r|X). [sent-205, score-0.768]
60 The response posterior, P (R∗ = r|X), under our generative model of the task environment (Equation 3) predicts a blocking effect. [sent-207, score-0.63]
61 How does this fact relate to the blocking effect? [sent-214, score-0.386]
62 Easy items have, by definition, a higher mean drift than hard items; therefore, the estimated drift in the easy condition will be greater than in the hard condition, E[âE] > E[âH]. [sent-215, score-1.301]
63 Any learning rule for â based on recent history will yield an estimated drift in the mixed condition between those of the easy and hard conditions. [sent-216, score-0.802]
64 With response times related to â, an easy item will slow down in the mixed condition relative to the pure, and a hard item will speed up. [sent-219, score-0.692]
65 A blocking effect of any magnitude in the model could therefore be transformed to fit any pattern of data that had the right qualitative features. [sent-225, score-0.442]
66 Figure 2: Simulation of the blocking paradigm with random parameter settings. [sent-227, score-0.386]
67 (c) Scatterplot of change in error rate between pure and mixed conditions for easy and hard items. [sent-231, score-0.52]
68 In addition, to simulate the blocking effect, we must specify the true drift distributions for easy and hard items, i. [sent-233, score-0.975]
69 (We might also allow for nonzero drift rates for some or all of the distractor responses.) [sent-236, score-0.412]
70 To explore the robustness of the model, we performed 1200 replications of a blocking simulation, each with randomly drawn values for the eight free parameters. [sent-237, score-0.386]
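A compact sketch of a single replication is shown below. It strings together the pieces above: evidence accumulation, a posterior-threshold response rule, and exponentially weighted drift tracking. Two simplifications are assumptions of this sketch rather than features of the paper's model: the posterior is computed with the point estimate â in place of the full Equation-3 posterior, and the per-trial drift estimate is taken from the chosen accumulator. Parameter values are illustrative, not the eight parameters actually randomized in the reported simulations.

```python
import numpy as np

rng = np.random.default_rng(1)
SIGMA, THETA, LAMBDA, DT = 1.0, 0.95, 0.9, 0.01
MU_EASY, MU_HARD = 2.0, 1.0            # illustrative true mean drifts

def run_trial(mu_true, a_hat):
    """One nAFC trial: accumulate evidence and respond once the posterior,
    computed with the estimated drift a_hat, exceeds THETA for some response."""
    x = np.zeros(4)
    drifts = np.array([mu_true, 0.0, 0.0, 0.0])      # index 0 is the target
    for step in range(1, 5001):
        x += drifts * DT + rng.normal(0.0, SIGMA * np.sqrt(DT), 4)
        z = a_hat * x / SIGMA ** 2
        p = np.exp(z - z.max()); p /= p.sum()
        if p.max() >= THETA:
            t = step * DT
            return t, x[int(np.argmax(p))] / t       # RT and observed drift estimate
    return 50.0, x[0] / 50.0                         # give up (rare)

def run_block(trial_drifts):
    v = np.array([1e-3, 1e-3, 1e-3])                 # moment statistics v0, v1, v2
    rts = []
    for mu in trial_drifts:
        a_hat = max(v[1] / v[0], 0.1)                # drift estimate from recent history
        rt, mu_obs = run_trial(mu, a_hat)
        v = LAMBDA * v + np.array([1.0, mu_obs, mu_obs ** 2])
        rts.append(rt)
    return np.array(rts)

n = 400
pure_easy = run_block(np.full(n, MU_EASY))
pure_hard = run_block(np.full(n, MU_HARD))
mixed_mus = rng.choice([MU_EASY, MU_HARD], size=n)
mixed_rts = run_block(mixed_mus)
print("easy RT: pure %.2f  mixed %.2f" % (pure_easy.mean(), mixed_rts[mixed_mus == MU_EASY].mean()))
print("hard RT: pure %.2f  mixed %.2f" % (pure_hard.mean(), mixed_rts[mixed_mus == MU_HARD].mean()))
```

With these settings, easy trials should slow down and hard trials speed up in the mixed block relative to the pure blocks, the qualitative pattern discussed below.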
71 We discarded from our analysis simulations in which the error rates were grossly unlike those obtained in experimental studies, specifically, where the mean error rate in any condition was above 20%, and where the error rates for easy and hard items differed by more than a factor of 10. [sent-252, score-0.622]
72 Figure 2a shows a scatterplot comparing the speedup of hard items (from pure to mixed conditions) to the slowdown of easy items. [sent-253, score-1.023]
73 Thus, the model shows a key signature of the behavioral data—symmetric blocking effects (Phenomenon P2). [sent-260, score-0.53]
74 This percentage is 100 if easy RTs slow down and hard RTs speed up to become equal; the percentage is 0 if there is no slowdown of easy RTs or speedup of hard RTs. [sent-262, score-0.848]
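One way to write the measure just described, as the percentage reduction of the easy-hard RT gap from pure to mixed blocks (a reconstruction consistent with this description; the paper's exact formula is not shown here, and the example RTs are made up):

```python
def blocking_reduction(easy_pure, hard_pure, easy_mixed, hard_mixed):
    """Percentage reduction of the easy-hard RT difference in the mixed block:
    100 when easy and hard RTs converge completely, 0 when the gap is unchanged."""
    return 100.0 * (1.0 - (hard_mixed - easy_mixed) / (hard_pure - easy_pure))

print(blocking_reduction(easy_pure=500, hard_pure=560, easy_mixed=512, hard_mixed=548))  # -> 40.0
```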
75 The simulation runs show a 10–30% reduction as a result of the blocking manipulation. [sent-263, score-0.43]
76 Nonetheless, the model shows the key property that easy RTs are still faster than hard RTs in the mixed condition (Phenomenon P3). [sent-269, score-0.358]
77 Figure 2c shows a scatterplot of the change in error rate for easy items (from pure to mixed conditions) versus change in error rate for hard items. [sent-270, score-0.865]
78 Consistent with the behavioral data (Phenomenon P4), a speed-accuracy tradeoff is observed: When easy items slow down in the mixed versus pure conditions, error rates drop; when hard items speed up, error rates rise. [sent-271, score-1.115]
79 Figure 3: Human (black) and simulation (white) RTs for easy and hard items in a mixed block, conditional on the 0, 1, and 2 previous items (Taylor & Lupker, 2001). [sent-274, score-0.858]
80 Last letter in the trial sequence indicates the current trial and trial order is left to right. [sent-275, score-0.471]
81 Although the blocking effect is typically characterized by comparing performance of an item type across blocks, sequential effects within a block have also been examined. [sent-278, score-0.695]
82 Trial k is most influenced by trial k − 1, but trial k − 2 modulates RTs as well. [sent-282, score-0.314]
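Given per-trial RTs and item types from a mixed run (for example, the mixed_rts and mixed_mus arrays in the earlier sketch), conditional means like those in Figure 3 can be approximated with a short helper; conditioning on only the immediately preceding trial is a simplification of the figure's analysis, which goes two trials back.

```python
import numpy as np

def conditional_rts(rts, mus, mu_easy):
    """Mean RT in a mixed block, split by the previous and current item type
    ('e' = easy, 'h' = hard). Conditions on one trial back only; Figure 3
    additionally conditions on two trials back."""
    labels = np.where(np.asarray(mus) == mu_easy, "e", "h")
    rts = np.asarray(rts)
    return {prev + cur: float(rts[1:][(labels[1:] == cur) & (labels[:-1] == prev)].mean())
            for prev in "eh" for cur in "eh"}

# e.g. conditional_rts(mixed_rts, mixed_mus, MU_EASY)
# -> {'ee': ..., 'eh': ..., 'he': ..., 'hh': ...}; 'he' should exceed 'ee', mirroring the blocking data
```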
83 We have addressed all of the key phenomena of the blocking effect except two. [sent-293, score-0.502]
84 The ubiquity of the effect is completely consistent with our focus on general mechanisms of response initiation. [sent-295, score-0.338]
85 The model does not make any claims about the specific domain or the cause of variation in drift rates. [sent-296, score-0.361]
86 Phenomenon P6 states that overt responses are required to obtain the blocking effect. [sent-297, score-0.492]
87 Although the model cannot lay claim to distinctions between overt and covert responses, it does require that a drift estimate, µ̂tgt, be obtained on each trial in order to adjust â and b̂, which leads to blocking effects. [sent-298, score-0.952]
88 In turn, µ̂tgt is determined at the point in the diffusion process when a response would be initiated. [sent-299, score-0.385]
89 Thus, the model claims that selecting a response on trial k is key to influencing performance on trial k + 1. [sent-300, score-0.526]
90 Conclusions: We have argued that optimal response initiation in speeded choice tasks requires advance knowledge about the difficulty of the current decision. [sent-301, score-0.413]
91 Difficulty corresponds to the expected rate of evidence accumulation for the target response relative to distractors. [sent-302, score-0.39]
92 This is consistent with the empirically observed blocking effect, whereby responses are slower to easy items and faster to hard items when those items are interleaved, compared to when item types are presented in separate blocks. [sent-305, score-1.396]
93 According to our model, mixed blocks induce estimates of local difficulty that are intermediate between those in pure easy and pure hard blocks. [sent-306, score-0.554]
94 The resultant overestimation of difficulty for easy items leads to increased decision times, while an opposite effect occurs for hard items. [sent-307, score-0.602]
95 Evidence for each response accrues in a random walk, with positive drift rate µtgt for the correct response and zero drift for distractors. [sent-309, score-1.241]
96 Underestimation of the drift rate, as with easy trials in a mixed block, leads to damping of the computed posterior and response slowdown. [sent-312, score-0.904]
97 Overestimation, as with hard trials in a mixed block, leads to exaggeration of the posterior and response speedup. [sent-313, score-0.561]
98 The model successfully explains the full range of phenomena associated with the blocking effect, including the effects on both RTs and errors, the patterns of slowdown of easy items and speedup of hard items, and the detailed sequential effects of recent trials. [sent-314, score-1.413]
99 For example, the model could be extended to allow nonzero drift rates for some of the distractors, reflecting the similarity structure of perceptual representations. [sent-318, score-0.412]
100 The diffusion decision model: Theory and data for two-choice decision tasks. [sent-422, score-0.353]
wordName wordTfidf (topN-words)
[('blocking', 0.386), ('drift', 0.361), ('tgt', 0.301), ('items', 0.228), ('rts', 0.226), ('response', 0.212), ('lupker', 0.181), ('slowdown', 0.181), ('diffusion', 0.173), ('trial', 0.157), ('initiation', 0.132), ('mixed', 0.13), ('hard', 0.123), ('evidence', 0.114), ('speedup', 0.105), ('easy', 0.105), ('pure', 0.098), ('bogacz', 0.092), ('decision', 0.09), ('effects', 0.085), ('naming', 0.085), ('ddm', 0.079), ('culty', 0.076), ('block', 0.073), ('ah', 0.066), ('composition', 0.065), ('rate', 0.064), ('gurney', 0.06), ('nafc', 0.06), ('nontgt', 0.06), ('phenomena', 0.06), ('behavioral', 0.059), ('trials', 0.058), ('responses', 0.058), ('effect', 0.056), ('sequential', 0.055), ('taylor', 0.055), ('criterion', 0.055), ('scatterplot', 0.053), ('ratcliff', 0.053), ('phenomenon', 0.052), ('stimulus', 0.052), ('rates', 0.051), ('cognitive', 0.049), ('overt', 0.048), ('ae', 0.048), ('jep', 0.048), ('baum', 0.045), ('coltheart', 0.045), ('veeravalli', 0.045), ('simulation', 0.044), ('history', 0.044), ('panel', 0.043), ('xr', 0.043), ('tradeoffs', 0.043), ('bh', 0.043), ('speed', 0.042), ('lexical', 0.041), ('item', 0.04), ('ms', 0.04), ('mechanisms', 0.04), ('integrator', 0.04), ('strategic', 0.04), ('speeded', 0.04), ('multiresponse', 0.04), ('rule', 0.039), ('posterior', 0.038), ('shadlen', 0.036), ('dif', 0.036), ('accumulated', 0.035), ('adaptation', 0.034), ('brown', 0.034), ('individuals', 0.033), ('control', 0.033), ('optimality', 0.032), ('accumulates', 0.032), ('passive', 0.032), ('percentage', 0.032), ('task', 0.032), ('correct', 0.031), ('cherian', 0.03), ('columbo', 0.03), ('damian', 0.03), ('dorfman', 0.03), ('dragelin', 0.03), ('glanzer', 0.03), ('kello', 0.03), ('kiger', 0.03), ('lmc', 0.03), ('mckoon', 0.03), ('mdm', 0.03), ('plaut', 0.03), ('rabbitt', 0.03), ('rastle', 0.03), ('segraves', 0.03), ('ubiquity', 0.03), ('vyas', 0.03), ('instructions', 0.029), ('tasks', 0.029), ('alternatives', 0.029)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999994 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters
Author: Matt Jones, Sachiko Kinoshita, Michael C. Mozer
Abstract: In most cognitive and motor tasks, speed-accuracy tradeoffs are observed: Individuals can respond slowly and accurately, or quickly yet be prone to errors. Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. The model assumes that responding is based on the posterior distribution over which response is correct, conditioned on the accumulated evidence. We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. Trial-by-trial tracking of difficulty thus leads to sequential effects in speed and accuracy. Simulations show the model explains a variety of phenomena in human speeded decision making. We argue this passive statistical mechanism provides a more elegant and parsimonious account than extant theories based on elaborate control structures. 1
2 0.20331098 231 nips-2008-Temporal Dynamics of Cognitive Control
Author: Jeremy Reynolds, Michael C. Mozer
Abstract: Cognitive control refers to the flexible deployment of memory and attention in response to task demands and current goals. Control is often studied experimentally by presenting sequences of stimuli, some demanding a response, and others modulating the stimulus-response mapping. In these tasks, participants must maintain information about the current stimulus-response mapping in working memory. Prominent theories of cognitive control use recurrent neural nets to implement working memory, and optimize memory utilization via reinforcement learning. We present a novel perspective on cognitive control in which working memory representations are intrinsically probabilistic, and control operations that maintain and update working memory are dynamically determined via probabilistic inference. We show that our model provides a parsimonious account of behavioral and neuroimaging data, and suggest that it offers an elegant conceptualization of control in which behavior can be cast as optimal, subject to limitations on learning and the rate of information processing. Moreover, our model provides insight into how task instructions can be directly translated into appropriate behavior and then efficiently refined with subsequent task experience. 1
3 0.11475424 206 nips-2008-Sequential effects: Superstition or rational behavior?
Author: Angela J. Yu, Jonathan D. Cohen
Abstract: In a variety of behavioral tasks, subjects exhibit an automatic and apparently suboptimal sequential effect: they respond more rapidly and accurately to a stimulus if it reinforces a local pattern in stimulus history, such as a string of repetitions or alternations, compared to when it violates such a pattern. This is often the case even if the local trends arise by chance in the context of a randomized design, such that stimulus history has no real predictive power. In this work, we use a normative Bayesian framework to examine the hypothesis that such idiosyncrasies may reflect the inadvertent engagement of mechanisms critical for adapting to a changing environment. We show that prior belief in non-stationarity can induce experimentally observed sequential effects in an otherwise Bayes-optimal algorithm. The Bayesian algorithm is shown to be well approximated by linear-exponential filtering of past observations, a feature also apparent in the behavioral data. We derive an explicit relationship between the parameters and computations of the exact Bayesian algorithm and those of the approximate linear-exponential filter. Since the latter is equivalent to a leaky-integration process, a commonly used model of neuronal dynamics underlying perceptual decision-making and trial-to-trial dependencies, our model provides a principled account of why such dynamics are useful. We also show that parameter-tuning of the leaky-integration process is possible, using stochastic gradient descent based only on the noisy binary inputs. This is a proof of concept that not only can neurons implement near-optimal prediction based on standard neuronal dynamics, but that they can also learn to tune the processing parameters without explicitly representing probabilities. 1
4 0.075023562 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds
Author: Jeremy Lewi, Robert Butera, David M. Schneider, Sarah Woolley, Liam Paninski
Abstract: Sequential optimal design methods hold great promise for improving the efficiency of neurophysiology experiments. However, previous methods for optimal experimental design have incorporated only weak prior information about the underlying neural system (e.g., the sparseness or smoothness of the receptive field). Here we describe how to use stronger prior information, in the form of parametric models of the receptive field, in order to construct optimal stimuli and further improve the efficiency of our experiments. For example, if we believe that the receptive field is well-approximated by a Gabor function, then our method constructs stimuli that optimally constrain the Gabor parameters (orientation, spatial frequency, etc.) using as few experimental trials as possible. More generally, we may believe a priori that the receptive field lies near a known sub-manifold of the full parameter space; in this case, our method chooses stimuli in order to reduce the uncertainty along the tangent space of this sub-manifold as rapidly as possible. Applications to simulated and real data indicate that these methods may in many cases improve the experimental efficiency. 1
5 0.070477001 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation
Author: Adrian Haith, Carl P. Jackson, R. C. Miall, Sethu Vijayakumar
Abstract: Adaptation of visually guided reaching movements in novel visuomotor environments (e.g. wearing prism goggles) comprises not only motor adaptation but also substantial sensory adaptation, corresponding to shifts in the perceived spatial location of visual and proprioceptive cues. Previous computational models of the sensory component of visuomotor adaptation have assumed that it is driven purely by the discrepancy introduced between visual and proprioceptive estimates of hand position and is independent of any motor component of adaptation. We instead propose a unified model in which sensory and motor adaptation are jointly driven by optimal Bayesian estimation of the sensory and motor contributions to perceived errors. Our model is able to account for patterns of performance errors during visuomotor adaptation as well as the subsequent perceptual aftereffects. This unified model also makes the surprising prediction that force field adaptation will elicit similar perceptual shifts, even though there is never any discrepancy between visual and proprioceptive observations. We confirm this prediction with an experiment. 1
6 0.066506945 124 nips-2008-Load and Attentional Bayes
7 0.066021919 109 nips-2008-Interpreting the neural code with Formal Concept Analysis
8 0.065447234 101 nips-2008-Human Active Learning
9 0.052756093 121 nips-2008-Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement
10 0.051825419 223 nips-2008-Structure Learning in Human Sequential Decision-Making
11 0.051590808 67 nips-2008-Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance
12 0.051374197 134 nips-2008-Mixed Membership Stochastic Blockmodels
13 0.050878525 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues
14 0.046853647 138 nips-2008-Modeling human function learning with Gaussian processes
15 0.045003254 59 nips-2008-Dependent Dirichlet Process Spike Sorting
16 0.044844877 139 nips-2008-Modeling the effects of memory on human online sentence processing with particle filters
17 0.044260073 7 nips-2008-A computational model of hippocampal function in trace conditioning
18 0.042849641 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
19 0.042662974 10 nips-2008-A rational model of preference learning and choice prediction by children
20 0.042194076 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction
topicId topicWeight
[(0, -0.141), (1, 0.075), (2, 0.1), (3, 0.008), (4, 0.013), (5, 0.005), (6, -0.027), (7, 0.083), (8, 0.106), (9, 0.06), (10, -0.012), (11, 0.088), (12, -0.153), (13, 0.002), (14, -0.014), (15, 0.148), (16, 0.055), (17, -0.04), (18, 0.021), (19, 0.003), (20, -0.002), (21, -0.025), (22, 0.069), (23, -0.026), (24, -0.078), (25, 0.029), (26, 0.026), (27, 0.049), (28, -0.005), (29, -0.004), (30, -0.099), (31, 0.02), (32, 0.045), (33, -0.015), (34, -0.035), (35, -0.018), (36, -0.119), (37, 0.0), (38, 0.146), (39, 0.101), (40, 0.114), (41, 0.03), (42, 0.005), (43, 0.054), (44, 0.007), (45, -0.036), (46, 0.042), (47, -0.024), (48, -0.013), (49, 0.016)]
simIndex simValue paperId paperTitle
same-paper 1 0.96867174 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters
2 0.85861856 231 nips-2008-Temporal Dynamics of Cognitive Control
3 0.68470031 124 nips-2008-Load and Attentional Bayes
Author: Peter Dayan
Abstract: Selective attention is a most intensively studied psychological phenomenon, rife with theoretical suggestions and schisms. A critical idea is that of limited capacity, the allocation of which has produced continual conflict about such phenomena as early and late selection. An influential resolution of this debate is based on the notion of perceptual load (Lavie, 2005), which suggests that low-load, easy tasks, because they underuse the total capacity of attention, mandatorily lead to the processing of stimuli that are irrelevant to the current attentional set; whereas high-load, difficult tasks grab all resources for themselves, leaving distractors high and dry. We argue that this theory presents a challenge to Bayesian theories of attention, and suggest an alternative, statistical, account of key supporting data. 1
4 0.62521172 121 nips-2008-Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement
Author: Michael T. Todd, Yael Niv, Jonathan D. Cohen
Abstract: Working memory is a central topic of cognitive neuroscience because it is critical for solving real-world problems in which information from multiple temporally distant sources must be combined to generate appropriate behavior. However, an often neglected fact is that learning to use working memory effectively is itself a difficult problem. The Gating framework [14] is a collection of psychological models that show how dopamine can train the basal ganglia and prefrontal cortex to form useful working memory representations in certain types of problems. We unite Gating with machine learning theory concerning the general problem of memory-based optimal control [5-6]. We present a normative model that learns, by online temporal difference methods, to use working memory to maximize discounted future reward in partially observable settings. The model successfully solves a benchmark working memory problem, and exhibits limitations similar to those observed in humans. Our purpose is to introduce a concise, normative definition of high level cognitive concepts such as working memory and cognitive control in terms of maximizing discounted future rewards. 1
5 0.62507546 7 nips-2008-A computational model of hippocampal function in trace conditioning
Author: Elliot A. Ludvig, Richard S. Sutton, Eric Verbeek, E. J. Kehoe
Abstract: We introduce a new reinforcement-learning model for the role of the hippocampus in classical conditioning, focusing on the differences between trace and delay conditioning. In the model, all stimuli are represented both as unindividuated wholes and as a series of temporal elements with varying delays. These two stimulus representations interact, producing different patterns of learning in trace and delay conditioning. The model proposes that hippocampal lesions eliminate long-latency temporal elements, but preserve short-latency temporal elements. For trace conditioning, with no contiguity between cue and reward, these long-latency temporal elements are necessary for learning adaptively timed responses. For delay conditioning, the continued presence of the cue supports conditioned responding, and the short-latency elements suppress responding early in the cue. In accord with the empirical data, simulated hippocampal damage impairs trace conditioning, but not delay conditioning, at medium-length intervals. With longer intervals, learning is impaired in both procedures, and, with shorter intervals, in neither. In addition, the model makes novel predictions about the response topography with extended cues or post-training lesions. These results demonstrate how temporal contiguity, as in delay conditioning, changes the timing problem faced by animals, rendering it both easier and less susceptible to disruption by hippocampal lesions. The hippocampus is an important structure in many types of learning and memory, with prominent involvement in spatial navigation, episodic and working memories, stimulus configuration, and contextual conditioning. One empirical phenomenon that has eluded many theories of the hippocampus is the dependence of aversive trace conditioning on an intact hippocampus (but see Rodriguez & Levy, 2001; Schmajuk & DiCarlo, 1992; Yamazaki & Tanaka, 2005). For example, trace eyeblink conditioning disappears following hippocampal lesions (Solomon et al., 1986; Moyer, Jr. et al., 1990), induces hippocampal neurogenesis (Gould et al., 1999), and produces unique activity patterns in hippocampal neurons (McEchron & Disterhoft, 1997). In this paper, we present a new abstract computational model of hippocampal function during trace conditioning. We build on a recent extension of the temporal-difference (TD) model of conditioning (Ludvig, Sutton & Kehoe, 2008; Sutton & Barto, 1990) to demonstrate how the details of stimulus representation can qualitatively alter learning during trace and delay conditioning. By gently tweaking this stimulus representation and reducing long-latency temporal elements, trace conditioning is severely impaired, whereas delay conditioning is hardly affected. In the model, the hippocampus is responsible for maintaining these long-latency elements, thus explaining the selective importance of this brain structure in trace conditioning. The difference between trace and delay conditioning is one of the most basic operational distinctions in classical conditioning (e.g., Pavlov, 1927). Figure 1 is a schematic of the two training procedures. In trace conditioning, a conditioned stimulus (CS) is followed some time later by a reward or uncon1 Trace Delay Stimulus Reward Figure 1: Event timelines in trace and delay conditioning. Time flows from left-to-right in the diagram. A vertical bar represents a punctate (short) event, and the extended box is a continuously available stimulus. 
In delay conditioning, the stimulus and reward overlap, whereas, in trace conditioning, there is a stimulus-free gap between the two punctate events. ditioned stimulus (US); the two stimuli are separated by a stimulus-free gap. In contrast, in delay conditioning, the CS remains on until presentation of the US. Trace conditioning is learned more slowly than delay conditioning, with poorer performance often observed even at asymptote. In both eyeblink conditioning (Moyer, Jr. et al., 1990; Solomon et al., 1986; Tseng et al., 2004) and fear conditioning (e.g., McEchron et al., 1998), hippocampal damage severely impairs the acquisition of conditioned responding during trace conditioning, but not delay conditioning. These selective hippocampal deficits with trace conditioning are modulated by the inter-stimulus interval (ISI) between CS onset and US onset. With very short ISIs (∼300 ms in eyeblink conditioning in rabbits), there is little deficit in the acquisition of responding during trace conditioning (Moyer, Jr. et al., 1990). Furthermore, with very long ISIs (>1000 ms), delay conditioning is also impaired by hippocampal lesions (Beylin et al., 2001). These interactions between ISI and the hippocampaldependency of conditioning are the primary data that motivate the new model. 1 TD Model of Conditioning Our full model of conditioning consists of three separate modules: the stimulus representation, learning algorithm, and response rule. The explanation of hippocampal function relies mostly on the details of the stimulus representation. To illustrate the implications of these representational issues, we have chosen the temporal-difference (TD) learning algorithm from reinforcement learning (Sutton & Barto, 1990, 1998) that has become the sine qua non for modeling reward learning in dopamine neurons (e.g., Ludvig et al., 2008; Schultz, Dayan, & Montague, 1997), and a simple, leaky-integrator response rule described below. We use these for simplicity and consistency with prior work; other learning algorithms and response rules might also yield similar conclusions. 1.1 Stimulus Representation In the model, stimuli are not coherent wholes, but are represented as a series of elements or internal microstimuli. There are two types of elements in the stimulus representation: the first is the presence microstimulus, which is exactly equivalent to the external stimulus (Sutton & Barto, 1990). This microstimulus is available whenever the corresponding stimulus is on (see Fig. 3). The second type of elements are the temporal microstimuli or spectral traces, which are a series of successively later and gradually broadening elements (see Grossberg & Schmajuk, 1989; Machado, 1997; Ludvig et al., 2008). Below, we show how the interaction between these two types of representational elements produces different styles of learning in delay and trace conditioning, resulting in differential sensitivity of these procedures to hippocampal manipulation. The temporal microstimuli are created in the model through coarse coding of a decaying memory trace triggered by stimulus onset. Figure 2 illustrates how this memory trace (left panel) is encoded by a series of basis functions evenly spaced across the height of the trace (middle panel). Each basis function effectively acts as a receptive field for trace height: As the memory trace fades, different basis functions become more or less active, each with a particular temporal profile (right panel). 
These activity profiles for the temporal microstimuli are then used to generate predictions of the US. For the basis functions, we chose simple Gaussians: 1 (y − µ)2 f (y, µ, σ) = √ exp(− ). 2σ 2 2π 2 (1) 0.4 Microstimulus Level Trace Height 1 0.75 + 0.5 0.25 0 0 20 40 60 Time Step 0.3 0.2 0.1 0 Temporal Basis Functions 0 20 40 60 Time Step Figure 2: Creating Microstimuli. The memory traces for a stimulus (left) are coarsely coded by a series of temporal basis functions (middle). The resultant time courses (right) of the temporal microstimuli are used to predict future occurrence of the US. A single basis function (middle) and approximately corresponding microstimulus (right) have been darkened. The inset in the right panel shows the levels of several microstimuli at the time indicated by the dashed line. Given these basis functions, the microstimulus levels xt (i) at time t are determined by the corresponding memory trace height: xt (i) = f (yt , i/m, σ)yt , (2) where f is the basis function defined above and m is the number of temporal microstimuli per stimulus. The trace level yt was set to 1 at stimulus onset and decreased exponentially, controlled by a single decay parameter, which was allowed to vary to simulate the effects of hippocampal lesions. Every stimulus, including the US, was represented by a single memory trace and resultant microstimuli. 1.2 Hippocampal Damage We propose that hippocampal damage results in the selective loss of the long-latency temporal elements of the stimulus representation. This idea is implemented in the model through a decrease in the memory decay constant from .985 to .97, approximately doubling the decay rate of the memory trace that determines the microstimuli. In effect, we assume that hippocampal damage results in a memory trace that decays more quickly, or, equivalently, is more susceptible to interference. Figure 3 shows the effects of this parameter manipulation on the time course of the elements in the stimulus representation. The presence microstimulus is not affected by this manipulation, but the temporal microstimuli are compressed for both the CS and the US. Each microstimulus has a briefer time course, and, as a group, they cover a shorter time span. Other means for eliminating or reducing the long-latency temporal microstimuli are certainly possible and would likely be compatible with our theory. For example, if one assumes that the stimulus representation contains multiple memory traces with different time constants, each with a separate set of microstimuli, then eliminating the slower memory traces would also remove the long-latency elements, and many of the results below hold (simulations not shown). The key point is that hippocampal damage reduces the number and magnitude of long-latency microstimuli. 1.3 Learning and Responding The model approaches conditioning as a reinforcement-learning prediction problem, wherein the agent tries to predict the upcoming rewards or USs. The model learns through linear TD(λ) (Ludvig et al., 2008; Schultz et al., 1997; Sutton, 1988; Sutton & Barto, 1990, 1998). At each time step, the US prediction (Vt ) is determined by: n T Vt (x) = wt x 0 = wt (i)x(i) i=1 3 , 0 (3) Microstimulus Level Normal Hippocampal 0.8 0.8 0.6 0.6 0.4 0.4 0.2 0.2 0 0 0 500 1000 0 500 1000 Time (ms) Figure 3: Hippocampal effects on the stimulus representation. 
The left panel presents the stimulus representation in delay conditioning with the normal parameter settings, and the right panel presents the altered stimulus representation following simulated hippocampal damage. In the hippocampal representation, the temporal microstimuli for both CS (red, solid lines) and US (green, dashed lines) are all briefer and shallower. The presence microstimuli (blue square wave and black spike) are not affected by the hippocampal manipulation. where x is a vector of the activation levels x(i) for the various microstimuli, wt is a corresponding vector of adjustable weights wt (i) at time step t, and n is the total number of all microstimuli. The US prediction is constrained to be non-negative, with negative values rectified to 0. As is standard in TD models, this US prediction is compared to the reward received and the previous US prediction to generate a TD error (δt ): δt = rt + γVt (xt ) − Vt (xt−1 ), (4) where γ is a discount factor that determines the temporal horizon of the US prediction. This TD error is then used to update the weight vector based on the following update rule: wt+1 = wt + αδt et , (5) where α is a step-size parameter and et is a vector of eligibility trace levels (see Sutton & Barto, 1998), which together help determine the speed of learning. Each microstimulus has its own corresponding eligibility trace which continuously decays, but accumulates whenever that microstimulus is present: et+1 = γλet + xt , (6) where γ is the discount factor as above and λ is a decay parameter that determines the plasticity window. These US predictions are translated into responses through a simple, thresholded leakyintegrator response rule: at+1 = υat + Vt+1 (xt ) θ , (7) where υ is a decay constant, and θ is a threshold on the value function V . Our model is defined by Equations 1-7 and 7 additional parameters, which were fixed at the following values for the simulations below: λ = .95, α = .005, γ = .97, n = 50, σ = .08, υ = .93, θ = .25. In the simulated experiments, one time step was interpreted as 10 ms. 4 CR Magnitude ISI250 5 4 3 3 Delay!Normal 2 Delay!HPC Trace!Normal 1 Trace!HPC 0 250 500 50 3 2 1 50 ISI1000 5 4 4 0 ISI500 5 2 1 250 500 0 50 250 500 Trials Figure 4: Learning in the model for trace and delay conditioning with and without hippocampal (HPC) damage. The three panels present training with different interstimulus intervals (ISI). 2 Results We simulated 12 total conditions with the model: trace and delay conditioning, both with and without hippocampal damage, for short (250 ms), medium (500 ms), and long (1000 ms) ISIs. Each simulated experiment was run for 500 trials, with every 5th trial an unreinforced probe trial, during which no US was presented. For delay conditioning, the CS lasted the same duration as the ISI and terminated with US presentation. For trace conditioning, the CS was present for 5 time steps (50 ms). The US always lasted for a single time step, and an inter-trial interval of 5000 ms separated all trials (onset to onset). Conditioned responding (CR magnitude) was measured as the maximum height of the response curve on a given trial. 0.8 CR Magnitude US Prediction Figure 4 summarizes our results. The figure depicts how the CR magnitude changed across the 500 trials of acquisition training. In general, trace conditioning produced lower levels of responding than delay conditioning, but this effect was most pronounced with the longest ISI. The effects of simulated hippocampal damage varied with the ISI. 
With the shortest ISI (250 ms; left panel), there was little effect on responding in either trace or delay conditioning. There was a small deficit early in training with trace conditioning, but this difference disappeared quickly with further training. With the longest ISI (1000 ms; right panel), there was a profound effect on responding in both trace and delay conditioning, with trace conditioning completely eliminated. The intermediate ISI (500 ms; middle panel) produced the most complex and interesting results. With this interval, there was only a minor deficit in delay conditioning, but a substantial drop in trace conditioning, especially early in training. This pattern of results roughly matches the empirical data, capturing the selective deficit in trace conditioning caused by hippocampal lesions (Solomon et al., 1986) as well as the modulation of this deficit by ISI (Beylin et al., 2001; Moyer, Jr. et al., 1990). Delay Trace 0.6 0.4 0.2 0 0 250 500 750 Time (ms) 5 4 3 2 1 0 0 250 500 750 Time (ms) Figure 5: Time course of US prediction and CR magnitude for both trace (red, dashed line) and delay conditioning (blue, solid line) with a 500-ms ISI. 5 These differences in sensitivity to simulated hippocampal damage arose despite similar model performance during normal trace and delay conditioning. Figure 5 shows the time course of the US prediction (left panel) and CR magnitude (right panel) after trace and delay conditioning on a probe trial with a 500-ms ISI. In both instances, the US prediction grew throughout the trial as the usual time of the US became imminent. Note the sharp drop off in US prediction for delay conditioning exactly as the CS terminates. This change reflects the disappearance of the presence microstimulus, which supports much of the responding in delay conditioning (see Fig. 6). In both procedures, even after the usual time of the US (and CS termination in the case of delay conditioning), there was still some residual US prediction. These US predictions were caused by the long-latency microstimuli, which did not disappear exactly at CS offset, and were ordinarily (on non-probe trials) countered by negative weights on the US microstimuli. The CR magnitude tracked the US prediction curve quite closely, peaking around the time the US would have occurred for both trace and delay conditioning. There was little difference in either curve between trace and delay conditioning, yet altering the stimulus representation (see Fig. 3) had a more pronounced effect on trace conditioning. An examination of the weight distribution for trace and delay conditioning explains why hippocampal damage had a more pronounced effect on trace than delay conditioning. Figure 6 depicts some representative microstimuli (left column) as well as their corresponding weights (right columns) following trace or delay conditioning with or without simulated hippocampal damage. For clarity in the figure, we have grouped the weights into four categories: positive (+), large positive (+++), negative (-), and large negative (--). The left column also depicts how the model poses the computational problem faced by an animal during conditioning; the goal is to sum together weighted versions of the available microstimuli to produce the ideal US prediction curve in the bottom row. 
In normal delay conditioning, the model placed a high positive weight on the presence microstimulus, but balanced that with large negative weights on the early CS microstimuli, producing a prediction topography that roughly matched the ideal prediction (see Fig. 5, left panel). In normal trace conditioning, the model only placed a small positive weight on the presence microstimulus, but supplemented that with large positive weights on both the early and late CS microstimuli, also producing a prediction topography that roughly matched the ideal prediction. Weights Normal HPC Lesion Delay CS Presence Stimulus CS Early Microstimuli CS Late Microstimuli US Early Microstimuli Trace Delay Trace +++ + +++ + -- + -- + + +++ N/A N/A - -- - - Ideal Summed Prediction Figure 6: Schematic of the weights (right columns) on various microstimuli following trace and delay conditioning. The left column illustrates four representative microstimuli: the presence microstimulus, an early CS microstimulus, a late CS microstimulus, and a US microstimulus. The ideal prediction is the expectation of the sum of future discounted rewards. 6 Following hippocampal lesions, the late CS microstimuli were no longer available (N/A), and the system could only use the other microstimuli to generate the best possible prediction profile. In delay conditioning, the loss of these long-latency microstimuli had a small effect, notable only with the longest ISI (1000 ms) with these parameter settings. With trace conditioning, the loss of the long-latency microstimuli was catastrophic, as these microstimuli were usually the major basis for the prediction of the upcoming US. As a result, trace conditioning became much more difficult (or impossible in the case of the 1000-ms ISI), even though delay conditioning was less affected. The most notable (and defining) difference between trace and delay conditioning is that the CS and US overlap in delay conditioning, but not trace conditioning. In our model, this overlap is necessary, but not sufficient, for the the unique interaction between the presence microstimulus and temporal microstimuli in delay conditioning. For example, if the CS were extended to stay on beyond the time of US occurrence, this contiguity would be maintained, but negative weights on the early CS microstimuli would not suffice to suppress responding throughout this extended CS. In this case, the best solution to predicting the US for the model might be to put high weights on the long-latency temporal microstimuli (as in trace conditioning; see Fig 6), which would not persist as long as the now extended presence microstimulus. Indeed, with a CS that was three times as long as the ISI, we found that the US prediction, CR magnitude, and underlying weights were completely indistinguishable from trace conditioning (simulations not shown). Thus, the model predicts that this extended delay conditioning should be equally sensitive to hippocampal damage as trace conditioning for the same ISIs. This empirical prediction is a fundamental test of the representational assumptions underlying the model. The particular mechanism that we chose for simulating the loss of the long-latency microstimuli (increasing the decay rate of the memory trace) also leads to a testable model prediction. 
If one were to pre-train an animal with trace conditioning and then perform hippocampal lesions, there should be some loss of responding, but, more importantly, those CRs that do occur should appear earlier in the interval because the temporal microstimuli now follow a shorter time course (see Fig. 3). There is some evidence for additional short-latency CRs during trace conditioning in lesioned animals (e.g., Port et al., 1986; Solomon et al., 1986), but, to our knowledge, this precise model prediction has not been rigorously evaluated.

3 Discussion and Conclusion

We evaluated a novel computational model for the role of the hippocampus in trace conditioning, based on a reinforcement-learning framework. We extended the microstimulus TD model presented by Ludvig et al. (2008) by suggesting a role for the hippocampus in maintaining long-latency elements of the temporal stimulus representation. The current model also introduced an additional element to the stimulus representation (the presence microstimulus) and a simple response rule for translating predictions into actions; we showed how these subtle innovations yield interesting interactions when comparing trace and delay conditioning. In addition, we adduced a pair of testable model predictions about the effects of extended stimuli and post-training lesions.

There are several existing theories for the role of the hippocampus in trace conditioning, including the modulation of timing (Solomon et al., 1986), establishment of contiguity (e.g., Wallenstein et al., 1998), and overcoming of task difficulty (Beylin et al., 2001). Our new model provides a computational mechanism that links these three proposed explanations. In our model, for similar ISIs, delay conditioning requires learning to suppress responding early in the CS, whereas trace conditioning requires learning to create responding later in the trial, near the time of the US (see Fig. 6). As a result, for the same ISI, delay conditioning requires changing the weights on earlier microstimuli than trace conditioning does, though in opposite directions. These early microstimuli reach higher activation levels (see Fig. 2), producing higher eligibility traces, and are therefore learned about more quickly. This differential speed of learning for short-latency temporal microstimuli accords with extensive behavioural data showing that shorter ISIs tend to improve both the speed and asymptote of learning in eyeblink conditioning (e.g., Schneiderman & Gormezano, 1964). Thus, the contiguity between the CS and US in delay conditioning alters the timing problem that the animal faces, effectively making the time interval to be learned shorter and rendering the task easier for most ISIs.

In future work, it will be important to characterize the exact mathematical properties that constrain the temporal microstimuli. Our simple Gaussian basis function approach suffices for the datasets examined here (cf. Ludvig et al., 2008), but other related mathematical functions are certainly possible. For example, replacing the temporal microstimuli in our model with the spectral traces of Grossberg & Schmajuk (1989) produces results that are similar to ours, but using sequences of Gamma-shaped functions tends to fail, with longer intervals learned too slowly relative to shorter intervals. One important characteristic of the microstimulus series seems to be that the heights of individual elements should not decay too quickly.
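The eligibility-trace argument above (more strongly activated microstimuli accrue larger traces and so adapt faster) can be illustrated with a generic linear TD(lambda) step of the kind this model family builds on. The step size, discount factor, and trace-decay value below are arbitrary illustrations, not the parameters of the reported simulations.

```python
import numpy as np

def td_lambda_step(w, z, x_t, x_next, reward, alpha=0.05, gamma=0.98, lam=0.95):
    # One step of linear TD(lambda) with accumulating eligibility traces.
    delta = reward + gamma * np.dot(w, x_next) - np.dot(w, x_t)  # TD error
    z = gamma * lam * z + x_t   # microstimuli with higher activation accrue
                                # larger eligibility, so their weights move faster
    w = w + alpha * delta * z
    return w, z
```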
Another key challenge for future modeling is reconciling this abstract account of hippocampal function in trace conditioning with approaches that consider greater physiological detail (e.g., Rodriguez & Levy, 2001; Yamazaki & Tanaka, 2005). The current model also contributes to our understanding of the TD models of dopamine (e.g., Schultz et al., 1997) and classical conditioning (Sutton & Barto, 1990). These models have often given short shrift to issues of stimulus representation, focusing more closely on the properties of the learning algorithm (but see Ludvig et al., 2008). Here, we reveal how the interaction of various stimulus representations in conjunction with the TD learning rule produces a viable model of some of the differences between trace and delay conditioning.

References

Beylin, A. V., Gandhi, C. C., Wood, G. E., Talk, A. C., Matzel, L. D., & Shors, T. J. (2001). The role of the hippocampus in trace conditioning: Temporal discontinuity or task difficulty? Neurobiology of Learning & Memory, 76, 447-461.
Gould, E., Beylin, A., Tanapat, P., Reeves, A., & Shors, T. J. (1999). Learning enhances adult neurogenesis in the hippocampal formation. Nature Neuroscience, 2, 260-265.
Grossberg, S., & Schmajuk, N. A. (1989). Neural dynamics of adaptive timing and temporal discrimination during associative learning. Neural Networks, 2, 79-102.
Ludvig, E. A., Sutton, R. S., & Kehoe, E. J. (2008). Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation, 20, 3034-3054.
Machado, A. (1997). Learning the temporal dynamics of behavior. Psychological Review, 104, 241-265.
McEchron, M. D., Bouwmeester, H., Tseng, W., Weiss, C., & Disterhoft, J. F. (1998). Hippocampectomy disrupts auditory trace fear conditioning and contextual fear conditioning in the rat. Hippocampus, 8, 638-646.
McEchron, M. D., & Disterhoft, J. F. (1997). Sequence of single neuron changes in CA1 hippocampus of rabbits during acquisition of trace eyeblink conditioned responses. Journal of Neurophysiology, 78, 1030-1044.
Moyer, J. R., Jr., Deyo, R. A., & Disterhoft, J. F. (1990). Hippocampectomy disrupts trace eye-blink conditioning in rabbits. Behavioral Neuroscience, 104, 243-252.
Pavlov, I. P. (1927). Conditioned Reflexes. London: Oxford University Press.
Port, R. L., Romano, A. G., Steinmetz, J. E., Mikhail, A. A., & Patterson, M. M. (1986). Retention and acquisition of classical trace conditioned responses by rabbits with hippocampal lesions. Behavioral Neuroscience, 100, 745-752.
Rodriguez, P., & Levy, W. B. (2001). A model of hippocampal activity in trace conditioning: Where’s the trace? Behavioral Neuroscience, 115, 1224-1238.
Schmajuk, N. A., & DiCarlo, J. J. (1992). Stimulus configuration, classical conditioning, and hippocampal function. Psychological Review, 99, 268-305.
Schneiderman, N., & Gormezano, I. (1964). Conditioning of the nictitating membrane of the rabbit as a function of CS-US interval. Journal of Comparative and Physiological Psychology, 57, 188-195.
Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275, 1593-1599.
Solomon, P. R., Vander Schaaf, E. R., Thompson, R. F., & Weisz, D. J. (1986). Hippocampus and trace conditioning of the rabbit’s classically conditioned nictitating membrane response. Behavioral Neuroscience, 100, 729-744.
Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9-44.
Sutton, R. S., & Barto, A. G. (1990). Time-derivative models of Pavlovian reinforcement. In M. Gabriel & J. Moore (Eds.), Learning and Computational Neuroscience: Foundations of Adaptive Networks (pp. 497-537). Cambridge, MA: MIT Press.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Tseng, W., Guan, R., Disterhoft, J. F., & Weiss, C. (2004). Trace eyeblink conditioning is hippocampally dependent in mice. Hippocampus, 14, 58-65.
Wallenstein, G., Eichenbaum, H., & Hasselmo, M. (1998). The hippocampus as an associator of discontiguous events. Trends in Neuroscience, 21, 317-323.
Yamazaki, T., & Tanaka, S. (2005). A neural network model for trace conditioning. International Journal of Neural Systems, 15, 23-30.
6 0.60055238 187 nips-2008-Psychiatry: Insights into depression through normative decision-making models
7 0.58010149 67 nips-2008-Effects of Stimulus Type and of Error-Correcting Code Design on BCI Speller Performance
8 0.57371503 100 nips-2008-How memory biases affect information transmission: A rational analysis of serial reproduction
9 0.55795902 222 nips-2008-Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
10 0.54098153 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds
11 0.53756642 206 nips-2008-Sequential effects: Superstition or rational behavior?
12 0.49175271 46 nips-2008-Characterizing response behavior in multisensory perception with conflicting cues
13 0.48732257 109 nips-2008-Interpreting the neural code with Formal Concept Analysis
14 0.47997797 11 nips-2008-A spatially varying two-sample recombinant coalescent, with applications to HIV escape response
15 0.43858564 3 nips-2008-A Massively Parallel Digital Learning Processor
16 0.42595851 240 nips-2008-Tracking Changing Stimuli in Continuous Attractor Neural Networks
17 0.40389037 94 nips-2008-Goal-directed decision making in prefrontal cortex: a computational framework
18 0.3966668 244 nips-2008-Unifying the Sensory and Motor Components of Sensorimotor Adaptation
19 0.38866711 101 nips-2008-Human Active Learning
20 0.36966431 90 nips-2008-Gaussian-process factor analysis for low-dimensional single-trial analysis of neural population activity
topicId topicWeight
[(6, 0.049), (7, 0.088), (12, 0.038), (15, 0.015), (18, 0.012), (28, 0.192), (57, 0.052), (59, 0.038), (63, 0.019), (71, 0.02), (74, 0.324), (77, 0.024), (78, 0.017), (83, 0.034)]
simIndex simValue paperId paperTitle
same-paper 1 0.80271363 172 nips-2008-Optimal Response Initiation: Why Recent Experience Matters
Author: Matt Jones, Sachiko Kinoshita, Michael C. Mozer
Abstract: In most cognitive and motor tasks, speed-accuracy tradeoffs are observed: Individuals can respond slowly and accurately, or quickly yet be prone to errors. Control mechanisms governing the initiation of behavioral responses are sensitive not only to task instructions and the stimulus being processed, but also to the recent stimulus history. When stimuli can be characterized on an easy-hard dimension (e.g., word frequency in a naming task), items preceded by easy trials are responded to more quickly, and with more errors, than items preceded by hard trials. We propose a rationally motivated mathematical model of this sequential adaptation of control, based on a diffusion model of the decision process in which difficulty corresponds to the drift rate for the correct response. The model assumes that responding is based on the posterior distribution over which response is correct, conditioned on the accumulated evidence. We derive this posterior as a function of the drift rate, and show that higher estimates of the drift rate lead to (normatively) faster responding. Trial-by-trial tracking of difficulty thus leads to sequential effects in speed and accuracy. Simulations show the model explains a variety of phenomena in human speeded decision making. We argue this passive statistical mechanism provides a more elegant and parsimonious account than extant theories based on elaborate control structures. 1
2 0.74666679 248 nips-2008-Using matrices to model symbolic relationship
Author: Ilya Sutskever, Geoffrey E. Hinton
Abstract: We describe a way of learning matrix representations of objects and relationships. The goal of learning is to allow multiplication of matrices to represent symbolic relationships between objects and symbolic relationships between relationships, which is the main novelty of the method. We demonstrate that this leads to excellent generalization in two different domains: modular arithmetic and family relationships. We show that the same system can learn first-order propositions such as (2, 5) ∈ +3 or (Christopher, Penelope) ∈ has wife, and higher-order propositions such as (3, +3) ∈ plus and (+3, −3) ∈ inverse or (has husband, has wife) ∈ higher oppsex. We further demonstrate that the system understands how higher-order propositions are related to first-order ones by showing that it can correctly answer questions about first-order propositions involving the relations +3 or has wife even though it has not been trained on any first-order examples involving these relations. 1
3 0.73789358 114 nips-2008-Large Margin Taxonomy Embedding for Document Categorization
Author: Kilian Q. Weinberger, Olivier Chapelle
Abstract: Applications of multi-class classification, such as document categorization, often appear in cost-sensitive settings. Recent work has significantly improved the state of the art by moving beyond “flat” classification through incorporation of class hierarchies [4]. We present a novel algorithm that goes beyond hierarchical classification and estimates the latent semantic space that underlies the class hierarchy. In this space, each class is represented by a prototype and classification is done with the simple nearest neighbor rule. The optimization of the semantic space incorporates large margin constraints that ensure that for each instance the correct class prototype is closer than any other. We show that our optimization is convex and can be solved efficiently for large data sets. Experiments on the OHSUMED medical journal data base yield state-of-the-art results on topic categorization. 1
4 0.58660716 197 nips-2008-Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation
Author: Indraneel Mukherjee, David M. Blei
Abstract: Hierarchical probabilistic modeling of discrete data has emerged as a powerful tool for text analysis. Posterior inference in such models is intractable, and practitioners rely on approximate posterior inference methods such as variational inference or Gibbs sampling. There has been much research in designing better approximations, but there is yet little theoretical understanding of which of the available techniques are appropriate, and in which data analysis settings. In this paper we provide the beginnings of such understanding. We analyze the improvement that the recently proposed collapsed variational inference (CVB) provides over mean field variational inference (VB) in latent Dirichlet allocation. We prove that the difference in the tightness of the bound on the likelihood of a document decreases as O(k − 1) + log m/m, where k is the number of topics in the model and m is the number of words in a document. As a consequence, the advantage of CVB over VB is lost for long documents but increases with the number of topics. We demonstrate empirically that the theory holds, using simulated text data and two text corpora. We provide practical guidelines for choosing an approximation. 1
5 0.57252574 4 nips-2008-A Scalable Hierarchical Distributed Language Model
Author: Andriy Mnih, Geoffrey E. Hinton
Abstract: Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models. The main drawback of NPLMs is their extremely long training and testing times. Morin and Bengio have proposed a hierarchical language model built around a binary tree of words, which was two orders of magnitude faster than the nonhierarchical model it was based on. However, it performed considerably worse than its non-hierarchical counterpart in spite of using a word tree created using expert knowledge. We introduce a fast hierarchical language model along with a simple feature-based algorithm for automatic construction of word trees from the data. We then show that the resulting models can outperform non-hierarchical neural models as well as the best n-gram models. 1
7 0.5705297 218 nips-2008-Spectral Clustering with Perturbed Data
8 0.57001746 231 nips-2008-Temporal Dynamics of Cognitive Control
9 0.56801796 106 nips-2008-Inferring rankings under constrained sensing
10 0.56791449 129 nips-2008-MAS: a multiplicative approximation scheme for probabilistic inference
11 0.56734818 71 nips-2008-Efficient Sampling for Gaussian Process Inference using Control Variables
12 0.56728685 118 nips-2008-Learning Transformational Invariants from Natural Movies
13 0.56655699 101 nips-2008-Human Active Learning
14 0.56613046 138 nips-2008-Modeling human function learning with Gaussian processes
15 0.56522352 179 nips-2008-Phase transitions for high-dimensional joint support recovery
16 0.56483239 34 nips-2008-Bayesian Network Score Approximation using a Metagraph Kernel
17 0.56466597 211 nips-2008-Simple Local Models for Complex Dynamical Systems
18 0.56451017 79 nips-2008-Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
19 0.56442142 60 nips-2008-Designing neurophysiology experiments to optimally constrain receptive field models along parametric submanifolds
20 0.56415761 247 nips-2008-Using Bayesian Dynamical Systems for Motion Template Libraries