nips nips2001 nips2001-123 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Aaron C. Courville, David S. Touretzky
Abstract: The Temporal Coding Hypothesis of Miller and colleagues [7] suggests that animals integrate related temporal patterns of stimuli into single memory representations. We formalize this concept using quasi-Bayes estimation to update the parameters of a constrained hidden Markov model. This approach allows us to account for some surprising temporal effects in the second order conditioning experiments of Miller et al. [1, 2, 3], which other models are unable to explain.
Reference: text
sentIndex sentText sentNum sentScore
1 The Temporal Coding Hypothesis of Miller and colleagues [7] suggests that animals integrate related temporal patterns of stimuli into single memory representations. [sent-5, score-0.701]
2 We formalize this concept using quasi-Bayes estimation to update the parameters of a constrained hidden Markov model. [sent-6, score-0.148]
3 This approach allows us to account for some surprising temporal effects in the second order conditioning experiments of Miller et al. [sent-7, score-0.479]
4 The well-known phenomena of latent learning and sensory preconditioning indicate that animals learn about stimuli in their environment before any reinforcement is supplied. [sent-10, score-0.481]
5 Miller and colleagues have demonstrated that in classical conditioning paradigms, animals appear to learn the temporal structure of the stimuli [8]. [sent-13, score-0.84]
6 We then present a model of conditioning based on a constrained hidden Markov model, using quasi-Bayes estimation to adjust the model parameters online. [sent-15, score-0.445]
7 Simulation results confirm that the model reproduces the experimental observations, suggesting that this approach is a viable alternative to earlier models of classical conditioning which cannot account for the Miller et al. [sent-16, score-0.385]
8 Responding to a conditioned stimulus (CS) is impaired when it is presented simultaneously with the unconditioned stimulus (US) rather than preceding the US. [sent-21, score-0.611]
9 The failure of the simultaneous conditioning procedure to demonstrate a conditioned response (CR) is a well established result in the classical conditioning literature [9]. [sent-22, score-0.686]
10 The plus sign (+) indicates simultaneous presentation of stimuli; the short arrow (→) indicates one stimulus immediately following another; and the long arrow (⟶) indicates a 5 sec gap between stimulus offset and the following stimulus onset. [sent-30, score-1.126]
11 1, the tone T, click train C, and footshock US were all of 5 sec duration. [sent-32, score-0.634]
12 2, the tone and click train durations were 5 sec and the footshock US lasted 0. [sent-34, score-0.689]
13 3, the light L, buzzer B, and auditory stimulus X (either a tone or white noise) were all of 30 sec duration, while the footshock US lasted 1 sec. [sent-37, score-0.945]
14 While a tone CS presented simultaneously with a footshock results in a minimal CR to the tone, a click train preceding the tone (in phase 2) does acquire associative strength, as indicated by a CR. [sent-40, score-1.039]
15 [2] exposed rats to a tone T immediately followed by a click train C. [sent-44, score-0.502]
16 In a second phase, the tone was paired with a footshock US that either immediately followed tone offset (variant A), or occurred 5 sec after tone offset (variant B). [sent-45, score-1.27]
17 They found that when C and US both immediately follow T , little conditioned response is elicited by the presentation of C. [sent-46, score-0.185]
18 However, when the US occurs 5 sec after tone offset, so that it occurs later than C (measured relative to T), then C does come to elicit a CR. [sent-47, score-0.401]
19 [3], rats were presented with a flashing light L followed by a footshock US, followed by an auditory stimulus X (either a tone or white noise). [sent-51, score-0.87]
20 Testing revealed that while X did not elicit a CR (in fact, it became a conditioned inhibitor), X did impart an excitatory association to B. [sent-53, score-0.172]
21 2 Existing Models of Classical Conditioning The Rescorla-Wagner model [11] is still the best-known model of classical conditioning, but as a trial-level model, it cannot account for within-trial effects such as second order conditioning or sensitivity to stimulus timing. [sent-54, score-0.686]
22 And using a memory buffer representation (what Sutton and Barto call a complete serial compound), TD can represent the temporal structure of a trial. [sent-58, score-0.344]
23 However, TD cannot account for the empirical data in Experiments 1-3 because it does not make inferences about temporal relationships among stimuli; it focuses solely on predicting the US. [sent-59, score-0.205]
24 In Experiment 1, some versions of TD can account for the reduced associative strength of a CS when its onset occurs simultaneously with the US, but no version of TD can explain why the second-order stimulus C should acquire greater associative strength than T. [sent-60, score-0.588]
25 In Experiment 3, TD fails to predict the results because X is not predictive of the US; thus X acquires no associative strength to pass on to B in the second phase. [sent-63, score-0.148]
26 Even models that predict future stimuli have trouble accounting for Miller et al. [sent-64, score-0.242]
27 Dayan's "successor representation" [4], the world model of Sutton and Pinette [15], and the basal ganglia model of Suri and Schultz [13] all attempt to predict future stimulus vectors. [sent-66, score-0.341]
28 Temporal Coding Hypothesis The temporal coding hypothesis (TCH) [7] posits that temporal contiguity is sufficient to produce an association between stimuli. [sent-69, score-0.478]
29 Instead, information about the temporal relationships among stimuli is encoded implicitly and automatically in the memory representation of the trial. [sent-72, score-0.515]
30 Most importantly, TCH claims that memory representations of trials with similar stimuli become integrated in such a way as to preserve the relative temporal information [3]. [sent-73, score-0.556]
31 If we apply the concept of memory integration to Experiment 1, we get the memory representation C → T + US. [sent-74, score-0.234]
32 Integrating the hypothesized memory representations of the two phases of Experiment 2 results in: A) T → C + US and B) T → C → US. [sent-76, score-0.302]
33 The stimulus C is only predictive of the US in variant B, consistent with the experimental findings. [sent-77, score-0.261]
34 For Experiment 3, an integrated memory representation of the two phases produces L + B → US → X. [sent-78, score-0.261]
35 Thus, the temporal coding hypothesis is able to account for the results of each of the three experiments by associating stimuli with a timeline. [sent-80, score-0.515]
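The integration step described above can be illustrated with a toy sketch. It is not the authors' model (which is the hidden Markov model introduced next); it only shows how two trial memories that share a stimulus might be aligned on a common timeline. The function name `integrate_trials`, the dictionary representation, and the use of onset times in seconds are all assumptions made for illustration.

```python
def integrate_trials(trial_a, trial_b):
    """Merge two trial memories, each a dict of stimulus name -> onset time (sec).

    Hypothetical helper: trial_b is shifted so that a stimulus shared with
    trial_a has the same onset in both, then the two are combined.
    """
    shared = sorted(set(trial_a) & set(trial_b))
    if not shared:
        raise ValueError("trials share no stimulus to align on")
    anchor = shared[0]
    shift = trial_a[anchor] - trial_b[anchor]
    merged = dict(trial_a)
    for name, onset in trial_b.items():
        # Keep trial_a's onset for shared stimuli; shift the rest.
        merged.setdefault(name, onset + shift)
    return merged


# Experiment 2, assuming 5 sec tone and click train (as stated in the text).
phase1 = {"T": 0, "C": 5}            # tone, then click train immediately after
variant_a = {"T": 0, "US": 5}        # US immediately after tone offset
variant_b = {"T": 0, "US": 10}       # US 5 sec after tone offset

merged_a = integrate_trials(phase1, variant_a)
merged_b = integrate_trials(phase1, variant_b)
```

In this sketch, variant A places C and the US at the same onset, while variant B places C before the US, which is the ordering under which C is predictive of the US.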
36 3 A Computational Model of Temporal Coding A straightforward formalization of a timeline is a Markov chain of states. [sent-81, score-0.167]
37 For this initial version of our model, state transitions within the chain are fixed and deterministic. [sent-82, score-0.353]
38 Each state represents one instant of time, and at each timestep a transition is made to the next state in the chain. [sent-83, score-0.38]
39 Multiple timelines (or Markov chains) emanate from a single holding state. [sent-85, score-0.327]
40 The transitions out of this holding state are the only probabilistic and adaptive transitions in the simplified model. [sent-86, score-0.451]
41 These transition probabilities represent the frequency with which the timelines are experienced. [sent-87, score-0.277]
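A minimal sketch of this state structure, assuming one holding state (index 0) and a set of equal-length chains. The holding-state self-loop and the return from the end of each chain to the holding state are assumptions of this sketch; the function name and parameters are hypothetical.

```python
def build_timeline_transitions(num_chains, chain_len, entry_probs):
    """Return a row-stochastic transition matrix as a list of lists.

    State 0 is the holding state. Chain c occupies states
    1 + c * chain_len through c * chain_len + chain_len.
    Only the transitions out of state 0 are probabilistic.
    """
    n = 1 + num_chains * chain_len
    A = [[0.0] * n for _ in range(n)]
    # Assumption: leftover probability mass keeps the model in the holding state.
    A[0][0] = 1.0 - sum(entry_probs)
    for c in range(num_chains):
        head = 1 + c * chain_len
        A[0][head] = entry_probs[c]
        # Fixed, deterministic transitions along the chain.
        for step in range(chain_len - 1):
            A[head + step][head + step + 1] = 1.0
        # Assumption: the last state of a chain returns to the holding state.
        A[head + chain_len - 1][0] = 1.0
    return A


A = build_timeline_transitions(num_chains=2, chain_len=3, entry_probs=[0.4, 0.4])
```

Learning in such a structure would only ever adjust the first row of `A`, since every other row is fixed.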
42 Our goal is to show that our model successfully integrates the timelines of the two training phases of each experiment. [sent-89, score-0.403]
43 In the context of a collection of Markov chains, integrating timelines amounts to both phases of training becoming associated with a single Markov chain. [sent-90, score-0.404]
44 Figure 1: A depiction of the state and observation structure of the model. [sent-93, score-0.218]
45 Shown are two timelines, one headed by state j and the other headed by state k. [sent-94, score-0.432]
46 State i, the holding state, transitions to states j and k with probabilities aij and aik respectively. [sent-95, score-0.335]
47 Below the timeline representations are a sequence of observations represented here as the symbols T, C and US. [sent-96, score-0.233]
48 The T and C stimuli appear for two time steps each to simulate their presentation for an extended duration in the experiment. [sent-97, score-0.293]
49 During the second phase of the experiments, the second Markov chain (shown in Figure 1 starting with state k) offers an alternative to the chain associated with the first phase of learning. [sent-98, score-0.555]
50 As suggested in Figure 1, associated with each state is a stimulus observation. [sent-100, score-0.422]
51 "Stimulus space" is an n-dimensional continuous space, where n is the number of distinct stimuli that can be observed (tone, light, shock, etc. [sent-101, score-0.242]
52 ) Each state has an expectation concerning the stimuli that should be observed when that state is occupied. [sent-102, score-0.564]
53 The probability density at stimulus observation $x_t$ in state i at time t is , where $w_i$ is a mixture coefficient for the two Gaussians associated with state i. [sent-104, score-0.751]
54 The Gaussian means $\mu_{i0}$ and $\mu_{i1}$ and variances $\sigma^2_{i0}$ and $\sigma^2_{i1}$ are vectors of the same dimension as the stimulus vector $x_t$. [sent-105, score-0.371]
55 Given knowledge of the state, the stimulus components are assumed to be mutually independent (covariance terms are zero). [sent-106, score-0.261]
56 We chose a continuous model of observations over a discrete observation model to capture stimulus generalization effects. [sent-107, score-0.508]
57 For each state, the first Gaussian pdf is non-adaptive, meaning $\mu_{i0}$ is fixed about a point in stimulus space representing the absence of stimuli. [sent-109, score-0.299]
58 This mixture of one fixed and one adaptive Gaussian is an approximation to the animal's belief distribution about stimuli, reflecting the observed tolerance animals have to absent expected stimuli. [sent-112, score-0.249]
59 Put another way, animals seem to be less surprised by the absence of an expected stimulus than by the presence of an unexpected stimulus. [sent-113, score-0.369]
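The observation model described above can be sketched as a two-component mixture of diagonal-covariance Gaussians. Which component the mixture coefficient multiplies is an assumption here (the weight `w` is placed on the adaptive Gaussian), and all function and parameter names are hypothetical.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with independent components (zero covariances)."""
    p = 1.0
    for xd, md, vd in zip(x, mean, var):
        p *= math.exp(-0.5 * (xd - md) ** 2 / vd) / math.sqrt(2.0 * math.pi * vd)
    return p


def observation_density(x, w, mean0, var0, mean1, var1):
    """Mixture of a fixed 'no stimulus' Gaussian (0) and an adaptive Gaussian (1).

    Assumption: w weights the adaptive component, 1 - w the fixed one.
    """
    return ((1.0 - w) * diag_gaussian_pdf(x, mean0, var0)
            + w * diag_gaussian_pdf(x, mean1, var1))


# A state that expects a tone (dimension 0) but observes nothing.
absent = [0.0, 0.0]
with_fixed = observation_density(absent, 0.7, [0.0, 0.0], [0.1, 0.1],
                                 [1.0, 0.0], [0.1, 0.1])
adaptive_only = diag_gaussian_pdf(absent, [1.0, 0.0], [0.1, 0.1])
```

With the fixed component in the mixture, the absence of an expected stimulus is assigned far more density than under the adaptive Gaussian alone, which is one way to read the tolerance described in the text.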
60 We assume that knowledge of the current state st is inaccessible to the learner. [sent-114, score-0.217]
61 In the case of a Markov chain, learning with hidden state is exactly the problem of parameter estimation in hidden Markov models. [sent-116, score-0.377]
62 In a model of classical conditioning, this is an unrealistic assumption about animals' memory capabilities. [sent-119, score-0.236]
63 We therefore require an online learning scheme for the hidden Markov model, with only limited memory requirements. [sent-120, score-0.225]
64 It offers the appealing property of combining prior beliefs about the world with current observations through the recursive application of Bayes' theorem, $p(\lambda \mid X_t) \propto p(x_t \mid X_{t-1}, \lambda)\, p(\lambda \mid X_{t-1})$. [sent-122, score-0.177]
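The recursion can be illustrated on a finite set of candidate parameter settings, where the posterior after each observation becomes the prior for the next. This is only a sketch of recursive Bayes in the fully observed case, not the quasi-Bayes approximation itself; the function name and the example numbers are hypothetical.

```python
def recursive_bayes_step(prior, likelihoods):
    """One update: posterior is proportional to likelihood times prior.

    prior[h] is the current belief in hypothesis h; likelihoods[h] is the
    probability of the newest observation under hypothesis h.
    """
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


belief = [0.5, 0.5]
# Two observations, each four times more likely under hypothesis 0.
for _ in range(2):
    belief = recursive_bayes_step(belief, [0.8, 0.2])
```

No past observations need to be stored: the current belief summarizes everything seen so far.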
65 Unfortunately, the implementation of exact recursive Bayesian inference for a continuous density hidden Markov model (CDHMM) is computationally intractable. [sent-131, score-0.215]
66 This is a consequence of there being missing data in the form of hidden state. [sent-132, score-0.154]
67 With hidden state, the posterior distribution over the model parameters, after the observation, is given by $p(\lambda \mid X_t) \propto \sum_{i=1}^{N} p(x_t \mid s_t = i, X_{t-1}, \lambda)\, p(s_t = i \mid X_{t-1}, \lambda)\, p(\lambda \mid X_{t-1})$, (2) where we have summed over the N hidden states. [sent-133, score-0.256]
68 where missing data such as the state sequence is taken to be known). [sent-139, score-0.207]
69 Estimating the missing data (hidden state) involves estimating transition probabilities between states, $\xi^{\tau}_{ij} = \Pr(s_\tau = i, s_{\tau+1} = j \mid X_t, \lambda)$, and joint state and mixture component label probabilities $\zeta^{\tau}_{ik} = \Pr(s_\tau = i, z_\tau = k \mid X_t, \lambda)$. [sent-144, score-0.324]
70 Here $z_\tau = k$ is the mixture component label indicating which Gaussian, $k \in \{0, 1\}$, is the source of the stimulus observation at time $\tau$. [sent-145, score-0.377]
71 The forward pass computes the joint probability over state occupancy (taken to be both the state value and the mixture component label) at time $\tau$ and the sequence of observations up to time $\tau$. [sent-148, score-0.617]
72 The backward pass computes the probability of the observations in a memory buffer from time $\tau$ to the present time t given the state occupancy at time $\tau$. [sent-149, score-0.686]
73 The forward and backward passes over state/observation sequences are combined to give an estimate of the state occupancy at time $\tau$ given the observations up to the present time t. [sent-150, score-0.459]
74 In the simulations reported here the memory buffer was 7 time steps long ($t - \tau = 6$). [sent-151, score-0.188]
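A minimal sketch of combining a forward estimate with a backward pass over a short buffer, for a discrete-state model. For brevity it tracks only the state value and omits the mixture component label, and it takes per-state observation likelihoods as given; the function names and data layout are assumptions for illustration.

```python
def forward_step(alpha_prev, A, lik):
    """Advance the (normalized) forward estimate by one observation."""
    n = len(alpha_prev)
    alpha = [lik[j] * sum(alpha_prev[i] * A[i][j] for i in range(n))
             for j in range(n)]
    total = sum(alpha)
    return [a / total for a in alpha]


def fixed_lag_posterior(alpha_lagged, A, lik_window):
    """State posterior at the lagged time, given observations up to the present.

    alpha_lagged[i] is proportional to the forward probability of state i at
    the lagged time. lik_window[k][j] is the likelihood of the k-th later
    observation under state j, oldest first.
    """
    n = len(alpha_lagged)
    beta = [1.0] * n
    # Backward pass: start from the most recent observation in the buffer.
    for lik in reversed(lik_window):
        beta = [sum(A[i][j] * lik[j] * beta[j] for j in range(n))
                for i in range(n)]
    gamma = [a * b for a, b in zip(alpha_lagged, beta)]
    total = sum(gamma)
    return [g / total for g in gamma]


# Two states that alternate deterministically.
A_demo = [[0.0, 1.0], [1.0, 0.0]]
# One later observation that strongly favours state 0 at the next step.
smoothed = fixed_lag_posterior([0.5, 0.5], A_demo, [[0.9, 0.1]])
```

In the demo, evidence that the next state is 0 raises the belief that the lagged state was 1, since state 1 always transitions to state 0. A buffer of 7 time steps would correspond to a `lik_window` of 6 later observations.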
75 We use the estimates from the forward-backward algorithm together with the observations to update the hyperparameters. [sent-152, score-0.15]
76 For example, "'ij is the number of expected transitions observed from state i to state j, and is used to update the estimate of parameter aij. [sent-154, score-0.431]
77 The hyperparameter $\nu_{ik}$ estimates the number of stimulus observations in state i credited to Gaussian k, and is used to update the mixture parameter $w_i$. [sent-155, score-0.631]
78 The remaining three hyperparameters serve to define the pdfs over $\mu_{i1}$ and $\sigma^2_{i1}$. The variable d in the equations below indexes over stimulus dimensions. [sent-156, score-0.402]
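As a sketch of how count-like hyperparameters could yield a mixture weight, the snippet below accumulates soft (expected) counts and takes the mode of a Beta distribution over the weight. Treating the weight's prior as a Beta distribution, and reading the weight off as its mode, are assumptions of this sketch rather than a statement of the paper's exact update; the function names are hypothetical.

```python
def accumulate_counts(nu, soft_counts):
    """Add expected counts for Gaussians 0 and 1 to the hyperparameters."""
    return [n + c for n, c in zip(nu, soft_counts)]


def mixture_weight(nu0, nu1):
    """Mode of a Beta(nu1, nu0) distribution over the weight on Gaussian 1.

    The mode formula (nu1 - 1) / (nu0 + nu1 - 2) requires both counts > 1.
    """
    if nu0 <= 1.0 or nu1 <= 1.0:
        raise ValueError("mode formula requires both hyperparameters > 1")
    return (nu1 - 1.0) / (nu0 + nu1 - 2.0)


# One observation credited 75% to the adaptive Gaussian, 25% to the fixed one.
nu = accumulate_counts([2.0, 2.0], [0.25, 0.75])
w = mixture_weight(nu[0], nu[1])
```

Each observation shifts the weight only slightly, so the estimate reflects accumulated experience rather than the most recent trial alone.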
79 $w_i = \frac{\nu_{i1} - 1}{\nu_{i0} + \nu_{i1} - 2}$ 4 Results and Discussion The model contained two timelines (Markov chains). [sent-164, score-0.259]
80 Let i denote the holding state and j, k the initial states of the two chains. [sent-165, score-0.308]
81 The transition probabilities were initialized as aij = aik = 0. [sent-166, score-0.177]
82 The model was run continuously through both phases of the experiments with a random intertrial interval. [sent-174, score-0.184]
83 Figure 2: Results from 20 runs of the model simulation with each experimental paradigm. Panels: Experiment 1 (T, C); Experiment 2 ((A) C, (B) C); Experiment 3 (X, B). [sent-178, score-0.286]
84 The CR predictions are the result of the model integrating the two phases of learning into one timeline. [sent-183, score-0.225]
85 At the time of the presentation of the Phase 2 stimuli, the states forming the timeline describing the Phase 1 pattern of stimuli were judged more likely to have produced the Phase 2 stimuli than states in the other timeline, which served as a null hypothesis. [sent-184, score-0.695]
86 In another experiment, not shown here, we trained the model on disjoint stimuli in the two phases. [sent-185, score-0.282]
87 We have shown that under the assumption that observation probabilities are modeled by a mixture of Gaussians, and a very restrictive state transition structure, a hidden Markov model can integrate the memory representations of similar temporal stimulus patterns. [sent-187, score-1.098]
88 We propose this model as a mechanism for the integration of memory representations postulated in the Temporal Coding Hypothesis. [sent-189, score-0.198]
89 The current version assumes that event chains are long enough to represent an entire trial, but short enough that the model will return to the holding state before the start of the next trial. [sent-191, score-0.384]
90 We are also exploring a generalization of the model to the semi-Markov domain, where state occupancy duration is modeled explicitly as a pdf. [sent-193, score-0.288]
91 Finally, we are experimenting with mechanisms that allow new chains to be split off from old ones when the model determines that current stimuli differ consistently from the closest matching timeline. [sent-195, score-0.433]
92 Fitting stimuli into existing timelines serves to maximize the likelihood of current observations in light of past experience. [sent-196, score-0.456]
93 But why should animals learn the temporal structure of stimuli as timelines? [sent-197, score-0.506]
94 A collection of timelines may be a reasonable model of the natural world. [sent-198, score-0.259]
95 If this is true, then learning with such a strong inductive bias may help the animal to bring experience of related phenomena to bear in novel situations, a desirable characteristic for an adaptive system in a changing world. [sent-199, score-0.15]
96 Simultaneous conditioning demonstrated in second-order conditioning: Evidence for similar associative structure in forward and simultaneous conditioning. [sent-210, score-0.371]
97 Conditioned excitation and conditioned inhibition acquired through backward conditioning. [sent-226, score-0.19]
98 Improving generalization for temporal difference learning: the successor representation. [sent-230, score-0.156]
99 On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate. [sent-236, score-0.259]
100 Information and the expression of simultaneous and backward associations: Implications for contiguity theory. [sent-251, score-0.24]
wordName wordTfidf (topN-words)
[('tone', 0.262), ('stimulus', 0.261), ('stimuli', 0.242), ('timelines', 0.219), ('conditioning', 0.217), ('cr', 0.178), ('footshock', 0.164), ('state', 0.161), ('temporal', 0.156), ('phases', 0.144), ('miller', 0.123), ('memory', 0.117), ('td', 0.116), ('phase', 0.112), ('observations', 0.11), ('click', 0.109), ('holding', 0.108), ('hidden', 0.108), ('animals', 0.108), ('markov', 0.106), ('backward', 0.101), ('sec', 0.099), ('cole', 0.095), ('conditioned', 0.089), ('pdfs', 0.087), ('occupancy', 0.087), ('chain', 0.085), ('simultaneous', 0.084), ('barnet', 0.082), ('imeline', 0.082), ('nocr', 0.082), ('suri', 0.082), ('timeline', 0.082), ('trr', 0.082), ('classical', 0.079), ('experiment', 0.076), ('chains', 0.075), ('aij', 0.072), ('buffer', 0.071), ('cdhmm', 0.071), ('associative', 0.07), ('transitions', 0.069), ('coding', 0.068), ('recursive', 0.067), ('oj', 0.064), ('offset', 0.064), ('sutton', 0.062), ('acquire', 0.06), ('animal', 0.06), ('mixture', 0.059), ('transition', 0.058), ('us', 0.057), ('observation', 0.057), ('st', 0.056), ('cs', 0.055), ('afl', 0.055), ('aixt', 0.055), ('alxt', 0.055), ('buzzer', 0.055), ('contiguity', 0.055), ('courville', 0.055), ('forgetting', 0.055), ('headed', 0.055), ('huo', 0.055), ('imelines', 0.055), ('lasted', 0.055), ('mil', 0.055), ('tch', 0.055), ('ufl', 0.055), ('ufo', 0.055), ('hyperparameters', 0.054), ('xt', 0.052), ('presentation', 0.051), ('ik', 0.05), ('light', 0.049), ('account', 0.049), ('followed', 0.048), ('aik', 0.047), ('aaron', 0.047), ('missing', 0.046), ('reinforcement', 0.046), ('phenomena', 0.046), ('immediately', 0.045), ('adaptive', 0.044), ('schultz', 0.043), ('association', 0.043), ('representations', 0.041), ('integrating', 0.041), ('elicit', 0.04), ('pavlovian', 0.04), ('model', 0.04), ('update', 0.04), ('integrate', 0.04), ('states', 0.039), ('sensory', 0.039), ('pass', 0.039), ('strength', 0.039), ('fixed', 0.038), ('rats', 0.038), ('colleagues', 0.038)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999958 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
2 0.14646392 126 nips-2001-Motivated Reinforcement Learning
Author: Peter Dayan
Abstract: The standard reinforcement learning view of the involvement of neuromodulatory systems in instrumental conditioning includes a rather straightforward conception of motivation as prediction of sum future reward. Competition between actions is based on the motivating characteristics of their consequent states in this sense. Substantial, careful, experiments reviewed in Dickinson & Balleine, 12,13 into the neurobiology and psychology of motivation show that this view is incomplete. In many cases, animals are faced with the choice not between many different actions at a given state, but rather whether a single response is worth executing at all. Evidence suggests that the motivational process underlying this choice has different psychological and neural properties from that underlying action choice. We describe and model these motivational systems, and consider the way they interact.
3 0.14480118 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments
Author: Jonathan L. Shapiro, J. Wearden
Abstract: Animal data on delayed-reward conditioning experiments shows a striking property - the data for different time intervals collapses into a single curve when the data is scaled by the time interval. This is called the scalar property of interval timing. Here a simple model of a neural clock is presented and shown to give rise to the scalar property. The model is an accumulator consisting of noisy, linear spiking neurons. It is analytically tractable and contains only three parameters. When coupled with reinforcement learning it simulates peak procedure experiments, producing both the scalar property and the pattern of single trial covariances.
4 0.14145924 183 nips-2001-The Infinite Hidden Markov Model
Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite— consider, for example, symbols being possible words appearing in English text.
5 0.14097348 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e. the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system.
6 0.13413629 174 nips-2001-Spike timing and the coding of naturalistic sounds in a central auditory area of songbirds
7 0.12958777 43 nips-2001-Bayesian time series classification
8 0.12387697 48 nips-2001-Characterizing Neural Gain Control using Spike-triggered Covariance
9 0.1109118 87 nips-2001-Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway
10 0.10749594 3 nips-2001-ACh, Uncertainty, and Cortical Inference
11 0.10572093 115 nips-2001-Linear-time inference in Hierarchical HMMs
12 0.10095804 152 nips-2001-Prodding the ROC Curve: Constrained Optimization of Classifier Performance
13 0.095666513 151 nips-2001-Probabilistic principles in unsupervised learning of visual structure: human data and a model
14 0.08787591 145 nips-2001-Perceptual Metamers in Stereoscopic Vision
15 0.087418318 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
16 0.086396277 131 nips-2001-Neural Implementation of Bayesian Inference in Population Codes
17 0.08524248 18 nips-2001-A Rational Analysis of Cognitive Control in a Speeded Discrimination Task
18 0.084645718 11 nips-2001-A Maximum-Likelihood Approach to Modeling Multisensory Enhancement
19 0.08218877 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
20 0.079825744 148 nips-2001-Predictive Representations of State
topicId topicWeight
[(0, -0.237), (1, -0.19), (2, 0.014), (3, -0.044), (4, -0.155), (5, -0.013), (6, 0.079), (7, 0.016), (8, -0.038), (9, -0.027), (10, -0.036), (11, 0.156), (12, -0.177), (13, 0.068), (14, -0.013), (15, -0.174), (16, 0.096), (17, 0.136), (18, 0.242), (19, 0.068), (20, -0.08), (21, -0.105), (22, -0.049), (23, -0.018), (24, -0.016), (25, 0.009), (26, 0.03), (27, 0.072), (28, -0.027), (29, -0.117), (30, 0.055), (31, 0.075), (32, -0.062), (33, 0.016), (34, 0.03), (35, -0.143), (36, 0.051), (37, -0.028), (38, -0.019), (39, -0.027), (40, 0.052), (41, 0.01), (42, 0.018), (43, -0.012), (44, -0.004), (45, -0.006), (46, -0.01), (47, 0.033), (48, -0.08), (49, -0.063)]
simIndex simValue paperId paperTitle
same-paper 1 0.96210647 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
2 0.6876483 18 nips-2001-A Rational Analysis of Cognitive Control in a Speeded Discrimination Task
Author: Michael C. Mozer, Michael D. Colagrosso, David E. Huber
Abstract: We are interested in the mechanisms by which individuals monitor and adjust their performance of simple cognitive tasks. We model a speeded discrimination task in which individuals are asked to classify a sequence of stimuli (Jones & Braver, 2001). Response conflict arises when one stimulus class is infrequent relative to another, resulting in more errors and slower reaction times for the infrequent class. How do control processes modulate behavior based on the relative class frequencies? We explain performance from a rational perspective that casts the goal of individuals as minimizing a cost that depends both on error rate and reaction time. With two additional assumptions of rationality—that class prior probabilities are accurately estimated and that inference is optimal subject to limitations on rate of information transmission—we obtain a good fit to overall RT and error data, as well as trial-by-trial variations in performance. Consider the following scenario: While driving, you approach an intersection at which the traffic light has already turned yellow, signaling that it is about to turn red. You also notice that a car is approaching you rapidly from behind, with no indication of slowing. Should you stop or speed through the intersection? The decision is difficult due to the presence of two conflicting signals. Such response conflict can be produced in a psychological laboratory as well. For example, Stroop (1935) asked individuals to name the color of ink on which a word is printed. When the words are color names incongruous with the ink color— e.g., “blue” printed in red—reaction times are slower and error rates are higher. We are interested in the control mechanisms underlying performance of high-conflict tasks. Conflict requires individuals to monitor and adjust their behavior, possibly responding more slowly if errors are too frequent. 
In this paper, we model a speeded discrimination paradigm in which individuals are asked to classify a sequence of stimuli (Jones & Braver, 2001). The stimuli are letters of the alphabet, A–Z, presented in rapid succession. In a choice task, individuals are asked to press one response key if the letter is an X or another response key for any letter other than X (as a shorthand, we will refer to non-X stimuli as Y). In a go/no-go task, individuals are asked to press a response key when X is presented and to make no response otherwise. We address both tasks because they elicit slightly different decision-making behavior. In both tasks, Jones and Braver (2001) manipulated the relative frequency of the X and Y stimuli; the ratio of presentation frequency was either 17:83, 50:50, or 83:17. Response conflict arises when the two stimulus classes are unbalanced in frequency, resulting in more errors and slower reaction times. For example, when X’s are frequent but Y is presented, individuals are predisposed toward producing the X response, and this predisposition must be overcome by the perceptual evidence from the Y. Jones and Braver (2001) also performed an fMRI study of this task and found that anterior cingulate cortex (ACC) becomes activated in situations involving response conflict. Specifically, when one stimulus occurs infrequently relative to the other, event-related fMRI response in the ACC is greater for the low frequency stimulus. Jones and Braver also extended a neural network model of Botvinick, Braver, Barch, Carter, and Cohen (2001) to account for human performance in the two discrimination tasks. The heart of the model is a mechanism that monitors conflict—the posited role of the ACC—and adjusts response biases accordingly. In this paper, we develop a parsimonious alternative account of the role of the ACC and of how control processes modulate behavior when response conflict arises. 
1 A RATIONAL ANALYSIS Our account is based on a rational analysis of human cognition, which views cognitive processes as being optimized with respect to certain task-related goals, and being adaptive to the structure of the environment (Anderson, 1990). We make three assumptions of rationality: (1) perceptual inference is optimal but is subject to rate limitations on information transmission, (2) response class prior probabilities are accurately estimated, and (3) the goal of individuals is to minimize a cost that depends both on error rate and reaction time. The heart of our account is an existing probabilistic model that explains a variety of facilitation effects that arise from long-term repetition priming (Colagrosso, in preparation; Mozer, Colagrosso, & Huber, 2000), and more broadly, that addresses changes in the nature of information transmission in neocortex due to experience. We give a brief overview of this model; the details are not essential for the present work. The model posits that neocortex can be characterized by a collection of informationprocessing pathways, and any act of cognition involves coordination among pathways. To model a simple discrimination task, we might suppose a perceptual pathway to map the visual input to a semantic representation, and a response pathway to map the semantic representation to a response. The choice and go/no-go tasks described earlier share a perceptual pathway, but require different response pathways. The model is framed in terms of probability theory: pathway inputs and outputs are random variables and microinference in a pathway is carried out by Bayesian belief revision. To elaborate, consider a pathway whose input at time is a discrete random variable, denoted , which can assume values corresponding to alternative input states. Similarly, the output of the pathway at time is a discrete random variable, denoted , which can assume values . 
For example, the input to the perceptual pathway in the discrimination task is one of the visual patterns corresponding to the letters of the alphabet, and the output is one of the letter identities. (This model is highly abstract: the visual patterns are enumerated, but the actual pixel patterns are not explicitly represented in the model. The similarity structure among inputs can nonetheless be captured, but we skip a discussion of this issue because it is irrelevant for the current work.) To present a particular input alternative to the model for a number of time steps, we clamp the input to that alternative at each of those time steps. The model computes a probability distribution over the output given the input.
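As an illustration only, the idea of Bayesian belief revision under unequal response class priors can be sketched in a few lines of Python. This is not the authors' model: the two-class setup, the likelihood table, and the 0.5 decision threshold are hypothetical choices made for the sketch, and only the 83:17 and 50:50 priors come from the task description above. The sketch shows why a rare stimulus needs more evidence before its response is favored.

```python
import numpy as np

def posterior_over_time(prior, likelihood, observations):
    """Revise a belief over response classes one observation at a time.

    prior:        shape (K,), prior probability of each class
    likelihood:   shape (K, M), likelihood[k, m] = P(observation m | class k)
    observations: sequence of observation indices in range(M)
    Returns an array of shape (T + 1, K); row t is the belief after t observations.
    """
    belief = np.asarray(prior, dtype=float)
    belief = belief / belief.sum()
    history = [belief.copy()]
    for obs in observations:
        belief = belief * likelihood[:, obs]  # Bayes rule, unnormalized
        belief = belief / belief.sum()        # renormalize to a distribution
        history.append(belief.copy())
    return np.vstack(history)

def first_step_favoring(beliefs, k, threshold=0.5):
    """Return the first time step at which class k exceeds the threshold, or None."""
    above = np.flatnonzero(beliefs[:, k] > threshold)
    return int(above[0]) if above.size else None

# Hypothetical noisy evidence channel: each observation weakly favors the true class.
likelihood = np.array([[0.6, 0.4],   # true class X
                       [0.4, 0.6]])  # true class Y
observations = [1] * 6               # a Y stimulus is shown; evidence keeps favoring Y

biased = posterior_over_time([0.83, 0.17], likelihood, observations)    # X frequent
balanced = posterior_over_time([0.50, 0.50], likelihood, observations)  # equal rates

print(first_step_favoring(biased, k=1))    # steps needed when Y is rare
print(first_step_favoring(balanced, k=1))  # steps needed when X and Y are equally likely
```

Under these assumed numbers, the belief in Y crosses the threshold later when X is the frequent class, which is one simple way to picture the slower responses to low-frequency stimuli described above.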
3 0.67424393 126 nips-2001-Motivated Reinforcement Learning
Author: Peter Dayan
Abstract: The standard reinforcement learning view of the involvement of neuromodulatory systems in instrumental conditioning includes a rather straightforward conception of motivation as prediction of sum future reward. Competition between actions is based on the motivating characteristics of their consequent states in this sense. Substantial, careful experiments into the neurobiology and psychology of motivation, reviewed in Dickinson & Balleine [12, 13], show that this view is incomplete. In many cases, animals are faced with the choice not between many different actions at a given state, but rather whether a single response is worth executing at all. Evidence suggests that the motivational process underlying this choice has different psychological and neural properties from that underlying action choice. We describe and model these motivational systems, and consider the way they interact.
4 0.65083796 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments
Author: Jonathan L. Shapiro, J. Wearden
Abstract: Animal data on delayed-reward conditioning experiments shows a striking property: the data for different time intervals collapses into a single curve when the data is scaled by the time interval. This is called the scalar property of interval timing. Here a simple model of a neural clock is presented and shown to give rise to the scalar property. The model is an accumulator consisting of noisy, linear spiking neurons. It is analytically tractable and contains only three parameters. When coupled with reinforcement learning it simulates peak procedure experiments, producing both the scalar property and the pattern of single trial covariances.
5 0.61735708 3 nips-2001-ACh, Uncertainty, and Cortical Inference
Author: Peter Dayan, Angela J. Yu
Abstract: Acetylcholine (ACh) has been implicated in a wide variety of tasks involving attentional processes and plasticity. Following extensive animal studies, it has previously been suggested that ACh reports on uncertainty and controls hippocampal, cortical and cortico-amygdalar plasticity. We extend this view and consider its effects on cortical representational inference, arguing that ACh controls the balance between bottom-up inference, influenced by input stimuli, and top-down inference, influenced by contextual information. We illustrate our proposal using a hierarchical hidden Markov model.
6 0.57983547 11 nips-2001-A Maximum-Likelihood Approach to Modeling Multisensory Enhancement
7 0.56330377 183 nips-2001-The Infinite Hidden Markov Model
8 0.55097628 115 nips-2001-Linear-time inference in Hierarchical HMMs
9 0.54078627 148 nips-2001-Predictive Representations of State
10 0.51361704 174 nips-2001-Spike timing and the coding of naturalistic sounds in a central auditory area of songbirds
11 0.51060897 48 nips-2001-Characterizing Neural Gain Control using Spike-triggered Covariance
12 0.50543702 43 nips-2001-Bayesian time series classification
13 0.48834351 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
14 0.48069972 145 nips-2001-Perceptual Metamers in Stereoscopic Vision
15 0.436584 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
16 0.42526165 87 nips-2001-Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway
17 0.42359066 14 nips-2001-A Neural Oscillator Model of Auditory Selective Attention
18 0.40859011 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
19 0.38129041 151 nips-2001-Probabilistic principles in unsupervised learning of visual structure: human data and a model
20 0.37785023 57 nips-2001-Correlation Codes in Neuronal Populations
topicId topicWeight
[(14, 0.023), (16, 0.214), (17, 0.028), (19, 0.026), (27, 0.094), (30, 0.088), (38, 0.014), (52, 0.012), (59, 0.023), (63, 0.016), (72, 0.056), (79, 0.073), (83, 0.048), (91, 0.18)]
simIndex simValue paperId paperTitle
same-paper 1 0.8739208 123 nips-2001-Modeling Temporal Structure in Classical Conditioning
Author: Aaron C. Courville, David S. Touretzky
Abstract: The Temporal Coding Hypothesis of Miller and colleagues [7] suggests that animals integrate related temporal patterns of stimuli into single memory representations. We formalize this concept using quasi-Bayes estimation to update the parameters of a constrained hidden Markov model. This approach allows us to account for some surprising temporal effects in the second order conditioning experiments of Miller et al. [1, 2, 3], which other models are unable to explain.
2 0.84036499 97 nips-2001-Information-Geometrical Significance of Sparsity in Gallager Codes
Author: Toshiyuki Tanaka, Shiro Ikeda, Shun-ichi Amari
Abstract: We report a result of perturbation analysis on decoding error of the belief propagation decoder for Gallager codes. The analysis is based on information geometry, and it shows that the principal term of decoding error at equilibrium comes from the m-embedding curvature of the log-linear submanifold spanned by the estimated pseudoposteriors, one for the full marginal, and K for partial posteriors, each of which takes a single check into account, where K is the number of checks in the Gallager code. It is then shown that the principal error term vanishes when the parity-check matrix of the code is so sparse that there are no two columns with overlap greater than 1.
3 0.73104429 183 nips-2001-The Infinite Hidden Markov Model
Author: Matthew J. Beal, Zoubin Ghahramani, Carl E. Rasmussen
Abstract: We show that it is possible to extend hidden Markov models to have a countably infinite number of hidden states. By using the theory of Dirichlet processes we can implicitly integrate out the infinitely many transition parameters, leaving only three hyperparameters which can be learned from data. These three hyperparameters define a hierarchical Dirichlet process capable of capturing a rich set of transition dynamics. The three hyperparameters control the time scale of the dynamics, the sparsity of the underlying state-transition matrix, and the expected number of distinct hidden states in a finite sequence. In this framework it is also natural to allow the alphabet of emitted symbols to be infinite; consider, for example, symbols being possible words appearing in English text.
4 0.72716165 7 nips-2001-A Dynamic HMM for On-line Segmentation of Sequential Data
Author: Jens Kohlmorgen, Steven Lemm
Abstract: We propose a novel method for the analysis of sequential data that exhibits an inherent mode switching. In particular, the data might be a non-stationary time series from a dynamical system that switches between multiple operating modes. Unlike other approaches, our method processes the data incrementally and without any training of internal parameters. We use an HMM with a dynamically changing number of states and an on-line variant of the Viterbi algorithm that performs an unsupervised segmentation and classification of the data on-the-fly, i.e., the method is able to process incoming data in real-time. The main idea of the approach is to track and segment changes of the probability density of the data in a sliding window on the incoming data stream. The usefulness of the algorithm is demonstrated by an application to a switching dynamical system.
5 0.7203505 40 nips-2001-Batch Value Function Approximation via Support Vectors
Author: Thomas G. Dietterich, Xin Wang
Abstract: We present three ways of combining linear programming with the kernel trick to find value function approximations for reinforcement learning. One formulation is based on SVM regression; the second is based on the Bellman equation; and the third seeks only to ensure that good moves have an advantage over bad moves. All formulations attempt to minimize the number of support vectors while fitting the data. Experiments in a difficult, synthetic maze problem show that all three formulations give excellent performance, but the advantage formulation is much easier to train. Unlike policy gradient methods, the kernel methods described here can easily adjust the complexity of the function approximator to fit the complexity of the value function.
6 0.71832037 161 nips-2001-Reinforcement Learning with Long Short-Term Memory
7 0.71812296 100 nips-2001-Iterative Double Clustering for Unsupervised and Semi-Supervised Learning
8 0.71748817 160 nips-2001-Reinforcement Learning and Time Perception -- a Model of Animal Experiments
9 0.71721601 162 nips-2001-Relative Density Nets: A New Way to Combine Backpropagation with HMM's
10 0.71593744 169 nips-2001-Small-World Phenomena and the Dynamics of Information
11 0.71072048 3 nips-2001-ACh, Uncertainty, and Cortical Inference
12 0.70759779 89 nips-2001-Grouping with Bias
13 0.70732057 56 nips-2001-Convolution Kernels for Natural Language
14 0.70726049 66 nips-2001-Efficiency versus Convergence of Boolean Kernels for On-Line Learning Algorithms
15 0.70684928 182 nips-2001-The Fidelity of Local Ordinal Encoding
16 0.70676738 13 nips-2001-A Natural Policy Gradient
17 0.70653474 132 nips-2001-Novel iteration schemes for the Cluster Variation Method
18 0.70469505 95 nips-2001-Infinite Mixtures of Gaussian Process Experts
19 0.70200193 39 nips-2001-Audio-Visual Sound Separation Via Hidden Markov Models
20 0.70176053 150 nips-2001-Probabilistic Inference of Hand Motion from Neural Activity in Motor Cortex