100 nips-2007-Hippocampal Contributions to Control: The Third Way


Author: Máté Lengyel, Peter Dayan

Abstract: Recent experimental studies have focused on the specialization of different neural structures for different types of instrumental behavior. Recent theoretical work has provided normative accounts for why there should be more than one control system, and how the output of different controllers can be integrated. Two particular controllers have been identified, one associated with a forward model and the prefrontal cortex and a second associated with computationally simpler, habitual, actor-critic methods and part of the striatum. We argue here for the normative appropriateness of an additional, but so far marginalized control system, associated with episodic memory, and involving the hippocampus and medial temporal cortices. We analyze in depth a class of simple environments to show that episodic control should be useful in a range of cases characterized by complexity and inferential noise, and most particularly at the very early stages of learning, long before habitization has set in. We interpret data on the transfer of control from the hippocampus to the striatum in the light of this hypothesis.


reference text

[1] Dudai, Y. & Carruthers, M. The Janus face of Mnemosyne. Nature 434, 567 (2005).

[2] Káli, S. & Dayan, P. Off-line replay maintains declarative memories in a model of hippocampal-neocortical interactions. Nat. Neurosci. 7, 286–294 (2004).

[3] McClelland, J.L., McNaughton, B.L. & O’Reilly, R.C. Why there are complementary learning systems in the hippocampus and neocortex: insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. 102, 419–457 (1995).

[4] White, N.M. & McDonald, R.J. Multiple parallel memory systems in the brain of the rat. Neurobiol Learn Mem 77, 125–184 (2002).

[5] Daw, N.D., Niv, Y. & Dayan, P. Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704–1711 (2005).

[6] Sutton, R.S. & Barto, A.G. Reinforcement Learning (MIT Press, 1998).

[7] Watkins, C.J.C.H. Learning from Delayed Rewards. PhD thesis, Cambridge University (1989).

[8] Lengyel, M. & Dayan, P. Uncertainty, phase and oscillatory hippocampal recall. in Advances in Neural Information Processing Systems 19 (eds. Schölkopf, B., Platt, J. & Hoffman, T.) 833–840 (MIT Press, Cambridge, MA, 2007).

[9] Packard, M.G. & McGaugh, J.L. Double dissociation of fornix and caudate nucleus lesions on acquisition of two water maze tasks: further evidence for multiple memory systems. Behav. Neurosci. 106, 439–446 (1992).

[10] Poldrack, R.A. et al. Interactive memory systems in the human brain. Nature 414, 546–550 (2001).

[11] Kearns, M. & Singh, S. Finite-sample convergence rates for Q-learning and indirect algorithms. in Advances in Neural Information Processing Systems 11 (eds. Kearns, M.S., Solla, S.A. & Cohn, D.A.) 996–1002 (MIT Press, Cambridge, MA, 1999).

[12] Owen, D.B. & Steck, G.P. Moments of order statistics from the equicorrelated multivariate normal distribution. Ann Math Stat 33, 1286–1291 (1962).

[13] Kording, K.P., Tenenbaum, J.B. & Shadmehr, R. The dynamics of memory as a consequence of optimal adaptation to a changing body. Nat. Neurosci. 10, 779–786 (2007).

[14] Poldrack, R.A. & Rodriguez, P. How do memory systems interact? Evidence from human classification learning. Neurobiol Learn Mem 82, 324–332 (2004).

[15] Clayton, N.S. & Dickinson, A. Episodic-like memory during cache recovery by scrub jays. Nature 395, 272–274 (1998).

[16] Clayton, N.S., Dally, J.M. & Emery, N.J. Social cognition by food-caching corvids. The western scrub-jay as a natural psychologist. Philos. Trans. R. Soc. Lond. B Biol. Sci. 362, 507–522 (2007).

[17] Stanfill, C. & Waltz, D. Toward memory-based reasoning. Communications of the ACM 29, 1213–1228 (1986).

[18] Hintzman, D.L. MINERVA 2: A simulation model of human memory. Behav Res Methods Instrum Comput 16, 96–101 (1984).