
161 nips-2001-Reinforcement Learning with Long Short-Term Memory


Source: pdf

Author: Bram Bakker

Abstract: This paper presents reinforcement learning with a Long Short-Term Memory recurrent neural network: RL-LSTM. Model-free RL-LSTM using Advantage(λ) learning and directed exploration can solve non-Markovian tasks with long-term dependencies between relevant events. This is demonstrated in a T-maze task, as well as in a difficult variation of the pole balancing task.


reference text

[1] B. Bakker. Reinforcement learning with LSTM in non-Markovian tasks with long-term dependencies. Technical report, Dept. of Psychology, Leiden University, 2001.

[2] L. Chrisman. Reinforcement learning with perceptual aliasing: The perceptual distinctions approach. In Proc. of the 10th National Conf. on AI. AAAI Press, 1992.

[3] F. Gers, J. Schmidhuber, and F. Cummins. Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10):2451-2471, 2000.

[4] M. E. Harmon and L. C. Baird. Multi-player residual advantage learning with general function approximation. Technical report, Wright-Patterson Air Force Base, 1996.

[5] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.

[6] L.-J. Lin and T. Mitchell. Reinforcement learning with hidden states. In Proc. of the 2nd Int. Conf. on Simulation of Adaptive Behavior. MIT Press, 1993.

[7] J. Loch and S. Singh. Using eligibility traces to find the best memoryless policy in Partially Observable Markov Decision Processes. In Proc. of ICML'98, 1998.

[8] R. A. McCallum. Learning to use selective attention and short-term memory in sequential tasks. In Proc. 4th Int. Conf. on Simulation of Adaptive Behavior, 1996.

[9] L. Peshkin, N. Meuleau, and L. P. Kaelbling. Learning policies with external memory. In Proc. of the 16th Int. Conf. on Machine Learning, 1999.

[10] J. Schmidhuber. Networks adjusting networks. In Proc. of Distributed Adaptive Neural Information Processing, St. Augustin, 1990.

[11] J. Schmidhuber. Curious model-building control systems. In Proc. of IJCNN'91, volume 2, pages 1458-1463, Singapore, 1991.

[12] R. S. Sutton and A. G. Barto. Reinforcement learning: An introduction. MIT Press, Cambridge, MA, 1998.