
88 nips-2004-Intrinsically Motivated Reinforcement Learning



Author: Nuttapong Chentanez, Andrew G. Barto, Satinder P. Singh

Abstract: Psychologists call behavior intrinsically motivated when it is engaged in for its own sake rather than as a step toward solving a specific problem of clear practical value. But what we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. In this paper we present initial results from a computational study of intrinsically motivated reinforcement learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy.


reference text

[1] A. G. Barto, S. Singh, and N. Chentanez. Intrinsically motivated learning of hierarchical collections of skills. In Proceedings of the 3rd International Conference on Developmental Learning (ICDL '04), La Jolla, CA, 2004.

[2] P. Dayan and B. W. Balleine. Reward, motivation and reinforcement learning. Neuron, 36:285–298, 2002.

[3] S. Kakade and P. Dayan. Dopamine: Generalization and bonuses. Neural Networks, 15:549–559, 2002.

[4] F. Kaplan and P.-Y. Oudeyer. Motivational principles for visual know-how development. In C. G. Prince, L. Berthouze, H. Kozima, D. Bullock, G. Stojanov, and C. Balkenius, editors, Proceedings of the Third International Workshop on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems, pages 73–80, Edinburgh, Scotland, 2003. Lund University Cognitive Studies.

[5] A. McGovern. Autonomous Discovery of Temporal Abstractions from Interaction with An Environment. PhD thesis, University of Massachusetts, 2002.

[6] A. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. In Proceedings of the Sixteenth ICML. Morgan Kaufmann, 1999.

[7] P. Reed, C. Mitchell, and T. Nokes. Intrinsic reinforcing properties of putatively neutral stimuli in an instrumental two-lever discrimination task. Animal Learning and Behavior, 24:38–45, 1996.

[8] J. Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, pages 222–227, Cambridge, MA, 1991. MIT Press.

[9] W. Schultz. Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80:1–27, 1998.

[10] R. S. Sutton. Integrated modeling and control based on reinforcement learning and dynamic programming. In Proceedings of NIPS, pages 471–478, San Mateo, CA, 1991.

[11] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.

[12] R. S. Sutton, D. Precup, and S. Singh. Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning. Artificial Intelligence, 112:181–211, 1999.

[13] J. Wang, J. McClelland, A. Pentland, O. Sporns, I. Stockman, M. Sur, and E. Thelen. Autonomous mental development by robots and animals. Science, 291:599–600, 2001.

[14] R. W. White. Motivation reconsidered: The concept of competence. Psychological Review, 66:297–333, 1959.