nips nips2000 nips2000-28 nips2000-28-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Christian R. Shelton
Abstract: For many problems that are natural candidates for reinforcement learning, the reward signal is not a single scalar value but has multiple scalar components. Examples of such problems include agents with multiple goals and agents with multiple users. Creating a single reward value by combining the multiple components can throw away vital information and can lead to incorrect solutions. We describe the multiple reward source problem and discuss the problems with applying traditional reinforcement learning. We then present a new algorithm for finding a solution and give results on simulated environments.
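The abstract's central claim, that collapsing a vector of reward components into one scalar can discard vital information, can be made concrete with a small sketch. The Python snippet below is purely illustrative and is not the paper's algorithm; the environment, the action names, and the equal weighting are all invented for this example.

```python
# A minimal sketch (assumed setup, not the paper's method) of how scalarizing
# a vector-valued reward can hide conflicts between reward sources.

def two_user_reward(action):
    """Per-source rewards for an action from two reward sources (e.g., two users).

    Returns a tuple (reward from source A, reward from source B).
    The numbers are made up for illustration.
    """
    rewards = {
        "serve_a": (1.0, -1.0),   # pleases user A, displeases user B
        "serve_b": (-1.0, 1.0),   # the reverse
        "idle":    (0.0, 0.0),    # neutral for both
    }
    return rewards[action]

def scalarize(reward_vec, weights=(0.5, 0.5)):
    """Fixed weighted sum: the usual way of forcing a single scalar reward."""
    return sum(w * r for w, r in zip(weights, reward_vec))

if __name__ == "__main__":
    for action in ("serve_a", "serve_b", "idle"):
        vec = two_user_reward(action)
        print(f"{action}: vector reward = {vec}, scalar reward = {scalarize(vec)}")
    # Under equal weights, all three actions receive scalar reward 0.0, so the
    # learner cannot distinguish "both sources are neutral" from "the sources
    # are in direct conflict" -- the kind of information loss the abstract
    # warns can lead to incorrect solutions.
```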
[1] J. Hu and M. P. Wellman. Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proc. of the 15th International Conference on Machine Learning, pages 242-250, 1998.
[2] C. L. Isbell, C. R. Shelton, M. Kearns, S. Singh, and P. Stone. A social reinforcement learning agent. Submitted to Autonomous Agents 2001, 2000.
[3] J. Karlsson. Learning to Solve Multiple Goals. PhD thesis, University of Rochester, 1997.
[4] M. Kearns, Y. Mansour, and S. Singh. Fast planning in stochastic games. In Proc. of the 16th Conference on Uncertainty in Artificial Intelligence, 2000.
[5] M. L. Littman. Markov games as a framework for multi-agent reinforcement learning. In Proc. of the 11th International Conference on Machine Learning, pages 157-163, 1994.
[6] G. Owen. Game Theory. Academic Press, UK, 1995.
[7] S. Singh, M. Kearns, and Y. Mansour. Nash convergence of gradient dynamics in general-sum games. In Proc. of the 16th Conference on Uncertainty in Artificial Intelligence, 2000.
[8] S. P. Singh. The efficient learning of multiple task sequences. In Advances in Neural Information Processing Systems, volume 4, 1992.