nips nips2011 nips2011-163 nips2011-163-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jaedeug Choi, Kee-eung Kim
Abstract: The difficulty in inverse reinforcement learning (IRL) lies in choosing the best reward function, since there are typically infinitely many reward functions that yield the given behaviour data as optimal. Using a Bayesian framework, we address this challenge by computing the maximum a posteriori (MAP) estimate of the reward function, and show that most of the previous IRL algorithms can be cast within our framework. We also present a gradient method for the MAP estimation based on the (sub)differentiability of the posterior distribution. We show the effectiveness of our approach by comparing the performance of the proposed method with that of the previous algorithms.
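The abstract's recipe, gradient ascent on the log-posterior (demonstration log-likelihood plus a log-prior over rewards), can be illustrated with a minimal sketch. The snippet below assumes a Boltzmann demonstration likelihood as in Bayesian IRL [13], a Gaussian prior on linear reward weights, and a hypothetical two-state MDP; the confidence parameter, step size, and finite-difference gradient are illustrative stand-ins for the paper's analytic (sub)gradient of the posterior, not its actual algorithm or experimental setup.

```python
import numpy as np

# Illustrative hyperparameters (assumptions, not from the paper).
gamma, alpha, sigma = 0.9, 5.0, 1.0      # discount, likelihood confidence, prior scale
P = np.array([[[1.0, 0.0], [0.0, 1.0]],  # P[a, s, s']: action 0 stays,
              [[0.0, 1.0], [1.0, 0.0]]]) #              action 1 swaps states
phi = np.eye(2)                           # one-hot state features; reward r = phi @ w
demos = [(0, 1), (1, 0)]                  # expert (state, action) pairs

def q_values(w, iters=200):
    """Value iteration for Q*(s, a) under the reward parameterized by w."""
    r = phi @ w
    Q = np.zeros((2, 2))                  # Q[s, a]
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = r[:, None] + gamma * np.einsum('ast,t->sa', P, V)
    return Q

def log_posterior(w):
    """log P(D|w) + log P(w): Boltzmann likelihood of demos plus Gaussian prior."""
    Q = q_values(w)
    m = (alpha * Q).max(axis=1)           # log-sum-exp for numerical stability
    logZ = m + np.log(np.exp(alpha * Q - m[:, None]).sum(axis=1))
    loglik = sum(alpha * Q[s, a] - logZ[s] for s, a in demos)
    return loglik - 0.5 * (w @ w) / sigma ** 2

def grad(f, w, eps=1e-5):
    """Finite-difference gradient, standing in for the analytic (sub)gradient."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# Gradient ascent on the log-posterior yields the MAP reward weights.
w = np.zeros(2)
for _ in range(100):
    w = w + 0.1 * grad(log_posterior, w)
print("MAP reward weights:", w)
```

Under these assumed demonstrations (the expert steers toward state 1), the ascent assigns a higher weight to state 1's feature, illustrating how the MAP estimate trades the demonstration likelihood off against the prior.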
[1] S. Russell. Learning agents for uncertain environments (extended abstract). In Proceedings of COLT, 1998.
[2] P. R. Montague and G. S. Berns. Neural economics and the biological substrates of valuation. Neuron, 36(2), 2002.
[3] B. D. Argall, S. Chernova, M. Veloso, and B. Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5), 2009.
[4] Y. Niv. Reinforcement learning in the brain. Journal of Mathematical Psychology, 53(3), 2009.
[5] E. Hopkins. Adaptive learning models of consumer behavior. Journal of Economic Behavior and Organization, 64(3–4), 2007.
[6] A. Y. Ng and S. Russell. Algorithms for inverse reinforcement learning. In Proceedings of ICML, 2000.
[7] P. Abbeel and A. Y. Ng. Apprenticeship learning via inverse reinforcement learning. In Proceedings of ICML, 2004.
[8] N. D. Ratliff, J. A. Bagnell, and M. A. Zinkevich. Maximum margin planning. In Proceedings of ICML, 2006.
[9] G. Neu and C. Szepesvári. Apprenticeship learning using inverse reinforcement learning and gradient methods. In Proceedings of UAI, 2007.
[10] U. Syed and R. E. Schapire. A game-theoretic approach to apprenticeship learning. In Proceedings of NIPS, 2008.
[11] B. D. Ziebart, A. Maas, J. A. Bagnell, and A. K. Dey. Maximum entropy inverse reinforcement learning. In Proceedings of AAAI, 2008.
[12] G. Neu and C. Szepesvári. Training parsers by inverse reinforcement learning. Machine Learning, 77(2), 2009.
[13] D. Ramachandran and E. Amir. Bayesian inverse reinforcement learning. In Proceedings of IJCAI, 2007.
[14] A. Boularias and B. Chaib-Draa. Bootstrapping apprenticeship learning. In Proceedings of NIPS, 2010.
[15] J. Choi and K. Kim. Inverse reinforcement learning in partially observable environments. In Proceedings of IJCAI, 2009.