Autonomous Learning of Action Models for Planning (NIPS 2011, paper 41)


Source: pdf

Author: Neville Mehta, Prasad Tadepalli, Alan Fern

Abstract: This paper introduces two new frameworks for learning action models for planning. In the mistake-bounded planning framework, the learner has access to a planner for the given model representation, a simulator, and a planning problem generator, and aims to learn a model while producing at most a polynomial number of faulty plans. In the planned exploration framework, the learner does not have access to a problem generator and must instead design its own problems, plan for them, and converge after at most a polynomial number of planning attempts. The paper reduces learning in these frameworks to concept learning with one-sided error and provides algorithms for successful learning in both frameworks. A specific family of hypothesis spaces is shown to be efficiently learnable in both frameworks.
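
The interaction protocol of the mistake-bounded planning framework can be illustrated with a small sketch. This is not the paper's algorithm or hypothesis space: it assumes a toy deterministic domain with a tabular model, and every name in it (`plan`, `run_mistake_bounded`, `true_step`) is hypothetical. The learner plans with an optimistic model that treats every untried (state, action) pair as leading straight to the goal, executes the plan in the simulator, and counts the plan as faulty when it fails to reach the goal. Under these assumptions each faulty plan reveals at least one previously unknown transition, so the number of faulty plans is at most the number of (state, action) pairs.

```python
from collections import deque


def plan(model, actions, start, goal):
    """Breadth-first search in the hypothesized model.

    Assumption of this sketch: untried (state, action) pairs are
    optimistically predicted to reach the goal directly.
    """
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for action in actions:
            nxt = model.get((state, action), goal)  # optimistic default
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [action]))
    return None  # no plan exists even under the optimistic model


def run_mistake_bounded(true_step, actions, problems):
    """Plan, execute in the simulator, and learn from failed plans.

    true_step(state, action) plays the role of the simulator.
    Returns the learned transition table and the number of faulty plans.
    """
    model = {}
    faulty = 0
    for start, goal in problems:
        steps = plan(model, actions, start, goal)
        if steps is None:
            continue  # the optimistic model already rules the goal out
        state = start
        for action in steps:
            predicted = model.get((state, action), goal)
            observed = true_step(state, action)
            model[(state, action)] = observed  # record what really happened
            state = observed
            if observed != predicted:
                break  # the rest of the plan is no longer valid
        if state != goal:
            faulty += 1
    return model, faulty


# Toy simulator (hypothetical): a chain of states 0..4.
STATES = range(5)
ACTIONS = ["inc", "dec"]


def chain_step(state, action):
    return min(state + 1, 4) if action == "inc" else max(state - 1, 0)


if __name__ == "__main__":
    problems = [(0, 4)] * 20  # stand-in for a problem generator
    learned, faulty = run_mistake_bounded(chain_step, ACTIONS, problems)
    print("faulty plans:", faulty, "of", len(problems))
```

The bound in this sketch is polynomial only in the size of the state-action table. The frameworks in the abstract concern compact model representations, where the corresponding argument runs through concept learning with one-sided error.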


References

[1] R. Brafman and M. Tennenholtz. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. Journal of Machine Learning Research, 3:213–231, 2002.

[2] M. Kearns and L. Valiant. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata. In Annual ACM Symposium on Theory of Computing, 1989.

[3] L. Li. A Unifying Framework for Computational Reinforcement Learning Theory. PhD thesis, Rutgers University, 2009.

[4] L. Li, M. Littman, and T. Walsh. Knows What It Knows: A Framework for Self-Aware Learning. In ICML, 2008.

[5] N. Littlestone. Mistake Bounds and Logarithmic Linear-Threshold Learning Algorithms. PhD thesis, U.C. Santa Cruz, 1989.

[6] B. Marthi, S. Russell, and J. Wolfe. Angelic Semantics for High-Level Actions. In ICAPS, 2007.

[7] B. K. Natarajan. On Learning Boolean Functions. In Annual ACM Symposium on Theory of Computing, 1987.

[8] T. Walsh and M. Littman. Efficient Learning of Action Schemas and Web-Service Descriptions. In AAAI, 2008.