Author: Natalia H. Gardiol, Leslie P. Kaelbling
Abstract: A mobile robot acting in the world is faced with a large amount of sensory data and uncertainty in its action outcomes. Indeed, almost all interesting sequential decision-making domains involve large state spaces and large, stochastic action sets. We investigate a way to act intelligently as quickly as possible in domains where finding a complete policy would take a hopelessly long time. This approach, Relational Envelope-based Planning (REBP), tackles large, noisy problems along two axes. First, describing a domain as a relational MDP (instead of as an atomic or propositionally-factored MDP) allows problem structure and dynamics to be captured compactly with a small set of probabilistic, relational rules. Second, an envelope-based approach to planning lets an agent begin acting quickly within a restricted part of the full state space and judiciously expand its envelope as resources permit.
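The second axis above can be sketched concretely. What follows is a minimal illustration of envelope-based planning on a toy propositional chain MDP, with hypothetical names throughout; it is not the paper's REBP implementation (which operates over relational rules), only the underlying idea: plan by value iteration restricted to an envelope of states, lump everything outside into a single absorbing OUT state with a pessimistic value, and grow the envelope by admitting the most likely fringe successor.

```python
OUT = "OUT"  # absorbing stand-in for every state outside the envelope


def value_iteration(envelope, transitions, reward, gamma=0.9, sweeps=100):
    """Value iteration restricted to the envelope.

    transitions: {state: {action: [(prob, next_state), ...]}}
    Any transition leaving the envelope is redirected to OUT, which keeps
    a fixed pessimistic value of 0.
    """
    V = {s: 0.0 for s in envelope}
    V[OUT] = 0.0
    for _ in range(sweeps):
        for s in envelope:
            best = float("-inf")
            for outcomes in transitions[s].values():
                q = 0.0
                for p, s2 in outcomes:
                    if s2 not in envelope:
                        s2 = OUT  # leaving the envelope: pessimistic value
                    q += p * (reward.get(s2, 0.0) + gamma * V[s2])
                best = max(best, q)
            V[s] = best
    return V


def expand_envelope(envelope, transitions):
    """Admit the out-of-envelope successor reached with highest probability."""
    best, best_p = None, 0.0
    for s in envelope:
        for outcomes in transitions[s].values():
            for p, s2 in outcomes:
                if s2 not in envelope and p > best_p:
                    best, best_p = s2, p
    if best is not None:
        envelope.add(best)
    return envelope
```

With the goal state outside the envelope, planning yields zero values everywhere; one expansion step admits the goal, and replanning over the enlarged envelope propagates its value back. This captures the trade-off in the abstract: a small envelope gives fast (but myopic) plans, and expansion buys plan quality with computation.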