JMLR 2008, paper jmlr2008-49: reference knowledge graph by maker-knowledge-mining (source: PDF).
Authors: Sungwook Yoon, Alan Fern, Robert Givan
Abstract: A number of today's state-of-the-art planners are based on forward state-space search. Their impressive performance can be attributed to progress in computing domain-independent heuristics that perform well across many domains. However, it is easy to find domains where such heuristics provide poor guidance, leading to planning failure. Motivated by such failures, this paper investigates mechanisms for learning domain-specific knowledge to better control forward search in a given domain. While there is a large body of work on inductive learning of control knowledge for AI planning, little of it is aimed at forward state-space search. One reason for this may be that it is challenging to specify a knowledge representation that compactly captures important concepts across a wide range of domains. One of the main contributions of this work is a novel feature space for representing such control knowledge. The key idea is to define features in terms of information computed via relaxed plan extraction, which has been a major source of success for non-learning planners. This gives a new way of leveraging relaxed planning techniques in the context of learning. Using this feature space, we describe three forms of control knowledge: reactive policies (decision-list rules and measures of progress) and linear heuristics. We show how to learn them and incorporate them into forward state-space search. Our empirical results show that our approaches surpass state-of-the-art non-learning planners across a wide range of planning competition domains.
Keywords: planning, machine learning, knowledge representation, search
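The abstract's third form of control knowledge, a learned linear heuristic over relaxed-plan features guiding forward search, can be illustrated with a minimal sketch. This is not the paper's implementation: the feature extractor, the weights, and the toy domain below are all hypothetical stand-ins. In the real system, features are computed from relaxed plan extraction (e.g., relaxed plan length or counts of particular action types in the relaxed plan), and the weights are learned from solved training problems in the domain.

```python
# Hypothetical sketch: rank successor states in greedy best-first forward
# search by a linear combination of features. The "relaxed plan" features
# here are placeholders; the paper derives them from relaxed plan extraction.
import heapq

def relaxed_plan_features(state, goal):
    """Toy stand-in for relaxed-plan features: number of unsatisfied goal
    facts, and an indicator that the goal is not yet reached."""
    unsatisfied = goal - state
    return [len(unsatisfied), float(len(unsatisfied) > 0)]

def linear_heuristic(weights, state, goal):
    # Weighted sum of features; weights would be learned offline.
    return sum(w * f for w, f in zip(weights, relaxed_plan_features(state, goal)))

def greedy_best_first(initial, goal, successors, weights):
    """Greedy best-first search ordered by the learned heuristic.
    States are frozensets of facts; successors(state) yields (action, next)."""
    counter = 0  # tie-breaker so the heap never compares states directly
    frontier = [(linear_heuristic(weights, initial, goal), counter, initial, [])]
    seen = {initial}
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if goal <= state:          # all goal facts achieved
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                counter += 1
                h = linear_heuristic(weights, nxt, goal)
                heapq.heappush(frontier, (h, counter, nxt, plan + [action]))
    return None                    # search space exhausted
```

For example, in a toy domain whose actions each add one fact, the heuristic steers the search straight toward the goal facts while ignoring irrelevant additions.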
References
Ricardo Aler, Daniel Borrajo, and Pedro Isasi. Using genetic programming to learn and improve control knowledge. Artificial Intelligence, 141(1-2):29–56, 2002.
Jose Luis Ambite and Craig Knoblock. Planning by rewriting. Journal of Artificial Intelligence Research, 15:207–261, 2001.
Fahiem Bacchus and Froduald Kabanza. Using temporal logics to express search control knowledge for planning. Artificial Intelligence, 116(1-2):123–191, 2000.
Andy Barto, Steven Bradtke, and Satinder Singh. Learning to act using real-time dynamic programming. Artificial Intelligence, 72:81–138, 1995.
J. Benton, Minh Do, and Subbarao Kambhampati. Oversubscription planning with metric goals. In Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling, 2006.
Dimitri P. Bertsekas and John N. Tsitsiklis. Neuro-Dynamic Programming. Athena Scientific, 1996.
Avrim Blum and Merrick Furst. Fast planning through planning graph analysis. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 1636–1642, 1995.
Blai Bonet and Hector Geffner. Planning as heuristic search. Artificial Intelligence, 129(1-2):5–33, 2001.
Adi Botea, Markus Enzenberger, Martin Müller, and Jonathan Schaeffer. Macro-FF: Improving AI planning with automatically learned macro-operators. Journal of Artificial Intelligence Research, 24:581–621, 2005.
Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
Daniel Bryce and Subbarao Kambhampati. Planning graph heuristics for belief space search. Journal of Artificial Intelligence Research, 26:35–99, 2006.
Tom Bylander. The computational complexity of propositional STRIPS planning. Artificial Intelligence, 69(1-2):165–204, 1994.
Andrew I. Coles and Amanda J. Smith. Marvin: A heuristic search planner with online macro-action learning. Journal of Artificial Intelligence Research, 28:119–156, 2007.
Minh Do and Subbarao Kambhampati. Sapa: A scalable multi-objective heuristic metric temporal planner. Journal of Artificial Intelligence Research, 20:155–194, 2003.
Sašo Džeroski, Luc De Raedt, and Kurt Driessens. Relational reinforcement learning. Machine Learning, 43:7–52, 2001.
Tara A. Estlin and Raymond J. Mooney. Multi-strategy learning of search control for partial-order planning. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996.
Alan Fern, Sungwook Yoon, and Robert Givan. Approximate policy iteration with a policy language bias: Solving relational Markov decision processes. Journal of Artificial Intelligence Research, 25:85–118, 2006.
Richard Fikes, Peter Hart, and Nils J. Nilsson. Learning and executing generalized robot plans. Artificial Intelligence, 3(1–3):251–288, 1972.
Maria Fox and Derek Long. The automatic inference of state invariants in TIM. Journal of Artificial Intelligence Research, 9:367–421, 1998.
Johannes Fürnkranz and Peter A. Flach. An analysis of rule evaluation metrics. In Proceedings of the Twentieth International Conference on Machine Learning, 2003.
Alfonso Gerevini and Lenhart K. Schubert. Discovering state constraints in DISCOPLAN: Some new results. In Proceedings of the National Conference on Artificial Intelligence, pages 761–767. AAAI Press / The MIT Press, 2000.
William D. Harvey and Matthew L. Ginsberg. Limited discrepancy search. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1995.
Malte Helmert, Patrik Haslum, and Jörg Hoffmann. Flexible abstraction heuristics for optimal sequential planning. In Proceedings of the Seventeenth International Conference on Automated Planning and Scheduling, 2007.
Jörg Hoffmann and Bernhard Nebel. The FF planning system: Fast plan generation through heuristic search. Journal of Artificial Intelligence Research, 14:263–302, 2001.
Yi-Cheng Huang, Bart Selman, and Henry Kautz. Learning declarative control rules for constraint-based planning. In Proceedings of the Seventeenth International Conference on Machine Learning, pages 415–422, 2000.
Roni Khardon. Learning action strategies for planning domains. Artificial Intelligence, 113(1-2):125–148, 1999.
Mario Martin and Hector Geffner. Learning generalized policies in planning domains using concept languages. In Proceedings of the Seventh International Conference on Principles of Knowledge Representation and Reasoning, 2000.
David McAllester and Robert Givan. Taxonomic syntax for first-order inference. Journal of the ACM, 40:246–283, 1993.
Steven Minton, Jaime Carbonell, Craig A. Knoblock, Daniel R. Kuokka, Oren Etzioni, and Yolanda Gil. Explanation-based learning: A problem solving perspective. Artificial Intelligence, 40:63–118, 1989.
Steven Minton. Quantitative results concerning the utility of explanation-based learning. In Proceedings of the National Conference on Artificial Intelligence, 1988.
Steven Minton, editor. Machine Learning Methods for Planning. Morgan Kaufmann Publishers, 1993.
Dana Nau, Yue Cao, Amnon Lotem, and Héctor Muñoz-Avila. SHOP: Simple hierarchical ordered planner. In Proceedings of the International Joint Conference on Artificial Intelligence, pages 968–973, 1999.
XuanLong Nguyen, Subbarao Kambhampati, and Romeo Sanchez Nigenda. Planning graph as the basis for deriving heuristics for plan synthesis by state space and CSP search. Artificial Intelligence, 135(1-2):73–123, 2002.
Aarati Parmar. A logical measure of progress for planning. In Proceedings of the Eighteenth National Conference on Artificial Intelligence, pages 498–505. AAAI Press, 2002.
Ronald L. Rivest. Learning decision lists. Machine Learning, 2(3):229–246, 1987.
Vincent Vidal. A lookahead strategy for heuristic search planning. In Proceedings of the International Conference on Automated Planning and Scheduling, 2004.
Yuehua Xu, Alan Fern, and Sungwook Yoon. Discriminative learning of beam-search heuristics for planning. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence, 2007.
Sungwook Yoon, Alan Fern, and Robert Givan. Inductive policy selection for first-order MDPs. In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence, 2002.
Sungwook Yoon, Alan Fern, and Robert Givan. Learning measures of progress for planning domains. In Proceedings of the Twentieth National Conference on Artificial Intelligence, 2005.
Terry Zimmerman and Subbarao Kambhampati. Learning-assisted automated planning: Looking back, taking stock, going forward. AI Magazine, 24(2):73–96, 2003.