129 acl-2012-Learning High-Level Planning from Text

Author: S.R.K. Branavan ; Nate Kushman ; Tao Lei ; Regina Barzilay

Abstract: Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge of modeling this connection is to ground language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework using feedback automatically obtained from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline’s 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline, successfully completing 80% of planning tasks compared to 69% for the baseline.
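
The F-measure quoted in the abstract is conventionally the harmonic mean of precision and recall over the predicted relations. A minimal Python sketch of that computation follows. It assumes the balanced (F1) variant and exact set matching of relation pairs, neither of which the abstract states. The function name `f_measure` and the example relation pairs are hypothetical and are not taken from the paper.

```python
def f_measure(predicted, gold):
    """Balanced F1 between two collections of (precondition, effect) pairs.

    Assumption: a predicted relation counts as correct only if the exact
    pair also appears in the gold collection.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Also covers the case of empty inputs, avoiding division by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical relation pairs, for illustration only.
gold = {("wood", "plank"), ("plank", "stick"), ("stick", "pickaxe")}
predicted = {("wood", "plank"), ("plank", "stick"), ("wood", "pickaxe")}
score = f_measure(predicted, gold)
```

Here two of the three predicted pairs match the gold set, so precision and recall are both 2/3 and the score is 2/3 as well.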

reference text

Fahiem Bacchus and Qiang Yang. 1994. Downward refinement and the efficiency of hierarchical problem solving. Artificial Intell., 71(1):43–100.
Jennifer L. Barry, Leslie Pack Kaelbling, and Tomás Lozano-Pérez. 2011. DetH*: Approximate hierarchical solution of large markov decision processes. In IJCAI’11, pages 1928–1935.
Brandon Beamer and Roxana Girju. 2009. Using a bigram event model to predict causal potential. In Proceedings of CICLing, pages 430–441.
Eduardo Blanco, Nuria Castell, and Dan Moldovan. 2008. Causal relation extraction. In Proceedings of the LREC’08.
S.R.K. Branavan, Harr Chen, Luke Zettlemoyer, and Regina Barzilay. 2009. Reinforcement learning for mapping instructions to actions. In Proceedings of ACL, pages 82–90.
S.R.K. Branavan, Luke Zettlemoyer, and Regina Barzilay. 2010. Reading between the lines: Learning to map high-level instructions to commands. In Proceedings of ACL, pages 1268–1277.
S.R.K. Branavan, David Silver, and Regina Barzilay. 2011. Learning to win by reading manuals in a Monte-Carlo framework. In Proceedings of ACL, pages 268–277.
Du-Seong Chang and Key-Sun Choi. 2006. Incremental cue phrase learning and bootstrapping method for causality extraction using cue phrase and word pair probabilities. Inf. Process. Manage., 42(3):662–678.
Marie-Catherine de Marneffe, Bill MacCartney, and Christopher D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In LREC 2006.
Q. Do, Y. Chan, and D. Roth. 2011. Minimally supervised event causality identification. In EMNLP, 7.
Michael Fleischman and Deb Roy. 2005. Intentional context in situated natural language learning. In Proceedings of CoNLL, pages 104–111.
Maria Fox and Derek Long. 2003. PDDL2.1: An extension to PDDL for expressing temporal planning domains. Journal of Artificial Intelligence Research, 20:2003.
Malik Ghallab, Dana S. Nau, and Paolo Traverso. 2004. Automated Planning: theory and practice. Morgan Kaufmann.
Roxana Girju and Dan I. Moldovan. 2002. Text mining for causal relations. In Proceedings of FLAIRS, pages 360–364.
Jörg Hoffmann and Bernhard Nebel. 2001. The FF planning system: Fast plan generation through heuristic search. JAIR, 14:253–302.
Thorsten Joachims. 1999. Advances in kernel methods. chapter Making large-scale support vector machine learning practical, pages 169–184. MIT Press.
Anders Jonsson and Andrew Barto. 2005. A causal approach to hierarchical decomposition of factored mdps. In Advances in Neural Information Processing Systems, 13:1054–1060, page 22. Press.
Marián Lekavý and Pavol Návrat. 2007. Expressivity of strips-like and htn-like planning. Lecture Notes in Artificial Intelligence, 4496:121–130.
Percy Liang, Michael I. Jordan, and Dan Klein. 2009. Learning semantic correspondences with less supervision. In Proceedings of ACL, pages 91–99.
Neville Mehta, Soumya Ray, Prasad Tadepalli, and Thomas Dietterich. 2008. Automatic discovery and transfer of maxq hierarchies. In Proceedings of the 25th international conference on Machine learning, ICML ’08, pages 648–655.
Raymond J. Mooney. 2008a. Learning language from its perceptual context. In Proceedings of ECML/PKDD.
Raymond J. Mooney. 2008b. Learning to connect language and perception. In Proceedings of AAAI, pages 1598–1601.
A. Newell, J.C. Shaw, and H.A. Simon. 1959. The processes of creative thinking. Paper P-1320. Rand Corporation.
James Timothy Oates. 2001. Grounding knowledge in sensors: Unsupervised learning for language and planning. Ph.D. thesis, University of Massachusetts Amherst.
Avirup Sil and Alexander Yates. 2011. Extracting STRIPS representations of actions and events. In Recent Advances in Natural Language Learning (RANLP).
Avirup Sil, Fei Huang, and Alexander Yates. 2010. Extracting action and event semantics from web text. In AAAI 2010 Fall Symposium on Commonsense Knowledge (CSK).
Jeffrey Mark Siskind. 2001. Grounding the lexical semantics of verbs in visual perception using force dynamics and event logic. Journal of Artificial Intelligence Research, 15:31–90.
Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. The MIT Press.
Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. 2000. Policy gradient methods for reinforcement learning with function approximation. In Advances in NIPS, pages 1057–1063.
Adam Vogel and Daniel Jurafsky. 2010. Learning to follow navigational directions. In Proceedings of the ACL, pages 806–814.
Ronald J. Williams. 1992. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8.
Alicia P. Wolfe and Andrew G. Barto. 2005. Identifying useful subgoals in reinforcement learning by local graph partitioning. In Proceedings of the Twenty-Second International Conference on Machine Learning, pages 816–823.
Chen Yu and Dana H. Ballard. 2004. On the integration of grounding language and learning objects. In Proceedings of AAAI, pages 488–493.