emnlp emnlp2013 emnlp2013-106 emnlp2013-106-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Ioannis Konstas ; Mirella Lapata
Abstract: In a language generation system, a content planner selects which elements must be included in the output text and the ordering between them. Recent empirical approaches perform content selection without any ordering and thus have no means to ensure that the output is coherent. In this paper we focus on the problem of generating text from a database and present a trainable end-to-end generation system that includes both content selection and ordering. Content plans are represented intuitively by a set of grammar rules that operate on the document level and are acquired automatically from training data. We develop two approaches: the first one is inspired by Rhetorical Structure Theory and represents the document as a tree of discourse relations between database records; the second one requires little linguistic sophistication and uses tree structures to represent global patterns of database record sequences within a document. Experimental evaluation on two domains yields considerable improvements over the state of the art for both approaches.
Gabor Angeli, Percy Liang, and Dan Klein. 2010. A simple domain-independent probabilistic approach to generation. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 502–512, Cambridge, MA.
Regina Barzilay and Mirella Lapata. 2005. Collective content selection for concept-to-text generation. In Proceedings of Human Language Technology and Empirical Methods in Natural Language Processing, pages 331–338, Vancouver, British Columbia.
Anja Belz. 2008. Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models. Natural Language Engineering, 14(4):431–455.
S.R.K. Branavan, Harr Chen, Luke Zettlemoyer, and Regina Barzilay. 2009. Reinforcement learning for mapping instructions to actions. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 82–90, Suntec, Singapore.
L. Carlson and D. Marcu. 2001. Discourse tagging reference manual. Technical report, Univ. of Southern California / Information Sciences Institute.
Lynn Carlson, Daniel Marcu, and Mary Ellen Okurowski. 2001. Building a discourse-tagged corpus in the framework of rhetorical structure theory. In Proceedings of the Second SIGdial Workshop on Discourse and Dialogue - Volume 16, SIGDIAL ’01, pages 1–10, Stroudsburg, PA, USA. Association for Computational Linguistics.
David L. Chen and Raymond J. Mooney. 2008. Learning to sportscast: A test of grounded language acquisition. In Proceedings of International Conference on Machine Learning, pages 128–135, Helsinki, Finland.
Trevor Cohn, Phil Blunsom, and Sharon Goldwater. 2010. Inducing tree-substitution grammars. Journal of Machine Learning Research, 11(November):3053–3096.
M. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.
Robert Dale. 1988. Generating referring expressions in a domain of objects and processes. Ph.D. thesis, University of Edinburgh.
A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38.
Pablo A. Duboue and Kathleen R. McKeown. 2001. Empirically estimating order constraints for content planning in generation. In Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, pages 172–179.
Pablo A. Duboue and Kathleen R. McKeown. 2002. Content planner construction via evolutionary algorithms and a corpus-based fitness function. In Proceedings of International Natural Language Generation, pages 89–96, Ramapo Mountains, NY.
Vanessa Wei Feng and Graeme Hirst. 2012. Text-level discourse parsing with rich linguistic features. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 60–68, Jeju Island, Korea.
Eduard Hovy. 1993. Automated discourse generation using discourse structure relations. Artificial Intelligence, 63:341–385.
Blake Howald, Ravikumar Kondadadi, and Frank Schilder. 2013. Domain adaptable semantic clustering in statistical NLG. In Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers, pages 143–154, Potsdam, Germany, March. Association for Computational Linguistics.
Mark Johnson. 1998. PCFG models of linguistic tree representations. Computational Linguistics, 24(4):613–632, December.
Nikiforos Karamanis. 2003. Entity Coherence for Descriptive Text Structuring. Ph.D. thesis, University of Edinburgh.
Tadao Kasami. 1965. An efficient recognition and syntax analysis algorithm for context-free languages. Technical Report AFCRL-65-758, Air Force Cambridge Research Lab, Bedford, MA.
Rodger Kibble and Richard Power. 2004. Optimising referential coherence in text generation. Computational Linguistics, 30(4):401–416.
Joohyun Kim and Raymond Mooney. 2010. Generative alignment and semantic parsing for learning from ambiguous supervision. In Proceedings of the 23rd Conference on Computational Linguistics, pages 543–551, Beijing, China.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, pages 423–430. Association for Computational Linguistics, Morristown, NJ, USA.
Ioannis Konstas and Mirella Lapata. 2012. Unsupervised concept-to-text generation with hypergraphs. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 752–761, Montréal, Canada.
Percy Liang, Michael Jordan, and Dan Klein. 2009. Learning semantic correspondences with less supervision. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 91–99, Suntec, Singapore.
William C. Mann and Sandra A. Thompson. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3):243–281.
William C. Mann and Sandra A. Thompson. 1988. Rhetorical structure theory. Text, 8(3):243–281.
Chris Mellish, Alisdair Knott, Jon Oberlander, and Mick O’Donnell. 1998. Experiments using stochastic search for text planning. In Proceedings of International Natural Language Generation, pages 98–107, New Brunswick, NJ.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania.
Ehud Reiter and Robert Dale. 2000. Building natural language generation systems. Cambridge University Press, New York, NY.
Ehud Reiter, Somayajulu Sripada, Jim Hunter, and Ian Davy. 2005. Choosing words in computer-generated weather forecasts. Artificial Intelligence, 167:137–169.
Frank Schilder, Blake Howald, and Ravi Kondadadi. 2013. GenNext: A consolidated domain adaptable NLG system. In Proceedings of the 14th European Workshop on Natural Language Generation, pages 178–182, Sofia, Bulgaria, August. Association for Computational Linguistics.
Donia Scott and Clarisse Sieckenius de Souza. 1990. Getting the message across in RST-based text generation. In Robert Dale, Chris Mellish, and Michael Zock, editors, Current Research in Natural Language Generation, pages 47–73. Academic Press, New York.
Amanda Stent, Rashmi Prasad, and Marilyn Walker. 2004. Trainable sentence planning for complex information presentation in spoken dialog systems. In Proceedings of Association for Computational Linguistics, pages 79–86, Barcelona, Spain.
Sandra Williams and Richard Power. 2008. Deriving rhetorical complexity data from the RST-DT corpus. In Proceedings of the Sixth International Language Resources and Evaluation (LREC’08), May.
Yuk Wah Wong and Raymond Mooney. 2007. Generation by inverting a semantic parser that uses statistical machine translation. In Proceedings of the Human Language Technology and the Conference of the North American Chapter of the Association for Computational Linguistics, pages 172–179, Rochester, NY.
Daniel H. Younger. 1967. Recognition and parsing for context-free languages in time n^3. Information and Control, 10(2):189–208.