
132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search


Author: Yue Zhang ; Stephen Clark

Abstract: Machine-produced text often lacks grammaticality and fluency. This paper studies grammaticality improvement using a syntax-based algorithm based on CCG. The goal of the search problem is to find an optimal parse tree among all that can be constructed through selection and ordering of the input words. The search problem, which is significantly harder than parsing, is solved by guided learning for best-first search. In a standard word ordering task, our system gives a BLEU score of 40.1, higher than the previous result of 33.7 achieved by a dependency-based system.

reference text

Srinivas Bangalore, Owen Rambow, and Steve Whittaker. 2000. Evaluation metrics for generation. In Proceedings of the First International Natural Language Generation Conference (INLG2000), Mitzpe, pages 1–8.

Graeme Blackwood, Adrià de Gispert, and William Byrne. 2010. Fluency constraints for minimum Bayes-risk decoding of statistical machine translation lattices. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 71–79, Beijing, China, August. Coling 2010 Organizing Committee.

Sharon A. Caraballo and Eugene Charniak. 1998. New figures of merit for best-first probabilistic chart parsing. Comput. Linguist., 24:275–298, June.

Stephen Clark and James R. Curran. 2007. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4):493–552.

Michael Collins. 2002. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of EMNLP, pages 1–8, Philadelphia, USA.

Hal Daumé III and Daniel Marcu. 2005. Learning as search optimization: approximate large margin methods for structured prediction. In ICML, pages 169–176.

Dominic Espinosa, Michael White, and Dennis Mehay. 2008. Hypertagging: Supertagging for surface realization with CCG. In Proceedings of ACL-08: HLT, pages 183–191, Columbus, Ohio, June. Association for Computational Linguistics.

Yoav Goldberg and Michael Elhadad. 2010. An efficient algorithm for easy-first non-directional dependency parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 742–750, Los Angeles, California, June. Association for Computational Linguistics.

Julia Hockenmaier and Mark Steedman. 2007. CCGbank: A corpus of CCG derivations and dependency structures extracted from the Penn Treebank. Computational Linguistics, 33(3):355–396.

Julia Hockenmaier. 2003. Parsing with generative models of predicate-argument structure. In Proceedings of the 41st Meeting of the ACL, pages 359–366, Sapporo, Japan.

Kevin Knight. 2007. Automatic language translation generation help needs badly. In MT Summit XI Workshop on Using Corpora for NLG: Keynote Address.

Philipp Koehn. 2010. Statistical Machine Translation. Cambridge University Press.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA, July. Association for Computational Linguistics.

Stefan Riezler, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. III Maxwell, and Mark Johnson. 2002. Parsing the Wall Street Journal using a lexical-functional grammar and discriminative estimation techniques. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 271–278, Philadelphia, Pennsylvania, USA, July. Association for Computational Linguistics.

F. Rosenblatt. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65:386–408.

Libin Shen and Aravind Joshi. 2008. LTAG dependency parsing with bidirectional incremental construction. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 495–504, Honolulu, Hawaii, October. Association for Computational Linguistics.

Libin Shen, Giorgio Satta, and Aravind Joshi. 2007. Guided learning for bidirectional sequence classification. In Proceedings of ACL, pages 760–767, Prague, Czech Republic, June.

Mark Steedman. 2000. The Syntactic Process. The MIT Press, Cambridge, Mass.

Sebastian Varges and Chris Mellish. 2010. Instance-based natural language generation. Natural Language Engineering, 16(3):309–346.

Stephen Wan, Mark Dras, Robert Dale, and Cécile Paris. 2009. Improving grammaticality in statistical sentence generation: Introducing a dependency spanning tree algorithm with an argument satisfaction model. In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009), pages 852–860, Athens, Greece, March. Association for Computational Linguistics.

Michael White and Rajakrishnan Rajkumar. 2009. Perceptron reranking for CCG realization. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 410–419, Singapore, August. Association for Computational Linguistics.

Michael White. 2004. Reining in CCG chart realization. In Proc. INLG-04, pages 182–191.