acl acl2010 acl2010-69 acl2010-69-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Haitao Mi ; Qun Liu
Abstract: Tree-to-string systems (and their forest-based extensions) have gained steady popularity thanks to their simplicity and efficiency, but there is a major limitation: they are unable to guarantee the grammaticality of the output, which is explicitly modeled in string-to-tree systems via target-side syntax. We thus propose to combine the advantages of both, and present a novel constituency-to-dependency translation model, which uses constituency forests on the source side to direct the translation, and dependency trees on the target side (as a language model) to ensure grammaticality. Medium-scale experiments show an absolute and statistically significant improvement of +0.7 BLEU points over a state-of-the-art forest-based tree-to-string system even with fewer rules. This is also the first time that a tree-to-tree model can surpass tree-to-string counterparts.
Sylvie Billot and Bernard Lang. 1989. The structure of shared forests in ambiguous parsing. In Proceedings of ACL ’89, pages 143–151.
Eugene Charniak. 2000. A maximum-entropy inspired parser. In Proceedings of NAACL, pages 132–139.
David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of ACL, pages 263–270, Ann Arbor, Michigan, June.
David Chiang. 2007. Hierarchical phrase-based translation. Comput. Linguist., 33(2):201–228.
Michael Collins, Philipp Koehn, and Ivona Kucerova. 2005. Clause restructuring for statistical machine translation. In Proceedings of ACL, pages 531–540.
Yuan Ding and Martha Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In Proceedings of ACL, pages 541–548, June.
Heidi J. Fox. 2002. Phrasal cohesion and statistical machine translation. In Proceedings of EMNLP 2002.
Michel Galley, Mark Hopkins, Kevin Knight, and Daniel Marcu. 2004. What’s in a translation rule? In Proceedings of HLT/NAACL.
Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of COLING-ACL, pages 961–968, July.
Peter Hellwig. 2006. Parsing with Dependency Grammars, volume II. An International Handbook of Contemporary Research.
Liang Huang and David Chiang. 2005. Better k-best parsing. In Proceedings of IWPT.
Liang Huang and David Chiang. 2007. Forest rescoring: Faster decoding with integrated language models. In Proceedings of ACL, pages 144–151, June.
Liang Huang, Kevin Knight, and Aravind Joshi. 2006. Statistical syntax-directed translation with extended domain of locality. In Proceedings of AMTA.
Liang Huang, Hao Zhang, Daniel Gildea, and Kevin Knight. 2009. Binarization of synchronous context-free grammars. Comput. Linguist.
Liang Huang. 2008. Forest reranking: Discriminative parsing with non-local features. In Proceedings of ACL.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of HLT-NAACL, pages 127–133, Edmonton, Canada, May.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of ACL, pages 177–180, June.
Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-string alignment template for statistical machine translation. In Proceedings of COLING-ACL, pages 609–616, Sydney, Australia, July.
Yang Liu, Yun Huang, Qun Liu, and Shouxun Lin. 2007. Forest-to-string statistical translation rules. In Proceedings of ACL, pages 704–711, June.
Yang Liu, Yajuan Lü, and Qun Liu. 2009. Improving tree-to-tree translation with packed forests. In Proceedings of ACL/IJCNLP, August.
David M. Magerman. 1995. Statistical decision-tree models for parsing. In Proceedings of ACL, pages 276–283, June.
Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight. 2006. SPMT: Statistical machine translation with syntactified target language phrases. In Proceedings of EMNLP, pages 44–52, July.
Haitao Mi and Liang Huang. 2008. Forest-based translation rule extraction. In Proceedings of EMNLP 2008, pages 206–214, Honolulu, Hawaii, October.
Haitao Mi, Liang Huang, and Qun Liu. 2008. Forest-based translation. In Proceedings of ACL-08: HLT, pages 192–199, Columbus, Ohio, June.
Franz J. Och and Hermann Ney. 2000. Improved statistical alignment models. In Proceedings of ACL, pages 440–447.
Franz J. Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of ACL, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of ACL, pages 311–318, Philadelphia, USA, July.
Chris Quirk, Arul Menezes, and Colin Cherry. 2005. Dependency treelet translation: Syntactically informed phrasal SMT. In Proceedings of ACL, pages 271–279, June.
Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL-08: HLT, June.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proceedings of ICSLP, volume 30, pages 901–904.
Deyi Xiong, Shuanglong Li, Qun Liu, and Shouxun Lin. 2005. Parsing the Penn Chinese Treebank with Semantic Knowledge. In Proceedings of IJCNLP 2005, pages 70–81.
Deyi Xiong, Qun Liu, and Shouxun Lin. 2007. A dependency treelet string correspondence model for statistical machine translation. In Proceedings of SMT, pages 40–47.
Hao Zhang, Liang Huang, Daniel Gildea, and Kevin Knight. 2006. Synchronous binarization for machine translation. In Proceedings of HLT-NAACL.
Min Zhang, Hongfei Jiang, Aiti Aw, Jun Sun, Sheng Li, and Chew Lim Tan. 2007. A tree-to-tree alignment-based model for statistical machine translation. In Proceedings of MT-Summit.
Hui Zhang, Min Zhang, Haizhou Li, Aiti Aw, and Chew Lim Tan. 2009. Forest-based tree sequence to string translation model. In Proceedings of ACL/IJCNLP 2009.