
266 acl-2011-Reordering with Source Language Collocations


Source: pdf

Author: Zhanyi Liu ; Haifeng Wang ; Hua Wu ; Ting Liu ; Sheng Li

Abstract: This paper proposes a novel reordering model for statistical machine translation (SMT) that models the translation orders of source language collocations. The model is learned from a word-aligned bilingual corpus in which the collocated words in the source sentences are automatically detected. During decoding, the model softly constrains the translation orders of the source language collocations, and thereby the translation orders of the source phrases that contain these collocated words. The experimental results show that the proposed method significantly improves translation quality, achieving absolute improvements of 1.1 to 1.4 BLEU points over the baseline methods.
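The abstract does not give the model's formulation, so the following is only a minimal sketch of one way the idea could be realized, not the authors' method. It assumes that collocated source word pairs have already been detected, that each pair is classified as "straight" or "inverted" by comparing the target positions of its two words, and that probabilities are plain relative frequencies. The function names, the two orientation labels, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict

def orientation(i, j, alignment):
    """Classify the translation order of source positions i < j.

    `alignment` maps a source position to one target position (an assumption;
    real alignments can be one-to-many). Returns None if a word is unaligned.
    """
    if i not in alignment or j not in alignment:
        return None
    return "straight" if alignment[i] <= alignment[j] else "inverted"

def estimate_orientation_model(corpus):
    """Relative-frequency estimate of P(orientation | collocated word pair).

    `corpus` is a list of (source_words, alignment, collocations) triples,
    where `collocations` lists pairs of source positions (hypothetical format).
    """
    counts = defaultdict(Counter)
    for words, alignment, collocations in corpus:
        for i, j in collocations:
            lo, hi = sorted((i, j))
            o = orientation(lo, hi, alignment)
            if o is not None:  # skip pairs with an unaligned word
                counts[(words[lo], words[hi])][o] += 1
    model = {}
    for pair, c in counts.items():
        total = sum(c.values())
        model[pair] = {o: c[o] / total for o in ("straight", "inverted")}
    return model

# Toy data (invented for illustration): the pair ("b", "c") is seen three times.
toy_corpus = [
    (["a", "b", "c"], {0: 0, 1: 2, 2: 1}, [(1, 2)]),  # inverted
    (["b", "c"], {0: 0, 1: 1}, [(0, 1)]),             # straight
    (["b", "x", "c"], {0: 3, 2: 1}, [(0, 2)]),        # inverted
]
model = estimate_orientation_model(toy_corpus)
```

In a decoder, probabilities of this kind could be added as a feature score so that a hypothesis is rewarded or penalized rather than ruled out, which is one reading of the "soft" constraint the abstract mentions.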


reference text

Yaser Al-Onaizan and Kishore Papineni. 2006. Distortion Models for Statistical Machine Translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the ACL, pp. 529-536.
Adam L. Berger, Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, Andrew S. Kehler, and Robert L. Mercer. 1996. Language Translation Apparatus and Method of Using Context-Based Translation Models. United States Patent, Patent Number 5510981, April.
Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2): 263-311.
David Chiang. 2005. A Hierarchical Phrase-based Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 263-270.
Niyu Ge. 2010. A Direct Syntax-Driven Reordering Model for Phrase-Based Machine Translation. In Proceedings of Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL, pp. 849-857.
Philipp Koehn. 2004. Statistical Significance Tests for Machine Translation Evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, pp. 388-395.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association of Computational Linguistics, pp. 127-133.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the ACL, Poster and Demonstration Sessions, pp. 177-180.
Philipp Koehn, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne, and David Talbot. 2005. Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation. In Proceedings of International Workshop on Spoken Language Translation.
Dekang Lin. 1998. Extracting Collocations from Text Corpora. In Proceedings of the 1st Workshop on Computational Terminology, pp. 57-63.
Zhanyi Liu, Haifeng Wang, Hua Wu, and Sheng Li. 2009. Collocation Extraction Using Monolingual Word Alignment Method. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pp. 487-495.
Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. Cambridge, MA; London, U.K.: Bradford Book & MIT Press.
Yuval Marton and Philip Resnik. 2008. Soft Syntactic Constraints for Hierarchical Phrased-based Translation. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 1003-1011.
Kathleen R. McKeown and Dragomir R. Radev. 2000. Collocations. In Robert Dale, Hermann Moisl, and Harold Somers (Eds.), A Handbook of Natural Language Processing, pp. 507-523.
Franz Josef Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pp. 160-167.
Franz Josef Och and Hermann Ney. 2003. A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics, 29(1): 19-51.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pp. 311-318.
Andreas Stolcke. 2002. SRILM - An Extensible Language Modeling Toolkit. In Proceedings for the International Conference on Spoken Language Processing, pp. 901-904.
Christoph Tillmann. 2004. A Unigram Orientation Model for Statistical Machine Translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association of Computational Linguistics, pp. 101-104.
Christoph Tillmann and Tong Zhang. 2005. A Localized Prediction Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics, pp. 557-564.
Karthik Visweswariah, Jiri Navratil, Jeffrey Sorensen, Vijil Chenthamarakshan, and Nanda Kambhatla. 2010. Syntax Based Reordering with Automatically Derived Rules for Improved Statistical Machine Translation. In Proceedings of the 23rd International Conference on Computational Linguistics, pp. 1119-1127.
Dekai Wu. 1997. Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora. Computational Linguistics, 23(3): 377-403.
Deyi Xiong, Qun Liu, and Shouxun Lin. 2006. Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pp. 521-528.
Richard Zens and Hermann Ney. 2003. A Comparative Study on Reordering Constraints in Statistical Machine Translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pp. 192-202.
Richard Zens and Hermann Ney. 2006. Discriminative Reordering Models for Statistical Machine Translation. In Proceedings of the Workshop on Statistical Machine Translation, pp. 55-63.
Dongdong Zhang, Mu Li, Chi-Ho Li, and Ming Zhou. 2007. Phrase Reordering Model Integrating Syntactic Knowledge for SMT. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pp. 533-540.