acl acl2011 acl2011-265 acl2011-265-reference knowledge-graph by maker-knowledge-mining

265 acl-2011-Reordering Modeling using Weighted Alignment Matrices

Source: pdf

Author: Wang Ling ; Tiago Luis ; Joao Graca ; Isabel Trancoso ; Luisa Coheur

Abstract: In most statistical machine translation systems, the phrase/rule extraction algorithm uses alignments in the 1-best form, which might contain spurious alignment points. The usage ofweighted alignment matrices that encode all possible alignments has been shown to generate better phrase tables for phrase-based systems. We propose two algorithms to generate the well known MSD reordering model using weighted alignment matrices. Experiments on the IWSLT 2010 evaluation datasets for two language pairs with different alignment algorithms show that our methods produce more accurate reordering models, as can be shown by an increase over the regular MSD models of 0.4 BLEU points in the BTEC French to English test set, and of 1.5 BLEU points in the DIALOG Chinese to English test set.

reference text

Christopher Dyer, Smaranda Muresan, and Philip Resnik. 2008. Generalizing Word Lattice Translation. Technical Report LAMP-TR-149, University of Maryland, College Park, February. Kuzman Ganchev, Jo˜ ao V. Gra ¸ca, and Ben Taskar. 2008. Better alignments = better translations? In Proceedings of ACL-08: HLT, pages 986–993, Columbus, Ohio, June. Association for Computational Linguistics. 2http://code.google.com/p/geppetto/ 454 Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, NAACL ’03, pages 48–54, Morristown, NJ, USA. Association for Computational Linguistics. Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-burch, Richard Zens, Rwth Aachen, Alexandra Constantin, Marcello Federico, Nicola Bertoldi, Chris Dyer, Brooke Cowan, Wade Shen, Christine Moran, and Ondrej Bojar. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177– 180, Prague, Czech Republic, June. Association for Computational Linguistics. Wang Ling, Tiago Lu´ ıs, Joao Gra ¸ca, Lu´ ısa Coheur, and Isabel Trancoso. 2010. Towards a general and extensible phrase-extraction algorithm. In IWSLT ’10: International Workshop on Spoken Language Translation, pages 3 13–320, Paris, France. Yang Liu, Tian Xia, Xinyan Xiao, and Qun Liu. 2009. Weighted alignment matrices for statistical machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2 - Volume 2, EMNLP ’09, pages 1017–1026, Morristown, NJ, USA. Association for Computational Linguistics. Haitao Mi, Liang Huang, and Qun Liu. 2008. Forestbased translation. In Proceedings of ACL-08: HLT, pages 192–199, Columbus, Ohio, June. Association for Computational Linguistics. Michael Paul, Marcello Federico, and Sebastian St¨ uker. 2010. Overview of the iwslt 2010 evaluation campaign. In IWSLT ’10: International Workshop on Spoken Language Translation, pages 3–27. Jo˜ ao V. Gra ¸ca, Kuzman Ganchev, and Ben Taskar. 2010. Learning Tractable Word Alignment Models with Complex Constraints. Comput. Linguist., 36:481–504. Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. 2009. Wider pipelines: N-best alignments and parses in MT training. David Vilar, Maja Popovic, and Hermann Ney. 2006. Aer: Do we need to ”improve” our alignments? In International Workshop on Spoken Language Translation (IWSLT), pages 205–212.