42 emnlp-2012-Entropy-based Pruning for Phrase-based Machine Translation

Source: pdf

Author: Wang Ling ; Joao Graca ; Isabel Trancoso ; Alan Black

Abstract: Phrase-based machine translation models have been shown to yield better translations than word-based models, since phrase pairs encode the contextual information that is needed for a more accurate translation. However, many phrase pairs do not encode any relevant context, which means that the translation event encoded in that phrase pair is led by smaller translation events that are independent of each other and can be found in smaller phrase pairs, with little or no loss in translation accuracy. In this work, we propose a relative entropy model for translation models that measures how likely a phrase pair encodes a translation event that is derivable using smaller translation events with similar probabilities. This model is then applied to phrase table pruning. Tests show that a considerable number of phrase pairs can be excluded without much impact on translation quality. In fact, we show that better translations can be obtained using our pruned models, due to the compression of the search space during decoding.

reference text

Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: parameter estimation. Comput. Linguist., 19:263–311, June.

George Foster, Roland Kuhn, and Howard Johnson. 2006. Phrasetable smoothing for statistical machine translation. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP ’06, pages 53–61, Stroudsburg, PA, USA. Association for Computational Linguistics.

J Howard Johnson and Joel Martin. 2007. Improving translation quality by discarding most of the phrasetable. In Proceedings of EMNLP-CoNLL’07, pages 967–975.

Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, NAACL ’03, pages 48–54, Morristown, NJ, USA. Association for Computational Linguistics.

Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Richard Zens, Alexandra Constantin, Marcello Federico, Nicola Bertoldi, Chris Dyer, Brooke Cowan, Wade Shen, Christine Moran, and Ondrej Bojar. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180, Prague, Czech Republic, June. Association for Computational Linguistics.

Philipp Koehn. 2005. Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the tenth Machine Translation Summit, pages 79–86, Phuket, Thailand. AAMT.

Wang Ling, Tiago Luís, João Graça, Luísa Coheur, and Isabel Trancoso. 2010. Towards a general and extensible phrase-extraction algorithm. In IWSLT ’10: International Workshop on Spoken Language Translation, pages 313–320, Paris, France.

Matthias Eck, Stephan Vogel, and Alex Waibel. 2007. Estimating phrase pair relevance for translation model pruning. MTSummit XI.

Robert C. Moore and Chris Quirk. 2009. Less is more: significance-based n-gram selection for smaller, better language models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2, EMNLP ’09, pages 746–755, Stroudsburg, PA, USA. Association for Computational Linguistics.

Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, ACL ’03, pages 160–167, Stroudsburg, PA, USA. Association for Computational Linguistics.

Michael Paul, Marcello Federico, and Sebastian Stüker. 2010. Overview of the IWSLT 2010 evaluation campaign. In IWSLT ’10: International Workshop on Spoken Language Translation, pages 3–27.

Lane Schwartz. 2008. Multi-source translation methods. In Proceedings of AMTA, pages 279–288.

Kristie Seymore and Ronald Rosenfeld. 1996. Scalable backoff language models. In Proceedings of ICSLP, pages 232–235.

Andreas Stolcke. 1998. Entropy-based pruning of backoff language models. In Proc. DARPA Broadcast News Transcription and Understanding Workshop, pages 270–274.

Nadi Tomeh, Nicola Cancedda, and Marc Dymetman. 2009. Complexity-based phrase-table filtering for statistical machine translation. MTSummit XII, Aug.

João V. Graça, Kuzman Ganchev, and Ben Taskar. 2010. Learning Tractable Word Alignment Models with Complex Constraints. Comput. Linguist., 36:481–504.

S. Vogel, H. Ney, and C. Tillmann. 1996. HMM-based word alignment in statistical translation. In Proceedings of the 16th conference on Computational linguistics-Volume 2, pages 836–841. Association for Computational Linguistics.