
220 acl-2011-Minimum Bayes-risk System Combination


Source: pdf

Author: Jesús González-Rubio; Alfons Juan; Francisco Casacuberta

Abstract: We present minimum Bayes-risk system combination, a method that integrates consensus decoding and system combination into a unified multi-system minimum Bayes-risk (MBR) technique. Unlike other MBR methods, which re-rank the translations of a single SMT system, MBR system combination uses the MBR decision rule and a linear combination of the component systems' probability distributions to search for the minimum-risk translation among all the finite-length strings over the output vocabulary. We introduce expected BLEU, an approximation to the BLEU score that makes it possible to apply MBR efficiently under these conditions. MBR system combination is a general method that is independent of specific SMT models, enabling us to combine systems with heterogeneous structure. Experiments show that our approach brings significant improvements over single-system-based MBR decoding and achieves results comparable to those of different state-of-the-art system combination methods.
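The decision rule described in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it simplifies in two labeled ways: the search is restricted to the pooled candidate translations rather than all finite-length strings, and a toy n-gram overlap gain stands in for the paper's expected BLEU. The function names (`combine`, `overlap_gain`, `mbr_select`), the system weights, and the example sentences are all hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_gain(hyp, ref, max_n=2):
    """Toy gain: mean clipped n-gram precision of hyp against ref.

    Assumption: a simple stand-in for a BLEU-like gain, NOT expected BLEU.
    """
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp_tokens, n), ngrams(ref_tokens, n)
        total = sum(hyp_counts.values())
        if total == 0:
            precisions.append(0.0)
            continue
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        precisions.append(matched / total)
    return sum(precisions) / len(precisions)


def combine(systems, weights):
    """Linearly combine per-system distributions over translations."""
    mixture = Counter()
    for dist, weight in zip(systems, weights):
        for sentence, prob in dist.items():
            mixture[sentence] += weight * prob
    return dict(mixture)


def mbr_select(systems, weights, gain=overlap_gain):
    """Pick the pooled candidate with the highest expected gain.

    Maximising expected gain is equivalent to minimising expected loss
    (risk) when loss = 1 - gain.
    """
    mixture = combine(systems, weights)

    def expected_gain(candidate):
        return sum(p * gain(candidate, other) for other, p in mixture.items())

    return max(mixture, key=expected_gain)


# Hypothetical n-best lists from two systems, with made-up probabilities.
systems = [
    {"the cat sat": 0.6, "a cat sat": 0.4},
    {"the cat sits": 0.7, "the cat sat": 0.3},
]
weights = [0.5, 0.5]
print(mbr_select(systems, weights))
```

In this toy setup the weights are fixed by hand; in a real system they would have to be tuned on held-out data.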


reference text

Peter J. Bickel and Kjell A. Doksum. 1977. Mathematical statistics: basic ideas and selected topics. Holden-Day, San Francisco.
Chris Callison-Burch, Philipp Koehn, Christof Monz, Kay Peterson, Mark Przybocki, and Omar F. Zaidan. 2010. Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 17–53, Morristown, NJ, USA. Association for Computational Linguistics.
John DeNero, David Chiang, and Kevin Knight. 2009. Fast consensus decoding over translation forests. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 567–575, Morristown, NJ, USA. Association for Computational Linguistics.
John DeNero, Shankar Kumar, Ciprian Chelba, and Franz Och. 2010. Model combination for machine translation. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 975–983, Morristown, NJ, USA. Association for Computational Linguistics.
Nan Duan, Mu Li, Dongdong Zhang, and Ming Zhou. 2010. Mixture model-based minimum Bayes risk decoding using multiple machine translation systems. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 313–321, Beijing, China, August. Coling 2010 Organizing Committee.
Nicola Ehling, Richard Zens, and Hermann Ney. 2007. Minimum Bayes risk decoding for BLEU. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, pages 101–104, Morristown, NJ, USA. Association for Computational Linguistics.
Robert Frederking and Sergei Nirenburg. 1994. Three heads are better than one. In Proceedings of the Fourth Conference on Applied Natural Language Processing, pages 95–100, Morristown, NJ, USA. Association for Computational Linguistics.
K.S. Fu. 1982. Syntactic Pattern Recognition and Applications. Prentice Hall.
Vaibhava Goel and William J. Byrne. 2000. Minimum Bayes-risk automatic speech recognition. Computer Speech & Language, 14(2):115–135.
Jesús González-Rubio and Francisco Casacuberta. 2010. On the use of median string for multi-source translation. In Proceedings of the International Conference on Pattern Recognition (ICPR2010), pages 4328–4331.
Xiaodong He and Kristina Toutanova. 2009. Joint optimization for machine translation system combination. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3, pages 1202–1211, Morristown, NJ, USA. Association for Computational Linguistics.
Xiaodong He, Mei Yang, Jianfeng Gao, Patrick Nguyen, and Robert Moore. 2008. Indirect-HMM-based hypothesis alignment for combining outputs from machine translation systems. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 98–107, Morristown, NJ, USA. Association for Computational Linguistics.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, pages 177–180, Morristown, NJ, USA. Association for Computational Linguistics.
Shankar Kumar and William J. Byrne. 2004. Minimum Bayes-risk decoding for statistical machine translation. In HLT-NAACL, pages 169–176.
Shankar Kumar, Wolfgang Macherey, Chris Dyer, and Franz Och. 2009. Efficient minimum error rate training and minimum Bayes-risk decoding for translation hypergraphs and lattices. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, pages 163–171, Morristown, NJ, USA. Association for Computational Linguistics.
Gregor Leusch, Aurélien Max, Josep Maria Crego, and Hermann Ney. 2010. Multi-pivot translation by system combination. In International Workshop on Spoken Language Translation, Paris, France, December.
Zhifei Li, Jason Eisner, and Sanjeev Khudanpur. 2009. Variational decoding for statistical machine translation. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 593–601, Morristown, NJ, USA. Association for Computational Linguistics.
C. D. Martínez, A. Juan, and F. Casacuberta. 2000. Use of Median String for Classification. In Proceedings of the 15th International Conference on Pattern Recognition, volume 2, pages 907–910, Barcelona (Spain), September.
John A. Nelder and Roger Mead. 1965. A Simplex Method for Function Minimization. The Computer Journal, 7(4):308–313, January.
Franz Josef Och and Hermann Ney. 2001. Statistical multi-source translation. In Machine Translation Summit, pages 253–258.
Franz J. Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, pages 160–167, Morristown, NJ, USA. Association for Computational Linguistics.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 311–318, Morristown, NJ, USA. Association for Computational Linguistics.
Antti-Veikko Rosti, Necip Fazil Ayan, Bing Xiang, Spyros Matsoukas, Richard Schwartz, and Bonnie Dorr. 2007. Combining outputs from multiple machine translation systems. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 228–235, Rochester, New York, April. Association for Computational Linguistics.
Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and Ralph Weischedel. 2006. A study of translation error rate with targeted human annotation. In Proceedings of the Association for Machine Translation in the Americas.
Roy W. Tromble, Shankar Kumar, Franz Och, and Wolfgang Macherey. 2008. Lattice minimum Bayes-risk decoding for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 620–629, Morristown, NJ, USA. Association for Computational Linguistics.
Ying Zhang and Stephan Vogel. 2004. Measuring confidence intervals for the machine translation evaluation metrics. In Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI-2004), pages 4–6.