
118 acl-2012-Improving the IBM Alignment Models Using Variational Bayes

Source: pdf

Author: Darcey Riley ; Daniel Gildea

Abstract: Bayesian approaches have been shown to reduce the amount of overfitting that occurs when running the EM algorithm, by placing prior probabilities on the model parameters. We apply one such Bayesian technique, variational Bayes, to the IBM models of word alignment for statistical machine translation. We show that using variational Bayes improves the performance of the widely used GIZA++ software, as well as improving the overall performance of the Moses machine translation system in terms of BLEU score.
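
The abstract does not spell out the update rule, so the following is only a minimal sketch of how variational Bayes is commonly applied to IBM Model 1. It assumes a symmetric Dirichlet prior with hyperparameter alpha over each translation distribution t(f|e) and the standard mean-field update, in which the normalized expected counts of the EM M-step are replaced by exp(digamma(count + alpha)) / exp(digamma(total + alpha * |V|)). The toy bitext, the function names, and the value of alpha are hypothetical and chosen for illustration; none of this is GIZA++ code.

```python
import math
from collections import defaultdict

# Hypothetical toy bitext: (source words e, target words f) pairs.
BITEXT = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]


def digamma(x):
    """Digamma function for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series


def expected_counts(bitext, t):
    """E-step of Model 1: fractional counts c(f, e) under the current table t."""
    counts = defaultdict(float)
    for src, tgt in bitext:
        for f in tgt:
            z = sum(t[(f, e)] for e in src)
            for e in src:
                counts[(f, e)] += t[(f, e)] / z
    return counts


def m_step(counts, f_vocab_size, alpha=None):
    """alpha=None gives the EM update; otherwise the assumed VB update."""
    totals = defaultdict(float)
    for (f, e), c in counts.items():
        totals[e] += c
    t = {}
    for (f, e), c in counts.items():
        if alpha is None:
            t[(f, e)] = c / totals[e]
        else:
            denom = digamma(totals[e] + alpha * f_vocab_size)
            t[(f, e)] = math.exp(digamma(c + alpha) - denom)
    return t


def train(bitext, iterations=5, alpha=None):
    """Run a few iterations starting from a uniform table over co-occurring pairs."""
    f_vocab = {f for _, tgt in bitext for f in tgt}
    t = {(f, e): 1.0 for src, tgt in bitext for f in tgt for e in src}
    for _ in range(iterations):
        t = m_step(expected_counts(bitext, t), len(f_vocab), alpha)
    return t


if __name__ == "__main__":
    em_table = train(BITEXT)             # plain EM
    vb_table = train(BITEXT, alpha=0.1)  # VB with an arbitrary alpha
    for pair in sorted(em_table):
        print(pair, round(em_table[pair], 4), round(vb_table[pair], 4))
```

Under this update the VB values for a given source word sum to less than one, because exp(digamma(x)) is smaller than x; the E-step renormalizes over each sentence, so the table can still be used to compute alignment posteriors.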

reference text

Matthew J. Beal. 2003. Variational Algorithms for Approximate Bayesian Inference. Ph.D. thesis, University College London.

Phil Blunsom, Trevor Cohn, and Miles Osborne. 2008. Bayesian synchronous grammar induction. In Neural Information Processing Systems (NIPS).

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–311.

David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of ACL-05, pages 263–270, Ann Arbor, MI.

A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–21.

John DeNero, Alexandre Bouchard-Côté, and Dan Klein. 2008. Sampling alignment structure under a Bayesian translation model. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 314–323, Honolulu, Hawaii, October.

Michel Galley, Mark Hopkins, Kevin Knight, and Daniel Marcu. 2004. What’s in a translation rule? In Proceedings of NAACL-04, pages 273–280, Boston.

Mark Johnson. 2007. Why doesn’t EM find good HMM POS-taggers? In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 296–305, Prague, Czech Republic, June. Association for Computational Linguistics.

Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of ACL, Demonstration Session, pages 177–180.

Coskun Mermer and Murat Saraclar. 2011. Bayesian word alignment for statistical machine translation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL-11), pages 182–187.

Robert C. Moore. 2004. Improving IBM word alignment Model 1. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL’04), Main Volume, pages 518–525, Barcelona, Spain, July.

Franz Josef Och and Hermann Ney. 2000. Improved statistical alignment models. In Proceedings of ACL-00, pages 440–447, Hong Kong, October.

Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based word alignment in statistical translation. In COLING-96, pages 836–841.