
100 acl-2011-Discriminative Feature-Tied Mixture Modeling for Statistical Machine Translation

Source: pdf

Author: Bing Xiang ; Abraham Ittycheriah

Abstract: In this paper we present a novel discriminative mixture model for statistical machine translation (SMT). We model the feature space with a log-linear combination of multiple mixture components. Each component contains a large set of features trained in a maximum-entropy framework. All features within the same mixture component are tied and share the same mixture weight, and the mixture weights are trained discriminatively to maximize translation performance. This approach aims to bridge the gap between maximum-likelihood training and discriminative training for SMT. We show that the feature space can be partitioned in a variety of ways, for example by feature type, word alignment, or domain, to suit different applications. The proposed approach significantly improves translation performance on a large-scale Arabic-to-English MT task.
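The abstract does not give the model's equations, so the following is only a minimal sketch of what a feature-tied log-linear combination could look like. It assumes that each component's sub-score is a weighted sum of its feature values and that the component sub-scores are then scaled by one shared mixture weight each. The function name, the feature names, and all numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Mapping


def tied_mixture_score(
    feature_values: Mapping[str, float],
    feature_weights: Mapping[str, float],
    component_of: Mapping[str, str],
    mixture_weights: Mapping[str, float],
) -> float:
    """Log-linear score where each feature inherits its component's mixture weight.

    Assumed form (not confirmed by the source):
        score = sum_k w_k * sum_{i in component k} lambda_i * f_i
    """
    # Accumulate lambda_i * f_i separately inside each mixture component.
    component_scores: Dict[str, float] = defaultdict(float)
    for name, value in feature_values.items():
        component_scores[component_of[name]] += feature_weights.get(name, 0.0) * value
    # Scale each component's sub-score by its single shared (tied) mixture weight.
    return sum(mixture_weights[k] * s for k, s in component_scores.items())


# Hypothetical toy example: the feature space is partitioned by feature type.
feature_values = {"lex:a": 1.0, "lex:b": 2.0, "align:c": 1.0}
feature_weights = {"lex:a": 0.5, "lex:b": 0.25, "align:c": 2.0}
component_of = {"lex:a": "lexical", "lex:b": "lexical", "align:c": "alignment"}
mixture_weights = {"lexical": 2.0, "alignment": 0.5}

score = tied_mixture_score(feature_values, feature_weights, component_of, mixture_weights)
print(score)
```

Under this assumed form, only the handful of mixture weights would need discriminative tuning, while the many per-feature weights could stay as estimated by maximum-entropy training. Changing `component_of` is enough to switch between partitions by feature type, word alignment, or domain.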

reference text

Phil Blunsom, Trevor Cohn, and Miles Osborne. 2008. A discriminative latent variable model for statistical machine translation. In Proceedings of ACL-08:HLT.
David Chiang, Kevin Knight, and Wei Wang. 2009. 11,001 new features for statistical machine translation. In Proceedings of NAACL-HLT.
Stephen Della Pietra, Vincent Della Pietra, and John Lafferty. 1997. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence.
George Foster, Cyril Goutte, and Roland Kuhn. 2010. Discriminative instance weighting for domain adaptation in statistical machine translation. In Proceedings of EMNLP.
Abraham Ittycheriah and Salim Roukos. 2005. A maximum entropy word aligner for Arabic-English machine translation. In Proceedings of HLT/EMNLP, pages 89–96, October.
Abraham Ittycheriah and Salim Roukos. 2007. Direct translation model 2. In Proceedings of HLT/NAACL, pages 57–64, April.
Philipp Koehn, Franz Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of NAACL/HLT.
Percy Liang, Alexandre Bouchard-Côté, Dan Klein, and Ben Taskar. 2006. An end-to-end discriminative approach to machine translation. In Proceedings of ACL/COLING, pages 761–768, Sydney, Australia.
Franz Josef Och and Hermann Ney. 2000. Improved statistical alignment models. In Proceedings of ACL, pages 440–447, Hong Kong, China, October.
Franz Josef Och and Hermann Ney. 2002. Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of ACL, pages 295–302, Philadelphia, PA, July.
Franz Josef Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of ACL, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Weijing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of ACL, pages 311–318.
Christoph Tillmann and Tong Zhang. 2006. A discriminative global training algorithm for statistical MT. In Proceedings of ACL/COLING, pages 721–728, Sydney, Australia.
Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based word alignment in statistical translation. In Proceedings of COLING, pages 836–841.
Ying Zhang and Stephan Vogel. 2004. Measuring confidence intervals for the machine translation evaluation metrics. In Proceedings of The 10th International Conference on Theoretical and Methodological Issues in Machine Translation.
Bing Zhao and Shengyuan Chen. 2009. A simplex Armijo downhill algorithm for optimizing statistical machine translation decoding parameters. In Proceedings of NAACL-HLT.