
146 acl-2011-Goodness: A Method for Measuring Machine Translation Confidence

Source: pdf

Author: Nguyen Bach ; Fei Huang ; Yaser Al-Onaizan

Abstract: State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely to be translated correctly and how confident they are about the whole sentence. We propose a novel framework to predict word-level and sentence-level MT errors with a large number of novel features. Experimental results show that the MT error prediction accuracy is increased from 69.1 to 72.2 in F-score. The Pearson correlation between the proposed confidence measure and the human-targeted translation edit rate (HTER) is 0.6. Improvements of between 0.4 and 0.9 TER reduction are obtained on the n-best list reranking task using the proposed confidence measure. We also present a visualization prototype of MT errors at the word and sentence levels, with the aim of improving post-editor productivity.
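The two sentence-level evaluations mentioned in the abstract (correlation with HTER and n-best list reranking) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pearson` and `rerank`, and all numeric values, are assumptions made up for illustration, and the sketch only shows the standard Pearson formula and a plain sort by score.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rerank(nbest):
    """Sort (hypothesis, score) pairs so the highest-scoring one comes first."""
    return sorted(nbest, key=lambda pair: pair[1], reverse=True)


# Toy values, invented for illustration only (not data from the paper).
toy_scores = [0.9, 0.7, 0.4, 0.2]  # hypothetical sentence-level scores
toy_hter = [0.8, 0.6, 0.5, 0.1]    # hypothetical HTER values
r = pearson(toy_scores, toy_hter)

# Hypothetical 3-best list; the top entry after reranking is the new 1-best.
best_hypothesis = rerank([("hyp a", 0.3), ("hyp b", 0.8), ("hyp c", 0.5)])[0][0]
```

In a setting like the one described, `pearson` would be applied to per-sentence scores and their HTER values, and `rerank` to each sentence's n-best list before TER is measured on the new top hypothesis.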

reference text

Nguyen Bach, Matthias Eck, Paisarn Charoenpornsawat, Thilo Köhler, Sebastian Stüker, ThuyLinh Nguyen, Roger Hsiao, Alex Waibel, Stephan Vogel, Tanja Schultz, and Alan Black. 2007. The CMU TransTac 2007 Eyes-free and Hands-free Two-way Speech-to-Speech Translation System. In Proceedings of the IWSLT’07, Trento, Italy.
Nguyen Bach, Qin Gao, and Stephan Vogel. 2009. Source-side dependency tree reordering models with subtree movements and constraints. In Proceedings of the MTSummit-XII, Ottawa, Canada, August. International Association for Machine Translation.
John Blatz, Erin Fitzgerald, George Foster, Simona Gandrabur, Cyril Goutte, Alex Kulesza, Alberto Sanchis, and Nicola Ueffing. 2004. Confidence estimation for machine translation. In The JHU Workshop Final Report, Baltimore, Maryland, USA, April.
David Chiang, Kevin Knight, and Wei Wang. 2009. 11,001 new features for statistical machine translation. In Proceedings of HLT-ACL, pages 218–226, Boulder, Colorado, June. Association for Computational Linguistics.
Koby Crammer and Yoram Singer. 2003. Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research, 3:951–991.
Niyu Ge. 2004. Max-posterior HMM alignment for machine translation. In Presentation given at DARPA/TIDES NIST MT Evaluation workshop.
Nizar Habash and Jun Hu. 2009. Improving Arabic-Chinese statistical machine translation using English as pivot language. In Proceedings of the 4th Workshop on Statistical Machine Translation, pages 173–181, Morristown, NJ, USA. Association for Computational Linguistics.
Almut Silja Hildebrand and Stephan Vogel. 2008. Combination of machine translation systems via hypothesis selection from combined n-best lists. In Proceedings of the 8th Conference of the AMTA, pages 254–261, Waikiki, Hawaii, October.
Fei Huang. 2009. Confidence measure for word alignment. In Proceedings of the ACL-IJCNLP ’09, pages 932–940, Morristown, NJ, USA. Association for Computational Linguistics.
Abraham Ittycheriah and Salim Roukos. 2005. A maximum entropy word aligner for Arabic-English machine translation. In Proceedings of the HLT-EMNLP’05, pages 89–96, Morristown, NJ, USA. Association for Computational Linguistics.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of ACL’07, pages 177–180, Prague, Czech Republic, June.
Yanjun Ma, Sylwia Ozdowska, Yanli Sun, and Andy Way. 2008. Improving word alignment using syntactic dependencies. In Proceedings of the ACL-08: HLT SSST-2, pages 69–77, Columbus, OH.
Marie-Catherine Marneffe, Bill MacCartney, and Christopher Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC’06, Genoa, Italy.
Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Flexible text segmentation with structured multilabel classification. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 987–994, Vancouver, British Columbia, Canada, October. Association for Computational Linguistics.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of ACL’02, pages 311–318, Philadelphia, PA, July.
Chris Quirk. 2004. Training a sentence-level machine translation confidence measure. In Proceedings of the 4th LREC.
Sylvain Raybaud, Caroline Lavecchia, David Langlois, and Kamel Smaili. 2009. Error detection for statistical machine translation using linguistic features. In Proceedings of the 13th EAMT, Barcelona, Spain, May.
Binyamin Rozenfeld, Ronen Feldman, and Moshe Fresko. 2006. A systematic cross-comparison of sequence classifiers. In Proceedings of the SDM, pages 563–567, Bethesda, MD, USA, April.
Alberto Sanchis, Alfons Juan, and Enrique Vidal. 2007. Estimation of confidence measures for machine translation. In Proceedings of the MT Summit XI, Copenhagen, Denmark.
Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL-08: HLT, pages 577–585, Columbus, Ohio, June. Association for Computational Linguistics.
Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of AMTA’06, pages 223–231, August.
Radu Soricut and Abdessamad Echihabi. 2010. Trustrank: Inducing trust in automatic translations via ranking. In Proceedings of the 48th ACL, pages 612–621, Uppsala, Sweden, July. Association for Computational Linguistics.
Lucia Specia, Zhuoran Wang, Marco Turchi, John Shawe-Taylor, and Craig Saunders. 2009. Improving the confidence of machine translation quality estimates. In Proceedings of the MT Summit XII, Ottawa, Canada.
Christoph Tillmann. 2006. Efficient dynamic programming search algorithms for phrase-based SMT. In Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing, pages 9–16, Morristown, NJ, USA. Association for Computational Linguistics.
Nicola Ueffing and Hermann Ney. 2007. Word-level confidence estimation for machine translation. Computational Linguistics, 33(1):9–40.
Taro Watanabe, Jun Suzuki, Hajime Tsukada, and Hideki Isozaki. 2007. Online large-margin training for statistical machine translation. In Proceedings of the EMNLP-CoNLL, pages 764–773, Prague, Czech Republic, June. Association for Computational Linguistics.
Deyi Xiong, Min Zhang, and Haizhou Li. 2010. Error detection for statistical machine translation using linguistic features. In Proceedings of the 48th ACL, pages 604–611, Uppsala, Sweden, July. Association for Computational Linguistics.