
135 acl-2013-English-to-Russian MT evaluation campaign


Source: pdf

Author: Pavel Braslavski ; Alexander Beloborodov ; Maxim Khalilov ; Serge Sharoff

Abstract: This paper presents the settings and the results of the ROMIP 2013 MT shared task for the English→Russian language direction. The quality of generated translations was assessed using automatic metrics and human evaluation. We also discuss ways to reduce human evaluation efforts using pairwise sentence comparisons by human judges to simulate sort operations.
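
The idea of simulating a sort with pairwise human judgments can be illustrated with a small sketch. This is not the procedure used in the campaign: the `judge` callback, the system names, and the hidden quality scores are all hypothetical stand-ins for a human annotator comparing two translations of the same sentence. The point is only that a comparison sort can rank n candidates with far fewer judgments than the n(n-1)/2 needed to compare every pair.

```python
from functools import cmp_to_key


def rank_by_pairwise_judgments(candidates, judge):
    """Rank candidates best-first using only pairwise judgments.

    `judge(a, b)` is a hypothetical callback standing in for a human
    annotator: negative if a is better, positive if b is better, 0 for a tie.
    Returns the ranking and the number of judgments that were requested.
    """
    calls = 0

    def counted(a, b):
        nonlocal calls
        calls += 1
        return judge(a, b)

    ranking = sorted(candidates, key=cmp_to_key(counted))
    return ranking, calls


# Hypothetical data: hidden quality scores consulted by a simulated judge.
hidden_quality = {"sys_a": 0.61, "sys_b": 0.48, "sys_c": 0.75, "sys_d": 0.52}


def simulated_judge(a, b):
    # The candidate with the higher hidden quality should sort first.
    return (hidden_quality[b] > hidden_quality[a]) - (hidden_quality[a] > hidden_quality[b])


ranking, n_judgments = rank_by_pairwise_judgments(list(hidden_quality), simulated_judge)
print(ranking, n_judgments)
```

Real judges can disagree with each other and with themselves, so a sort driven by their answers may depend on the order in which pairs are presented. The sketch assumes a perfectly consistent judge and ignores that effect.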

