175 emnlp-2013-Source-Side Classifier Preordering for Machine Translation

Source: pdf

Author: Uri Lerner ; Slav Petrov

Abstract: We present a simple and novel classifier-based preordering approach. Unlike existing preordering models, we train feature-rich discriminative classifiers that directly predict the target-side word order. Our approach combines the strengths of lexical reordering and syntactic preordering models by performing long-distance reorderings using the structure of the parse tree, while utilizing a discriminative model with a rich set of features, including lexical features. We present extensive experiments on 22 language pairs, including preordering into English from 7 other languages. We obtain improvements of up to 1.4 BLEU on language pairs in the WMT 2010 shared task. For languages from different families the improvements often exceed 2 BLEU. Many of these gains are also significant in human evaluations.

reference text

A. Abeillé, L. Clément, and F. Toussenel. 2003. Building a Treebank for French. In A. Abeillé, editor, Treebanks: Building and Using Parsed Corpora, chapter 10. Kluwer.
P. F. Brown, J. Cocke, S. A. Della Pietra, V. J. Della Pietra, F. Jelinek, J. D. Lafferty, R. L. Mercer, and P. S. Roossin. 1990. A statistical approach to machine translation. Computational Linguistics, 16(2).
P. F. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer. 1993. The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19.
S. Buchholz and E. Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proc. of CoNLL ’06.
C. Callison-Burch, P. Koehn, C. Monz, K. Peterson, M. Przybocki, and O. Zaidan. 2010. Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation. In Proc. of ACL ’10 WMT.
M. Collins, P. Koehn, and I. Kučerová. 2005. Clause restructuring for statistical machine translation. In Proc. of ACL ’05.
M. Collins. 1997. Three generative, lexicalised models for statistical parsing. In ACL ’97.
M.-C. de Marneffe, B. MacCartney, and C. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proc. of LREC ’06.
J. DeNero and J. Uszkoreit. 2011. Inducing sentence structure from parallel corpora for reordering. In Proc. of EMNLP ’11.
J. Duchi and Y. Singer. 2009. Boosting with structural sparsity. In Proc. of ICML ’09.
C. Dyer and P. Resnik. 2010. Context-free reordering, finite-state translation. In Proc. of NAACL-HLT ’10.
M. Galley, M. Hopkins, K. Knight, and D. Marcu. 2004. What’s in a translation rule? In Proc. of NAACL-HLT ’04.
D. Genzel. 2010. Automatically learning source-side reordering rules for large scale machine translation. In Proc. of COLING ’10.
N. Habash. 2007. Syntactic preprocessing for statistical machine translation. In Proc. of MTS ’07.
L. Huang, K. Knight, and A. Joshi. 2006. Statistical syntax-directed translation with extended domain of locality. In Proc. of AMTA ’06.
J. Judge, A. Cahill, and J. v. Genabith. 2006. QuestionBank: creating a corpus of parse-annotated questions. In Proc. of ACL ’06.
P. Koehn, F. J. Och, and D. Marcu. 2003. Statistical phrase-based translation. In Proc. of NAACL-HLT ’03.
T. Koo, X. Carreras, and M. Collins. 2008. Simple semi-supervised dependency parsing. In Proc. of ACL-HLT ’08.
S. Kumar, W. Macherey, C. Dyer, and F. Och. 2009. Efficient minimum error rate training and minimum Bayes-risk decoding for translation hypergraphs and lattices. In Proc. of ACL ’09.
J. Lafferty, A. McCallum, and F. Pereira. 2001. Conditional Random Fields: Probabilistic models for segmenting and labeling sequence data. In Proc. of ICML ’01.
C. H. Li, M. Li, D. Zhang, M. Li, M. Zhou, and Y. Guan. 2007. A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation. In Proc. of ACL ’07.
M. Marcus, B. Santorini, and M. Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. In Computational Linguistics.
R. McDonald, J. Nivre, Y. Quirmbach-Brundage, Y. Goldberg, D. Das, K. Ganchev, K. Hall, S. Petrov, H. Zhang, O. Täckström, C. Bedini, N. Bertomeu Castelló, and J. Lee. 2013. Universal dependency annotation for multilingual parsing. In Proc. of ACL ’13.
G. Neubig, T. Watanabe, and S. Mori. 2012. Inducing a discriminative parser to optimize machine translation reordering. In Proc. of EMNLP-CoNLL ’12.
J. Nivre and J. Nilsson. 2005. Pseudo-projective dependency parsing. In Proc. of ACL ’05.
J. Nivre, J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, and D. Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proc. of EMNLP-CoNLL ’07.
F. J. Och and H. Ney. 2004. The alignment template approach to statistical machine translation. Computational Linguistics, 30(4).
K. Papineni, S. Roukos, T. Ward, and W. Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proc. of ACL ’02.
S. Petrov and R. McDonald. 2012. Overview of the 2012 shared task on parsing the web. In Proc. of NAACL ’12 SANCL.
S. Petrov, D. Das, and R. McDonald. 2012. A universal part-of-speech tagset. In Proc. of LREC ’12.
D. Talbot, H. Kazawa, H. Ichikawa, J. Katz-Brown, M. Seno, and F. Och. 2011. A lightweight evaluation framework for machine translation reordering. In Proc. of EMNLP ’11 WMT.
C. Tillmann. 2004. A unigram orientation model for statistical machine translation. In Proc. of NAACL-HLT ’04.
R. Tromble and J. Eisner. 2009. Learning linear ordering problems for better translation. In Proc. of EMNLP ’09.
J. Uszkoreit and T. Brants. 2008. Distributed word clustering for large scale class-based language modeling in machine translation. In Proc. of ACL-HLT ’08.
S. Vogel, H. Ney, and C. Tillmann. 1996. HMM-based word alignment in statistical translation. In Proc. of COLING ’96.
C. Wang, M. Collins, and P. Koehn. 2007. Chinese syntactic reordering for statistical machine translation. In Proc. of EMNLP-CoNLL ’07.
X. Wu, K. Sudoh, K. Duh, H. Tsukada, and M. Nagata. 2011. Extracting pre-ordering rules from predicate-argument structures. In Proc. of IJCNLP ’11.
F. Xia and M. McCord. 2004. Improving a statistical MT system with automatically learned rewrite patterns. In Proc. of COLING ’04.
P. Xu, J. Kang, M. Ringgaard, and F. Och. 2009. Using a dependency parser to improve SMT for subject-object-verb languages. In Proc. of NAACL-HLT ’09.
K. Yamada and K. Knight. 2001. A syntax-based statistical translation model. In Proc. of ACL ’01.
N. Yang, M. Li, D. Zhang, and N. Yu. 2012. A ranking-based approach to word reordering for statistical machine translation. In Proc. of ACL ’12.
R. Zens and H. Ney. 2006. Discriminative reordering models for statistical machine translation. In Proc. of NAACL ’06 WMT.
Y. Zhang and J. Nivre. 2011. Transition-based dependency parsing with rich non-local features. In Proc. of ACL-HLT ’11.
H. Zhang, L. Fang, P. Xu, and X. Wu. 2011. Binarized forest to string translation. In Proc. of ACL-HLT ’11.