87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation

Source: pdf

Author: John DeNero ; Dan Klein

Abstract: We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate features on phrase pairs, in addition to word links. Second, we can optimize for an extraction-based loss function that relates directly to the end task of generating translations. Our model improves alignment quality relative to state-of-the-art unsupervised and supervised baselines, and provides an improvement of up to 1.4 BLEU in Chinese-to-English translation experiments.
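To make the notion of phrase pairs "consistent with an underlying word alignment" concrete, the sketch below enumerates such pairs for a toy alignment. It uses the standard phrase-extraction consistency check from phrase-based translation (no word link may cross the boundary of the pair), which is an assumption here and not necessarily the paper's exact definition of an extraction set. In particular, it returns only the tightest target span for each source span and does not extend spans over unaligned words. The function name `consistent_phrase_pairs` and the `max_len` parameter are hypothetical.

```python
from typing import Iterable, Set, Tuple

Link = Tuple[int, int]            # (source index, target index)
Span = Tuple[int, int]            # half-open interval [start, end)
PhrasePair = Tuple[Span, Span]    # (source span, target span)


def consistent_phrase_pairs(
    src_len: int, links: Iterable[Link], max_len: int = 3
) -> Set[PhrasePair]:
    """Enumerate phrase pairs consistent with a word alignment.

    A pair is kept when at least one link falls inside it and no link
    connects a word inside the pair to a word outside it.
    """
    links = list(links)
    pairs: Set[PhrasePair] = set()
    for i in range(src_len):
        for j in range(i + 1, min(i + max_len, src_len) + 1):
            # Target positions linked to any word in source span [i, j).
            targets = [t for s, t in links if i <= s < j]
            if not targets:
                continue  # skip source spans with no links
            k, l = min(targets), max(targets) + 1
            if l - k > max_len:
                continue
            # Reject the pair if a target word inside [k, l) links to a
            # source word outside [i, j).
            if any(k <= t < l and not (i <= s < j) for s, t in links):
                continue
            pairs.add(((i, j), (k, l)))
    return pairs


# Toy alignment: source word 0 -> target 0, and words 1 and 2 swapped.
toy_links = {(0, 0), (1, 2), (2, 1)}
toy_pairs = consistent_phrase_pairs(3, toy_links)
```

In the toy alignment, the source span covering words 0 and 1 is rejected because its tightest target span would also contain a word linked to source word 2. The remaining pairs nest and overlap, which is the kind of structure the abstract describes.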

reference text

Necip Fazil Ayan and Bonnie J. Dorr. 2006. Going beyond AER: An extensive analysis of word alignments and their impact on MT. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Necip Fazil Ayan, Bonnie J. Dorr, and Christof Monz. 2005. Neuralign: Combining word alignments using neural networks. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing.
Alexandra Birch, Chris Callison-Burch, and Miles Osborne. 2006. Constraining the phrase-based, joint probability statistical translation model. In Proceedings of the Conference for the Association for Machine Translation in the Americas.
Phil Blunsom, Trevor Cohn, Chris Dyer, and Miles Osborne. 2009. A Gibbs sampler for phrasal synchronous grammar induction. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Eugene Charniak and Sharon Caraballo. 1998. New figures of merit for best-first probabilistic chart parsing. In Computational Linguistics.
Colin Cherry and Dekang Lin. 2006. Soft syntactic constraints for word alignment through discriminative training. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Colin Cherry and Dekang Lin. 2007. Inversion transduction grammar for joint phrasal translation modeling. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics Workshop on Syntax and Structure in Statistical Translation.
David Chiang, Yuval Marton, and Philip Resnik. 2008. Online large-margin training of syntactic and structural translation features. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics.
Koby Crammer and Yoram Singer. 2003. Ultraconservative online algorithms for multiclass problems. Journal of Machine Learning Research, 3:951–991.
John DeNero and Dan Klein. 2007. Tailoring word alignments to syntactic machine translation. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
John DeNero and Dan Klein. 2008. The complexity of phrase alignment problems. In Proceedings of the Annual Conference of the Association for Computational Linguistics: Short Paper Track.
John DeNero, Dan Gillick, James Zhang, and Dan Klein. 2006. Why generative phrase models underperform surface heuristics. In Proceedings of the NAACL Workshop on Statistical Machine Translation.
John DeNero, Alexandre Bouchard-Côté, and Dan Klein. 2008. Sampling alignment structure under a Bayesian translation model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Yonggang Deng and Bowen Zhou. 2009. Optimizing word alignment combination for phrase table training. In Proceedings of the Annual Conference of the Association for Computational Linguistics: Short Paper Track.
Alexander Fraser and Daniel Marcu. 2006. Semi-supervised training for statistical word alignment. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Alexander Fraser and Daniel Marcu. 2007. Getting the structure right for word alignment: LEAF. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning.
Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Joshua Goodman. 1996. Parsing algorithms and metrics. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.
Aria Haghighi, John Blitzer, John DeNero, and Dan Klein. 2009. Better word alignments with supervised ITG models. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Liang Huang and David Chiang. 2007. Forest rescoring: Faster decoding with integrated language models. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Matti Kääriäinen. 2009. Sinuhe: Statistical machine translation using a globally trained conditional exponential family translation model. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Dan Klein and Chris Manning. 2003. A* parsing: Fast exact Viterbi parse selection. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics.
Philipp Koehn and Barry Haddow. 2009. Edinburgh's submission to all tracks of the WMT2009 shared task with reordering and speed improvements to Moses. In Proceedings of the Workshop on Statistical Machine Translation.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the Annual Conference of the Association for Computational Linguistics: Demonstration Track.
Simon Lacoste-Julien, Ben Taskar, Dan Klein, and Michael I. Jordan. 2006. Word alignment via quadratic assignment. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics.
Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren Thornton, Jonathan Weese, and Omar Zaidan. 2009. Joshua: An open source toolkit for parsing-based machine translation. In Proceedings of the Workshop on Statistical Machine Translation.
Percy Liang, Ben Taskar, and Dan Klein. 2006. Alignment by agreement. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics.
Adam Lopez. 2007. Hierarchical phrase-based translation with suffix arrays. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Daniel Marcu and Daniel Wong. 2002. A phrase-based, joint probability model for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
I. Dan Melamed. 2000. Models of translational equivalence among words. Computational Linguistics.
Robert Moore and Chris Quirk. 2007. Faster beam-search decoding for phrasal statistical machine translation. In Proceedings of MT Summit XI.
Robert C. Moore. 2005. A discriminative framework for bilingual word alignment. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Hermann Ney and Stephan Vogel. 1996. HMM-based word alignment in statistical translation. In Proceedings of the Conference on Computational Linguistics.
Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29:19–51.
Franz Josef Och, Christoph Tillmann, and Hermann Ney. 1999. Improved alignment models for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Andreas Stolcke. 2002. SRILM: An extensible language modeling toolkit. In Proceedings of the International Conference on Spoken Language Processing.
Ben Taskar, Simon Lacoste-Julien, and Dan Klein. 2005. A discriminative matching approach to word alignment. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Dekai Wu. 1997. Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computational Linguistics, 23:377–404.
Hao Zhang and Daniel Gildea. 2005. Stochastic lexicalized inversion transduction grammar for alignment. In Proceedings of the Annual Conference of the Association for Computational Linguistics.
Hao Zhang and Daniel Gildea. 2006. Efficient search for inversion transduction grammar. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Hao Zhang, Chris Quirk, Robert C. Moore, and Daniel Gildea. 2008. Bayesian learning of non-compositional phrases with synchronous parsing. In Proceedings of the Annual Conference of the Association for Computational Linguistics.