Example-Based Paraphrasing for Improved Phrase-Based Statistical Machine Translation (EMNLP 2010, paper 47)


Author: Aurélien Max

Abstract: In this article, an original view on how to improve phrase translation estimates is proposed. This proposal is grounded on two main ideas: first, that appropriate examples of a given phrase should participate more in building its translation distribution; second, that paraphrases can be used to better estimate this distribution. Initial experiments provide evidence of the potential of our approach and its implementation for effectively improving translation performance.


reference text

Sadaf Abdul-Rauf and Holger Schwenk. 2009. On the Use of Comparable Corpora to Improve SMT performance. In Proceedings of EACL, Athens, Greece.
Wilker Aziz, Marc Dymetman, Shachar Mirkin, Lucia Specia, Nicola Cancedda, and Ido Dagan. 2010. Learning an Expert from Human Annotations in Statistical Machine Translation: the Case of Out-of-Vocabulary Words. In Proceedings of EAMT, Saint-Raphaël, France.
Colin Bannard and Chris Callison-Burch. 2005. Paraphrasing with Bilingual Parallel Corpora. In Proceedings of ACL, Ann Arbor, USA.
Francis Bond, Eric Nichols, Darren Scott Appling, and Michael Paul. 2008. Improving statistical machine translation by paraphrasing the training data. In Proceedings of IWSLT, Hawaii, USA.
Chris Callison-Burch, Colin Bannard, and Josh Schroeder. 2005. Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases. In Proceedings of ACL, Ann Arbor, USA.
Chris Callison-Burch, Philipp Koehn, and Miles Osborne. 2006. Improved Statistical Machine Translation Using Paraphrases. In Proceedings of NAACL, New York, USA.
Chris Callison-Burch. 2008. Syntactic Constraints on Paraphrases Extracted from Parallel Corpora. In Proceedings of EMNLP, Hawaii, USA.
Marine Carpuat and Dekai Wu. 2007. Context-Dependent Phrasal Translation Lexicons for Statistical Machine Translation. In Proceedings of Machine Translation Summit XI, Copenhagen, Denmark.
Marine Carpuat. 2009. One Translation Per Discourse. In Proceedings of the NAACL-HLT Workshop on Semantic Evaluations, Boulder, USA.
Jinhua Du, Jie Jiang, and Andy Way. 2010. Facilitating Translation Using Source Language Paraphrase Lattices. In Proceedings of EMNLP, Cambridge, USA.
Kevin Gimpel and Noah A. Smith. 2008. Rich Source-Side Context for Statistical Machine Translation. In Proceedings of the ACL Workshop on Statistical Machine Translation, Columbus, USA.
Rejwanul Haque, Sudip Kumar Naskar, Yanjun Ma, and Andy Way. 2009. Using Supertags as Source Language Context in SMT. In Proceedings of EAMT, Barcelona, Spain.
Almut Silja Hildebrand, Matthias Eck, Stephan Vogel, and Alex Waibel. 2005. Adaptation of the Translation Model for Statistical Machine Translation Based on Information Retrieval. In Proceedings of EAMT, Budapest, Hungary.
David Kauchak and Regina Barzilay. 2006. Paraphrasing for Automatic Evaluation. In Proceedings of NAACL HLT, New York, USA.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of NAACL HLT, Edmonton, Canada.
Stanley Kok and Chris Brockett. 2010. Hitting the Right Paraphrases in Good Time. In Proceedings of NAACL, Los Angeles, USA.
Adam Lopez. 2008. Tera-Scale Translation Models via Pattern Matching. In Proceedings of COLING, Manchester, UK.
Nitin Madnani and Bonnie J. Dorr. 2010. Generating Phrasal & Sentential Paraphrases: A Survey of Data-Driven Methods. Computational Linguistics, 36(3).
Nitin Madnani, Philip Resnik, Bonnie J. Dorr, and Richard Schwartz. 2008. Are Multiple Reference Translations Necessary? Investigating the Value of Paraphrased Reference Translations in Parameter Optimization. In Proceedings of AMTA, Waikiki, USA.
Yuval Marton, Chris Callison-Burch, and Philip Resnik. 2009. Improved Statistical Machine Translation Using Monolingually-derived Paraphrases. In Proceedings of EMNLP, Singapore.
Aurélien Max, Rafik Makhloufi, and Philippe Langlais. 2008. Explorations in using grammatical dependencies for contextual phrase translation disambiguation. In Proceedings of EAMT, Hamburg, Germany.
Aurélien Max, Josep M. Crego, and François Yvon. 2010. Contrastive Lexical Evaluation of Machine Translation. In Proceedings of LREC, Valletta, Malta.
Aurélien Max. 2008. Local rephrasing suggestions for supporting the work of writers. In Proceedings of GoTAL, Gothenburg, Sweden.
Shachar Mirkin, Lucia Specia, Nicola Cancedda, Ido Dagan, Marc Dymetman, and Idan Szpektor. 2009. Source-Language Entailment Modeling for Translating Unknown Terms. In Proceedings of ACL, Singapore.
Behrang Mohit and Rebecca Hwa. 2007. Localization of Difficult-to-Translate Phrases. In Proceedings of the ACL Workshop on Statistical Machine Translation, Prague, Czech Republic.
Dragos Stefan Munteanu and Daniel Marcu. 2005. Improving Machine Translation Performance by Exploiting Non-parallel Corpora. Computational Linguistics, 31(4).
Takashi Onishi, Masao Utiyama, and Eiichiro Sumita. 2010. Paraphrase Lattice for Statistical Machine Translation. In Proceedings of ACL, short paper session, Uppsala, Sweden.
Philip Resnik, Olivia Buzek, Chang Hu, Yakov Kronrod, Alex Quinn, and Benjamin B. Bederson. 2010. Improving Translation via Targeted Paraphrasing. In Proceedings of EMNLP, Cambridge, USA.
Josh Schroeder, Trevor Cohn, and Philipp Koehn. 2009. Word Lattices for Multi-Source Translation. In Proceedings of EACL, Athens, Greece.
Matthew Snover, Bonnie J. Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of AMTA, Boston, USA.
Nicolas Stroppa, Antal van den Bosch, and Andy Way. 2007. Exploiting Source Similarity for SMT using Context-Informed Features. In Proceedings of TMI, Skovde, Sweden.
Guillaume Wisniewski, Alexandre Allauzen, and François Yvon. 2010. Assessing Phrase-based Translation Models with Oracle Decoding. In Proceedings of EMNLP, Cambridge, USA.