
263 acl-2011-Reordering Constraint Based on Document-Level Context

Source: pdf

Author: Takashi Onishi ; Masao Utiyama ; Eiichiro Sumita

Abstract: One problem with phrase-based statistical machine translation is long-distance reordering when translating between languages with different word orders, such as Japanese-English. In this paper, we propose a method of imposing reordering constraints using document-level context. As the document-level context, we use noun phrases that occur significantly in the context documents containing the source sentences. Given a source sentence, zones that cover these noun phrases are used as reordering constraints. Then, in decoding, reorderings that violate the zones are restricted. Experimental results for patent translation tasks show significant improvements of 1.20% BLEU points in Japanese-English translation and 1.41% BLEU points in English-Japanese translation.
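
The zone idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mark_zones` and `violates_zone`, the `<zone>` tag strings, and the example sentence are all assumptions made for illustration, and the noun-phrase list is taken as given rather than extracted from context documents. The sketch wraps each listed noun phrase in a zone and checks whether a word-level translation order keeps a zone contiguous.

```python
def mark_zones(tokens, noun_phrases):
    """Wrap each occurrence of a listed multi-word noun phrase in zone tags.

    `noun_phrases` is a list of token lists, assumed to be supplied by some
    upstream extraction step that is not modelled here.
    """
    # Try longer phrases first so a long phrase wins over one nested inside it.
    phrases = sorted(
        (tuple(p) for p in noun_phrases if len(p) > 1), key=len, reverse=True
    )
    out = []
    i = 0
    while i < len(tokens):
        for p in phrases:
            if tuple(tokens[i:i + len(p)]) == p:
                out.append("<zone>")
                out.extend(p)
                out.append("</zone>")
                i += len(p)
                break
        else:
            # No phrase starts at position i; copy the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def violates_zone(order, zone):
    """Return True if the source positions in `zone` are not translated
    as one contiguous block.

    `order` lists source word positions in the order they are translated;
    `zone` is an inclusive (start, end) pair of source positions.
    """
    start, end = zone
    steps = [k for k, pos in enumerate(order) if start <= pos <= end]
    # Contiguous means the zone's words occupy consecutive translation steps.
    return max(steps) - min(steps) + 1 != len(steps)


tokens = "the optical fiber cable is connected".split()
noun_phrases = [["optical", "fiber", "cable"], ["fiber", "cable"]]
marked = mark_zones(tokens, noun_phrases)
```

Here `marked` places a single zone around "optical fiber cable". An order such as `[1, 4, 2, 3, 0, 5]` leaves that zone before finishing it, so `violates_zone` reports a violation, while the monotone order `[0, 1, 2, 3, 4, 5]` does not.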

reference text

Eugene Charniak. 2000. A Maximum-Entropy-Inspired Parser. In Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference, pages 132–139.
Colin Cherry. 2008. Cohesive Phrase-Based Decoding for Statistical Machine Translation. In Proceedings of ACL-08: HLT, pages 72–80.
Katerina T. Frantzi and Sophia Ananiadou. 1996. Extracting Nested Collocations. In Proceedings of COLING 1996, pages 41–46.
Atsushi Fujii, Masao Utiyama, Mikio Yamamoto, Takehito Utsuro, Terumasa Ehara, Hiroshi Echizen-ya, and Sayori Shimohata. 2010. Overview of the Patent Translation Task at the NTCIR-8 Workshop. In Proceedings of NTCIR-8 Workshop Meeting, pages 371–376.
Hideki Isozaki, Katsuhito Sudoh, Hajime Tsukada, and Kevin Duh. 2010. Head Finalization: A Simple Reordering Rule for SOV Languages. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 244–251.
Jason Katz-Brown and Michael Collins. 2008. Syntactic Reordering in Preprocessing for Japanese→English Translation: MIT System Description for NTCIR-7 Patent Translation Task. In Proceedings of NTCIR-7 Workshop Meeting, pages 409–414.
Philipp Koehn and Barry Haddow. 2009. Edinburgh’s Submission to all Tracks of the WMT 2009 Shared Task with Reordering and Speed Improvements to Moses. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 160–164.
Philipp Koehn and Kevin Knight. 2003. Feature-Rich Statistical Translation of Noun Phrases. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 311–318.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180.
Philipp Koehn. 2004. Statistical Significance Tests for Machine Translation Evaluation. In Proceedings of EMNLP 2004, pages 388–395.
Taku Kudo and Yuji Matsumoto. 2002. Japanese Dependency Analysis using Cascaded Chunking. In Proceedings of CoNLL-2002, pages 63–69.
Yuval Marton and Philip Resnik. 2008. Soft Syntactic Constraints for Hierarchical Phrased-Based Translation. In Proceedings of ACL-08: HLT, pages 1003–1011.
Franz Josef Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 160–167.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a Method for Automatic Evaluation of Machine Translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318.
Katsuhito Sudoh, Kevin Duh, Hajime Tsukada, Tsutomu Hirao, and Masaaki Nagata. 2010. Divide and Translate: Improving Long Distance Reordering in Statistical Machine Translation. In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, pages 418–427.
Deyi Xiong, Min Zhang, and Haizhou Li. 2010. Learning Translation Boundaries for Phrase-Based Decoding. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 136–144.
Hirofumi Yamamoto, Hideo Okuma, and Eiichiro Sumita. 2008. Imposing Constraints from the Source Tree on ITG Constraints for SMT. In Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2), pages 1–9.