acl acl2013 acl2013-363 acl2013-363-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Hendra Setiawan; Bowen Zhou; Bing Xiang; Libin Shen
Abstract: Long-distance reordering remains one of the greatest challenges in statistical machine translation research, as the key contextual information may well lie beyond the confines of translation units. In this paper, we propose the Two-Neighbor Orientation (TNO) model, which jointly models the orientation decisions between anchors and two neighboring multi-unit chunks that may cross phrase or rule boundaries. We explicitly model the longest span of such chunks, referred to as the Maximal Orientation Span, to serve as a global parameter that constrains the underlying local decisions. We integrate our proposed model into a state-of-the-art string-to-dependency translation system and demonstrate the efficacy of our proposal in a large-scale Chinese-to-English translation task. On the NIST MT08 set, our most advanced model brings an improvement of around +2.0 BLEU and -1.0 TER.
References:
Nguyen Bach, Qin Gao, and Stephan Vogel. 2009. Source-side dependency tree reordering models with subtree movements and constraints. In Proceedings of the Twelfth Machine Translation Summit (MTSummit-XII), Ottawa, Canada, August. International Association for Machine Translation.
Pi-Chuan Chang, Huihsin Tseng, Dan Jurafsky, and Christopher D. Manning. 2009. Discriminative reordering with Chinese grammatical relations features. In Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009, pages 51–59, Boulder, Colorado, June. Association for Computational Linguistics.
Stanley Chen. 2009. Shrinking exponential language models. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 468–476, Boulder, Colorado, June. Association for Computational Linguistics.
David Chiang, Yuval Marton, and Philip Resnik. 2008. Online large-margin training of syntactic and structural translation features. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 224–233, Honolulu, Hawaii, October.
David Chiang, Steve DeNeefe, and Michael Pust. 2011. Two easy improvements to lexical weighting. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 455–460, Portland, Oregon, USA, June. Association for Computational Linguistics.
David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 263–270, Ann Arbor, Michigan, June. Association for Computational Linguistics.
Marta R. Costa-jussà and José A. R. Fonollosa. 2006. Statistical machine reordering. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 70–76, Sydney, Australia, July. Association for Computational Linguistics.
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874.
Michel Galley and Christopher D. Manning. 2008. A simple and effective hierarchical phrase reordering model. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 848–856, Honolulu, Hawaii, October. Association for Computational Linguistics.
Dmitriy Genzel. 2010. Automatically learning source-side reordering rules for large scale machine translation. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 376–384, Beijing, China, August. Coling 2010 Organizing Committee.
Mark Hopkins and Jonathan May. 2011. Tuning as ranking. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1352–1362, Edinburgh, Scotland, UK, July. Association for Computational Linguistics.
Zhongqiang Huang, Martin Cmejrek, and Bowen Zhou. 2010. Soft syntactic constraints for hierarchical phrase-based translation using latent syntactic distributions. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 138–147, Cambridge, MA, October. Association for Computational Linguistics.
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondrej Bojar, Alexandra Constantin, and Evan Herbst. 2007. Moses: Open source toolkit for statistical machine translation, June.
Yuval Marton and Philip Resnik. 2008. Soft syntactic constraints for hierarchical phrased-based translation. In Proceedings of The 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 1003–1011, Columbus, Ohio, June.
Masaaki Nagata, Kuniko Saito, Kazuhide Yamamoto, and Kazuteru Ohashi. 2006. A clustered global phrase reordering model for statistical machine translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 713–720, Sydney, Australia, July. Association for Computational Linguistics.
Jan Niehues and Muntsin Kolss. 2009. A POS-based model for long-range reorderings in SMT. In Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 206–214, Athens, Greece, March. Association for Computational Linguistics.
Hendra Setiawan, Min-Yen Kan, and Haizhou Li. 2007. Ordering phrases with function words. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 712–719, Prague, Czech Republic, June. Association for Computational Linguistics.
Hendra Setiawan, Min Yen Kan, Haizhou Li, and Philip Resnik. 2009. Topological ordering of function words in hierarchical phrase-based translation. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 324–332, Suntec, Singapore, August. Association for Computational Linguistics.
Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL-08: HLT, pages 577–585, Columbus, Ohio, June. Association for Computational Linguistics.
Libin Shen, Jinxi Xu, Bing Zhang, Spyros Matsoukas, and Ralph Weischedel. 2009. Effective use of linguistic and contextual information for statistical machine translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 72–80, Singapore, August. Association for Computational Linguistics.
Christoph Tillman. 2004. A unigram orientation model for statistical machine translation. In HLT-NAACL 2004: Short Papers, pages 101–104, Boston, Massachusetts, USA, May 2 - May 7. Association for Computational Linguistics.
Christoph Tillmann and Tong Zhang. 2007. A block bigram prediction model for statistical machine translation. ACM Transactions on Speech and Language Processing (TSLP), 4(3).
Roy Tromble and Jason Eisner. 2009. Learning linear ordering problems for better translation. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1007–1016, Singapore, August. Association for Computational Linguistics.
Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. 2009. Preference grammars: Softening syntactic constraints to improve statistical machine translation. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 236–244, Boulder, Colorado, June. Association for Computational Linguistics.
Karthik Visweswariah, Rajakrishnan Rajkumar, Ankur Gandhe, Ananthakrishnan Ramanathan, and Jiri Navratil. 2011. A word reordering model for improved machine translation. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 486–496, Edinburgh, Scotland, UK, July. Association for Computational Linguistics.
Deyi Xiong, Min Zhang, Aiti Aw, and Haizhou Li. 2009. A syntax-driven bracketing model for phrase-based translation. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 315–323, Suntec, Singapore, August. Association for Computational Linguistics.
Deyi Xiong, Min Zhang, and Haizhou Li. 2010. Learning translation boundaries for phrase-based decoding. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 136–144, Los Angeles, California, June. Association for Computational Linguistics.
Deyi Xiong, Min Zhang, and Haizhou Li. 2012. Modeling the translation of predicate-argument structure for SMT. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 902–911, Jeju Island, Korea, July. Association for Computational Linguistics.
Richard Zens and Hermann Ney. 2006. Discriminative reordering models for statistical machine translation. In Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL): Proceedings of the Workshop on Statistical Machine Translation, pages 55–63, New York City, NY, June. Association for Computational Linguistics.
Andreas Zollmann and Stephan Vogel. 2011. A word-class approach to labeling PSCFG rules for machine translation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 1–11, Portland, Oregon, USA, June. Association for Computational Linguistics.