147 acl-2012-Modeling the Translation of Predicate-Argument Structure for SMT

Author: Deyi Xiong ; Min Zhang ; Haizhou Li

Abstract: Predicate-argument structure contains rich semantic information of which statistical machine translation hasn’t taken full advantage. In this paper, we propose two discriminative, feature-based models to exploit predicate-argument structures for statistical machine translation: 1) a predicate translation model and 2) an argument reordering model. The predicate translation model explores lexical and semantic contexts surrounding a verbal predicate to select desirable translations for the predicate. The argument reordering model automatically predicts the moving direction of an argument relative to its predicate after translation using semantic features. The two models are integrated into a state-of-the-art phrase-based machine translation system and evaluated on Chinese-to-English translation tasks with large-scale training data. Experimental results demonstrate that the two models significantly improve translation accuracy.
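To make the idea of a discriminative, feature-based argument reordering model concrete, here is a minimal sketch. It assumes a log-linear (maximum-entropy style) classifier over binary features; the abstract does not state the exact model form, and the direction labels, feature names, and weights below are hypothetical, invented purely for illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical label set: how an argument moves relative to its predicate
# after translation. These names are assumptions, not the paper's labels.
DIRECTIONS = ["no_change", "moves_left", "moves_right"]


def direction_probs(
    features: List[str], weights: Dict[Tuple[str, str], float]
) -> Dict[str, float]:
    """Return a log-linear distribution over DIRECTIONS.

    Each (feature, direction) pair has a weight; missing pairs count as 0.
    """
    scores = {
        d: sum(weights.get((f, d), 0.0) for f in features) for d in DIRECTIONS
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - top) for d, s in scores.items()}
    z = sum(exps.values())
    return {d: v / z for d, v in exps.items()}


# Toy weights, invented for illustration only (not learned from data).
WEIGHTS = {
    ("role=ARGM-TMP", "moves_right"): 1.2,
    ("role=A0", "no_change"): 0.8,
    ("side=left_of_predicate", "moves_right"): 0.5,
}

probs = direction_probs(["role=ARGM-TMP", "side=left_of_predicate"], WEIGHTS)
best = max(probs, key=probs.get)  # most probable direction under the toy weights
```

In a real system the weights would be estimated from word-aligned, semantically parsed training data, and the resulting probability would be one feature among many in the decoder; this sketch only shows the scoring step.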

reference text

Wilker Aziz, Miguel Rios, and Lucia Specia. 2011. Shallow semantic trees for SMT. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 316–322, Edinburgh, Scotland, July. Association for Computational Linguistics.
Adam L. Berger, Stephen A. Della Pietra, and Vincent J. Della Pietra. 1996. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71.
David Chiang. 2007. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228.
Pascale Fung, Wu Zhaojun, Yang Yongsheng, and Dekai Wu. 2006. Automatic learning of Chinese English semantic structure mapping. In IEEE/ACL 2006 Workshop on Spoken Language Technology (SLT 2006), Aruba, December.
Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, pages 48–54, Edmonton, Canada, May-June.
Philipp Koehn. 2004. Statistical significance tests for machine translation evaluation. In Proceedings of EMNLP 2004, pages 388–395, Barcelona, Spain, July.
Mamoru Komachi and Yuji Matsumoto. 2006. Phrase reordering for statistical machine translation based on predicate-argument structure. In Proceedings of the International Workshop on Spoken Language Translation: Evaluation Campaign on Spoken Language Translation, pages 77–82.
Junhui Li, Guodong Zhou, and Hwee Tou Ng. 2010. Joint syntactic and semantic parsing of Chinese. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1108–1117, Uppsala, Sweden, July. Association for Computational Linguistics.
Ding Liu and Daniel Gildea. 2010. Semantic role features for machine translation. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 716–724, Beijing, China, August. Coling 2010 Organizing Committee.
Arne Mauser, Saša Hasan, and Hermann Ney. 2009. Extending statistical machine translation with discriminative and trigger-based lexicon models. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 210–218, Singapore, August. Association for Computational Linguistics.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, Philadelphia, Pennsylvania, USA, July.
Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning accurate, compact, and interpretable tree annotation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 433–440, Sydney, Australia, July. Association for Computational Linguistics.
Andreas Stolcke. 2002. SRILM - an extensible language modeling toolkit. In Proceedings of the 7th International Conference on Spoken Language Processing, pages 901–904, Denver, Colorado, USA, September.
Sriram Venkatapathy and Srinivas Bangalore. 2007. Three models for discriminative machine translation using global lexical selection and sentence reconstruction. In Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation, pages 96–102, Rochester, New York, April. Association for Computational Linguistics.
Dekai Wu and Pascale Fung. 2009a. Can semantic role labeling improve SMT? In Proceedings of the 13th Annual Conference of the EAMT, pages 218–225, Barcelona, May.
Dekai Wu and Pascale Fung. 2009b. Semantic roles for SMT: A hybrid two-pass model. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers, pages 13–16, Boulder, Colorado, June. Association for Computational Linguistics.
Xianchao Wu, Katsuhito Sudoh, Kevin Duh, Hajime Tsukada, and Masaaki Nagata. 2011. Extracting preordering rules from predicate-argument structures. In Proceedings of 5th International Joint Conference on Natural Language Processing, pages 29–37, Chiang Mai, Thailand, November. Asian Federation of Natural Language Processing.
Dekai Wu. 1997. Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computational Linguistics, 23(3):377–403.
Deyi Xiong, Qun Liu, and Shouxun Lin. 2006. Maximum entropy based phrase reordering model for statistical machine translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 521–528, Sydney, Australia, July. Association for Computational Linguistics.
Deyi Xiong, Min Zhang, and Haizhou Li. 2011. A maximum-entropy segmentation model for statistical machine translation. IEEE Transactions on Audio, Speech and Language Processing, 19(8):2494–2505.
Nianwen Xue. 2008. Labeling Chinese predicates with semantic roles. Computational Linguistics, 34(2):225–255.