acl acl2010 acl2010-52 acl2010-52-reference knowledge-graph by maker-knowledge-mining

52 acl-2010-Bitext Dependency Parsing with Bilingual Subtree Constraints

Source: pdf

Author: Wenliang Chen ; Jun'ichi Kazama ; Kentaro Torisawa

Abstract: This paper proposes a dependency parsing method that uses bilingual constraints to improve the accuracy of parsing bilingual texts (bitexts). In our method, a targetside tree fragment that corresponds to a source-side tree fragment is identified via word alignment and mapping rules that are automatically learned. Then it is verified by checking the subtree list that is collected from large scale automatically parsed data on the target side. Our method, thus, requires gold standard trees only on the source side of a bilingual corpus in the training phase, unlike the joint parsing model, which requires gold standard trees on the both sides. Compared to the reordering constraint model, which requires the same training data as ours, our method achieved higher accuracy because ofricher bilingual constraints. Experiments on the translated portion of the Chinese Treebank show that our system outperforms monolingual parsers by 2.93 points for Chinese and 1.64 points for English.

reference text

Ann Bies, Martha Palmer, Justin Mott, and Colin Warner. 2007. English Chinese translation treebank v 1.0. In LDC2007T02. David Burkett and Dan Klein. 2008. Two languages are better than one (for syntactic parsing). In Proceedings ofthe 2008 Conference on Empirical Methods in Natural Language Processing, pages 877– 886, Honolulu, Hawaii, October. Association for Computational Linguistics. X. Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, pages 957–961. WL. Chen, J. Kazama, K. Uchimoto, and K. Torisawa. 2009. Improving dependency parsing with subtrees from auto-parsed data. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 570–579, Singapore, August. Association for Computational Linguistics. John DeNero and Dan Klein. 2007. Tailoring word alignments to syntactic machine translation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 17–24, Prague, Czech Republic, June. Association for Computational Linguistics. Yuan Ding and Martha Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In ACL ’05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 541–548, Morristown, NJ, USA. Association for Computational Linguistics. J. Eisner. 1996. Three new probabilistic models for dependency parsing: An exploration. In Proc. of the 16th Intern. Conf. on Computational Linguistics (COLING), pages 340–345. Liang Huang, Wenbin Jiang, and Qun Liu. 2009. Bilingually-constrained (monolingual) shift-reduce parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1222–123 1, Singapore, August. Association for Computational Linguistics. Chu-Ren Huang. 2009. Tagged Chinese Gigaword Version 2.0, LDC2009T14. Linguistic Data Consortium. P. Koehn, F.J. Och, and D. Marcu. 2003. Statistical phrase-based translation. In Proceedings of NAACL, page 54. Association for Computational Linguistics. Canasai Kruengkrai, Kiyotaka Uchimoto, Jun’ichi Kazama, Yiou Wang, Kentaro Torisawa, and Hitoshi Isahara. 2009. An error-driven word-character hybrid model for joint Chinese word segmentation and POS tagging. In Proceedings of ACL-IJCNLP2009, pages 5 13–521, Suntec, Singapore, August. Association for Computational Linguistics. Percy Liang, Ben Taskar, and Dan Klein. 2006. Alignment by agreement. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 104–1 11, New York City, USA, June. Association for Computational Linguistics. R. McDonald and F. Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proc. of EACL2006. R. McDonald, K. Crammer, and F. Pereira. 2005. Online large-margin training of dependency parsers. In Proc. of ACL 2005. T. Nakazawa, K. Yu, D. Kawahara, and S. Kurohashi. 2006. Example-based machine translation based on deeper nlp. In Proceedings of IWSLT 2006, pages 64–70, Kyoto, Japan. J. Nivre and S. Kubler. 2006. Dependency parsing: Tutorial at Coling-ACL 2006. In CoLING-ACL. J. Nivre and R. McDonald. 2008. Integrating graphbased and transition-based dependency parsers. In Proceedings of ACL-08: HLT, Columbus, Ohio, June. J. Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of IWPT2003, pages 149–160. David A. Smith and Noah A. Smith. 2004. Bilingual parsing with factored estimation: Using English to parse Korean. In Proceedings of EMNLP. Nianwen Xue, Fu-Dong Chiou, and Martha Palmer. 2002. Building a large-scale annotated Chinese corpus. In Coling. H. Yamada and Y. Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of IWPT2003, pages 195–206. Hai Zhao, Yan Song, Chunyu Kit, and Guodong Zhou. 2009. Cross language dependency parsing using a bilingual lexicon. In Proceedings of ACLIJCNLP2009, pages 55–63, Suntec, Singapore, August. Association for Computational Linguistics. 29