acl acl2012 acl2012-87 acl2012-87-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Zhenghua Li; Ting Liu; Wanxiang Che
Abstract: We present a simple and effective framework for exploiting multiple monolingual treebanks with different annotation guidelines for parsing. Several types of transformation patterns (TP) are designed to capture the systematic annotation inconsistencies among different treebanks. Based on such TPs, we design quasi-synchronous grammar features to augment the baseline parsing models. Our approach can significantly advance the state-of-the-art parsing accuracy on two widely used target treebanks (Penn Chinese Treebank 5.1 and 6.0) using the Chinese Dependency Treebank as the source treebank. The improvements are respectively 1.37% and 1.10% with automatic part-of-speech tags. Moreover, an indirect comparison indicates that our approach also outperforms previous work based on treebank conversion.
Mohit Bansal and Dan Klein. 2011. Web-scale features for full-scale parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 693–702, Portland, Oregon, USA, June. Association for Computational Linguistics.
Bernd Bohnet. 2009. Efficient parsing of syntactic and semantic dependency structures. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task, pages 67–72, Boulder, Colorado, June. Association for Computational Linguistics.
David Burkett and Dan Klein. 2008. Two languages are better than one (for syntactic parsing). In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 877–886, Honolulu, Hawaii, October. Association for Computational Linguistics.
David Burkett, Slav Petrov, John Blitzer, and Dan Klein. 2010. Learning better monolingual models with unannotated bilingual text. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, CoNLL ’10, pages 46–54, Stroudsburg, PA, USA. Association for Computational Linguistics.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and maxent discriminative reranking. In Proceedings of ACL-05, pages 173–180.
Eugene Charniak. 2000. A maximum-entropy-inspired parser. In ANLP’00, pages 132–139.
Wanxiang Che, Zhenghua Li, Yongqiang Li, Yuhang Guo, Bing Qin, and Ting Liu. 2009. Multilingual dependency-based syntactic and semantic parsing. In Proceedings of CoNLL 2009: Shared Task, pages 49–54.
Keh-Jiann Chen, Chi-Ching Luo, Ming-Chung Chang, Feng-Yi Chen, Chao-Jan Chen, Chu-Ren Huang, and Zhao-Ming Gao. 2003. Sinica treebank: Design criteria, representational issues and implementation, chapter 13, pages 231–248. Kluwer Academic Publishers.
Wenliang Chen, Jun’ichi Kazama, Kiyotaka Uchimoto, and Kentaro Torisawa. 2009. Improving dependency parsing with subtrees from auto-parsed data. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 570–579, Singapore, August. Association for Computational Linguistics.
Wenliang Chen, Jun’ichi Kazama, and Kentaro Torisawa. 2010. Bitext dependency parsing with bilingual subtree constraints. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 21–29, Uppsala, Sweden, July. Association for Computational Linguistics.
Michael Collins, Lance Ramshaw, Jan Hajic, and Christoph Tillmann. 1999. A statistical parser for Czech. In ACL 1999, pages 505–512.
Michael Collins. 2002. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of EMNLP 2002.
Andrea Gesmundo, James Henderson, Paola Merlo, and Ivan Titov. 2009. A latent variable model of synchronous syntactic-semantic parsing for multiple languages. In Proceedings of CoNLL 2009: Shared Task, pages 37–42.
Kevin Gimpel and Noah A. Smith. 2011. Quasi-synchronous phrase dependency grammars for machine translation. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 474–485, Edinburgh, Scotland, UK, July. Association for Computational Linguistics.
Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, Pavel Straňák, Mihai Surdeanu, Nianwen Xue, and Yi Zhang. 2009. The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages. In Proceedings of CoNLL 2009.
Liang Huang, Wenbin Jiang, and Qun Liu. 2009. Bilingually-constrained (monolingual) shift-reduce parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1222–1231, Singapore, August. Association for Computational Linguistics.
Wenbin Jiang, Liang Huang, and Qun Liu. 2009. Automatic adaptation of annotation standards: Chinese word segmentation and POS tagging – a case study. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 522–530, Suntec, Singapore, August. Association for Computational Linguistics.
Terry Koo and Michael Collins. 2010. Efficient third-order dependency parsers. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1–11, Uppsala, Sweden, July. Association for Computational Linguistics.
Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple semi-supervised dependency parsing. In Proceedings of ACL-08: HLT, pages 595–603, Columbus, Ohio, June. Association for Computational Linguistics.
Zhenghua Li, Min Zhang, Wanxiang Che, Ting Liu, Wenliang Chen, and Haizhou Li. 2011. Joint models for Chinese POS tagging and dependency parsing. In EMNLP 2011, pages 1180–1191.
Ting Liu, Jinshan Ma, and Sheng Li. 2006. Building a dependency treebank for improving Chinese parser. In Journal of Chinese Language and Computing, volume 16, pages 207–224.
André F. T. Martins, Dipanjan Das, Noah A. Smith, and Eric P. Xing. 2008. Stacking dependency parsers. In EMNLP’08, pages 157–166.
Ryan McDonald and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL 2006.
Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Online large-margin training of dependency parsers. In Proceedings of ACL 2005, pages 91–98.
Zheng-Yu Niu, Haifeng Wang, and Hua Wu. 2009. Exploiting heterogeneous treebanks for parsing. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 46–54, Suntec, Singapore, August. Association for Computational Linguistics.
Joakim Nivre and Ryan McDonald. 2008. Integrating graph-based and transition-based dependency parsers. In Proceedings of ACL 2008, pages 950–958.
Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 149–160.
Eric W. Noreen. 1989. Computer-intensive methods for testing hypotheses: An introduction. John Wiley & Sons, Inc., New York. ISBN 0471611360.
Zhou Qiang. 2004. Annotation scheme for Chinese treebank. Journal of Chinese Information Processing, 18(4):1–8.
David Smith and Jason Eisner. 2006. Quasi-synchronous grammars: Alignment by soft projection of syntactic dependencies. In Proceedings on the Workshop on Statistical Machine Translation, pages 23–30, New York City, June. Association for Computational Linguistics.
David A. Smith and Jason Eisner. 2009. Parser adaptation and projection with quasi-synchronous grammar features. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 822–831, Singapore, August. Association for Computational Linguistics.
Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. 2007. What is the Jeopardy model? A quasi-synchronous grammar for QA. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 22–32, Prague, Czech Republic, June. Association for Computational Linguistics.
Kristian Woodsend and Mirella Lapata. 2011. Learning to simplify sentences with quasi-synchronous grammar and integer programming. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 409–420, Edinburgh, Scotland, UK, July. Association for Computational Linguistics.
Fei Xia, Rajesh Bhatt, Owen Rambow, Martha Palmer, and Dipti Misra Sharma. 2008. Towards a multi-representational treebank. In Proceedings of the 7th International Workshop on Treebanks and Linguistic Theories.
Nianwen Xue, Fei Xia, Fu-Dong Chiou, and Martha Palmer. 2005. The Penn Chinese Treebank: Phrase structure annotation of a large corpus. In Natural Language Engineering, volume 11, pages 207–238.
Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of IWPT 2003, pages 195–206.
Yue Zhang and Stephen Clark. 2008a. Joint word segmentation and POS tagging using a single perceptron. In Proceedings of ACL-08: HLT, pages 888–896.
Yue Zhang and Stephen Clark. 2008b. A tale of two parsers: Investigating and combining graph-based and transition-based dependency parsing. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 562–571, Honolulu, Hawaii, October. Association for Computational Linguistics.
Yue Zhang and Joakim Nivre. 2011. Transition-based dependency parsing with rich non-local features. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 188–193, Portland, Oregon, USA, June. Association for Computational Linguistics.
Guangyou Zhou, Jun Zhao, Kang Liu, and Li Cai. 2011. Exploiting web-derived selectional preference to improve statistical dependency parsing. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 1556–1565, Portland, Oregon, USA, June. Association for Computational Linguistics.