acl acl2013 acl2013-44 acl2013-44-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jonathan K. Kummerfeld ; Daniel Tse ; James R. Curran ; Dan Klein
Abstract: Aspects of Chinese syntax result in a distinctive mix of parsing challenges. However, the contribution of individual sources of error to overall difficulty is not well understood. We conduct a comprehensive automatic analysis of error types made by Chinese parsers, covering a broad range of error types for large sets of sentences, enabling the first empirical ranking of Chinese error types by their performance impact. We also investigate which error types are resolved by using gold part-of-speech tags, showing that improving Chinese tagging only addresses certain error types, leaving substantial outstanding challenges.
Daniel M. Bikel and David Chiang. 2000. Two Statistical Parsing Models Applied to the Chinese Treebank. In Proceedings of the Second Chinese Language Processing Workshop, pages 1–6. Hong Kong, China.
Martin Forst and Ji Fang. 2009. TBL-improved non-deterministic segmentation and POS tagging for a Chinese parser. In Proceedings of the 12th Conference of the European Chapter of the ACL, pages 264–272. Athens, Greece.
Yuqing Guo, Haifeng Wang, and Josef van Genabith. 2007. Recovering Non-Local Dependencies for Chinese. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 257–266. Prague, Czech Republic.
Wenbin Jiang, Liang Huang, and Qun Liu. 2009. Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging A Case Study. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, volume 1, pages 522–530. Suntec, Singapore.
Dan Klein and Christopher D. Manning. 2003a. Accurate Unlexicalized Parsing. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 423–430. Sapporo, Japan.
Dan Klein and Christopher D. Manning. 2003b. Fast Exact Inference with a Factored Model for Natural Language Parsing. In Advances in Neural Information Processing Systems 15, pages 3–10. MIT Press, Cambridge, MA.
Jonathan K. Kummerfeld, David Hall, James R. Curran, and Dan Klein. 2012. Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 1048–1059. Jeju Island, South Korea.
Roger Levy and Christopher Manning. 2003. Is it harder to parse Chinese, or the Chinese Treebank? In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, pages 439–446. Sapporo, Japan.
Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, 19(2):313–330.
Slav Petrov. 2010. Products of Random Latent Variable Grammars. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 19–27. Los Angeles, California.
Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, pages 433–440. Sydney, Australia.
Slav Petrov and Dan Klein. 2007. Improved Inference for Unlexicalized Parsing. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference, pages 404–411. Rochester, New York, USA.
Xian Qian and Yang Liu. 2012. Joint Chinese word segmentation, POS tagging and parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 501–511. Jeju Island, Korea.
Daniel Tse and James R. Curran. 2012. The Challenges of Parsing Chinese with Combinatory Categorial Grammar. In Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 295–304. Montréal, Canada.
Deyi Xiong, Shuanglong Li, Qun Liu, Shouxun Lin, and Yueliang Qian. 2005. Parsing the Penn Chinese Treebank with semantic knowledge. In Proceedings of the Second international joint conference on Natural Language Processing, pages 70–81. Jeju Island, Korea.
Nianwen Xue, Fei Xia, Fu-Dong Chiou, and Martha Palmer. 2005. The Penn Chinese TreeBank: Phrase structure annotation of a large corpus. Natural Language Engineering, 11(2):207–238.
Yue Zhang and Stephen Clark. 2009. Transition-Based Parsing of the Chinese Treebank using a Global Discriminative Model. In Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09), pages 162–171. Paris, France.