
Fast and Accurate Shift-Reduce Constituent Parsing (ACL 2013, paper 155)

Source: pdf

Author: Muhua Zhu ; Yue Zhang ; Wenliang Chen ; Min Zhang ; Jingbo Zhu

Abstract: Shift-reduce dependency parsers give comparable accuracies to their chart-based counterparts, yet the best shift-reduce constituent parsers still lag behind the state-of-the-art. One important reason is the existence of unary nodes in phrase structure trees, which leads to different numbers of shift-reduce actions between different outputs for the same input. This turns out to have a large empirical impact on the framework of global training and beam search. We propose a simple yet effective extension to the shift-reduce process, which eliminates size differences between action sequences in beam search. Our parser gives comparable accuracies to the state-of-the-art chart parsers. With linear run-time complexity, our parser is over an order of magnitude faster than the fastest chart parser.
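The length mismatch described in the abstract can be made concrete with a small sketch. The tree encoding, the action names (`SHIFT`, `UNARY-X`, `REDUCE-X`), and the `NOOP` padding action below are all assumptions made for illustration; the abstract does not specify the mechanism the parser uses, so padding is shown only as one plausible way to equalize sequence lengths.

```python
from typing import List, Union

# Hypothetical encoding: a leaf is a word (str); an internal node is a tuple
# (label, child) for a unary node or (label, left, right) for a binary node.
Tree = Union[str, tuple]


def actions(tree: Tree) -> List[str]:
    """Return a shift-reduce action sequence for a binarized tree (post-order)."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, *children = tree
    seq: List[str] = []
    for child in children:
        seq.extend(actions(child))
    if len(children) == 1:
        seq.append(f"UNARY-{label}")   # each unary node costs one extra action
    elif len(children) == 2:
        seq.append(f"REDUCE-{label}")
    else:
        raise ValueError("tree must be binarized (one or two children per node)")
    return seq


def pad(seq: List[str], length: int, noop: str = "NOOP") -> List[str]:
    """Pad a sequence with a no-op action (an assumption of this sketch)."""
    return seq + [noop] * (length - len(seq))


# Two candidate analyses of the same two-word input.
tree_a = ("S", ("NP", "dogs"), ("VP", "bark"))  # two unary nodes
tree_b = ("S", ("NP", "dogs"), "bark")          # one unary node

seq_a, seq_b = actions(tree_a), actions(tree_b)
target = max(len(seq_a), len(seq_b))
print(len(seq_a), len(seq_b))   # lengths differ before padding
print(pad(seq_b, target))       # lengths agree after padding
```

Without unary nodes, every binarized tree over n words takes exactly 2n - 1 actions; each unary node adds one more, which is why candidates in the same beam can finish at different steps and have scores summed over different numbers of actions.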

reference text

Daniel M. Bikel. 2004. On the parameter space of generative lexicalized statistical parsing models. Ph.D. thesis, University of Pennsylvania.
Bernd Bohnet and Joakim Nivre. 2012. A transition-based system for joint part-of-speech tagging and labeled non-projective dependency parsing. In Proceedings of EMNLP, pages 12–14, Jeju Island, Korea.
Xavier Carreras, Michael Collins, and Terry Koo. 2008. TAG, dynamic programming, and the perceptron for efficient, feature-rich parsing. In Proceedings of CoNLL, pages 9–16, Manchester, England.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and maxent discriminative reranking. In Proceedings of ACL, pages 173–180.
Eugene Charniak. 2000. A maximum-entropy-inspired parser. In Proceedings of NAACL, pages 132–139, Seattle, Washington, USA.
Wenliang Chen, Junichi Kazama, Kiyotaka Uchimoto, and Kentaro Torisawa. 2009. Improving dependency parsing with subtrees from auto-parsed data. In Proceedings of EMNLP, pages 570–579, Singapore.
Wenliang Chen, Min Zhang, and Haizhou Li. 2012. Utilizing dependency language models for graph-based dependency. In Proceedings of ACL, pages 213–222, Jeju, Republic of Korea.
Michael Collins and Brian Roark. 2004. Incremental parsing with the perceptron algorithm. In Proceedings of ACL, Stroudsburg, PA, USA.
Michael Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of ACL, Madrid, Spain.
Michael Collins. 1999. Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania.
Michael Collins. 2000. Discriminative reranking for natural language processing. In Proceedings of ICML, pages 175–182, Stanford, CA, USA.
Hal Daume III. 2006. Practical Structured Learning for Natural Language Processing. Ph.D. thesis, USC.
Zhongqiang Huang and Mary Harper. 2009. Self-training PCFG grammars with latent annotations across languages. In Proceedings of EMNLP, pages 832–841, Singapore.
Liang Huang and Kenji Sagae. 2010. Dynamic programming for linear-time incremental parsing. In Proceedings of ACL, pages 1077–1086, Uppsala, Sweden.
Zhongqiang Huang, Mary Harper, and Slav Petrov. 2010. Self-training with products of latent variable grammars. In Proceedings of EMNLP, pages 12–22, Massachusetts, USA.
Liang Huang. 2008. Forest reranking: discriminative parsing with non-local features. In Proceedings of ACL, pages 586–594, Ohio, USA.
Liang-Ya Huang. 2009. Improve Chinese parsing with Max-Ent reranking parser. In Master Project Report, Brown University.
Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple semi-supervised dependency parsing. In Proceedings of ACL.
J. Lafferty, A. McCallum, and F. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML, pages 282–289, Massachusetts, USA, June.
Percy Liang. 2005. Semi-supervised learning for natural language. Master’s thesis, Massachusetts Institute of Technology.
Mitchell P. Marcus, Beatrice Santorini, and Mary A. Marcinkiewicz. 1993. Building a large annotated corpus of English. Computational Linguistics, 19(2):313–330.
David McClosky, Eugene Charniak, and Mark Johnson. 2006. Effective self-training for parsing. In Proceedings of the HLT/NAACL, Main Conference, pages 152–159, New York City, USA, June.
Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Online large-margin training of dependency parsers. In Proceedings of ACL, pages 91–98, Ann Arbor, Michigan, June.
Joakim Nivre, Johan Hall, and Jens Nilsson. 2006. Maltparser: a data-driven parser-generator for dependency parsing. In Proceedings of LREC, pages 2216–2219.
Slav Petrov and Dan Klein. 2007. Improved inference for unlexicalized parsing. In Proceedings of HLT/NAACL, pages 404–411, Rochester, New York, April.
Adwait Ratnaparkhi. 1997. A linear observed time statistical parser based on maximum entropy models. In Proceedings of EMNLP, Rhode Island, USA.
Kenji Sagae and Alon Lavie. 2005. A classifier-based parser with linear run-time complexity. In Proceedings of IWPT, pages 125–132, Vancouver, Canada.
Kenji Sagae and Alon Lavie. 2006. Parser combination by reparsing. In Proceedings of HLT/NAACL, Companion Volume: Short Papers, pages 129–132, New York, USA.
Libin Shen, Jinxi Xu, and Ralph Weischedel. 2008. A new string-to-dependency machine translation algorithm with a target dependency language model. In Proceedings of ACL, pages 577–585, Ohio, USA.
Weiwei Sun and Hans Uszkoreit. 2012. Capturing paradigmatic and syntagmatic lexical relations: towards accurate Chinese part-of-speech tagging. In Proceedings of ACL, Jeju, Republic of Korea.
Nianwen Xue, Fei Xia, Fu-Dong Chiou, and Martha Palmer. 2005. The Penn Chinese Treebank: phrase structure annotation of a large corpus. Natural Language Engineering, 11(2):207–238.
Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of IWPT, pages 195–206, Nancy, France.
Yue Zhang and Stephen Clark. 2008. Joint word segmentation and POS tagging using a single perceptron. In Proceedings of ACL/HLT, pages 888–896, Columbus, Ohio.
Yue Zhang and Stephen Clark. 2009. Transition-based parsing of the Chinese Treebank using a global discriminative model. In Proceedings of IWPT, Paris, France, October.
Yue Zhang and Joakim Nivre. 2011. Transition-based dependency parsing with rich non-local features. In Proceedings of ACL, pages 188–193, Portland, Oregon, USA.
Muhua Zhu, Jingbo Zhu, and Huizhen Wang. 2012. Exploiting lexical dependencies from large-scale data for better shift-reduce constituency parsing. In Proceedings of COLING, pages 3171–3186, Mumbai, India.