4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser


Source: pdf

Author: Stephen Tratz ; Eduard Hovy

Abstract: Dependency parsers are critical components within many NLP systems. However, currently available dependency parsers each exhibit at least one of several weaknesses, including high running time, limited accuracy, vague dependency labels, and lack of non-projectivity support. Furthermore, no commonly used parser provides additional shallow semantic interpretation, such as preposition sense disambiguation and noun compound interpretation. In this paper, we present a new dependency-tree conversion of the Penn Treebank along with its associated fine-grained dependency labels and a fast, accurate parser trained on it. We explain how a non-projective extension to shift-reduce parsing can be incorporated into non-directional easy-first parsing. The parser performs well when evaluated on the standard test section of the Penn Treebank, outperforming several popular open-source dependency parsers; it is, to the best of our knowledge, the first dependency parser capable of parsing more than 75 sentences per second at over 93% accuracy.


reference text

Collin Baker, Michael Ellsworth, and Katrin Erk. 2007. SemEval’07 task 19: Frame Semantic Structure Extraction. In Proc. of the 4th International Workshop on Semantic Evaluations.
Collin Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet Project. In Proc. of the 17th International Conference on Computational Linguistics.
Adam L. Berger, Vincent J. Della Pietra, and Stephen A. Della Pietra. 1996. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1):39–71.
Peter F. Brown, Peter V. deSouza, Robert L. Mercer, Vincent J. Della Pietra, and Jenifer C. Lai. 1992. Class-Based n-gram Models of Natural Language. Computational Linguistics, 18(4):467–479.
Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proc. of CoNLL 2006.
Xavier Carreras. 2007. Experiments with a Higher-Order Projective Dependency Parser. In Proc. of the CoNLL Shared Task Session of EMNLP-CoNLL 2007.
Daniel Cer, Marie-Catherine de Marneffe, Daniel Jurafsky, and Christopher D. Manning. 2010. Parsing to Stanford Dependencies: Trade-offs between speed and accuracy. In Proc. of LREC 2010.
Eugene Charniak. 2000. A Maximum-Entropy-Inspired Parser. In Proc. of NAACL 2000.
Eugene Charniak and Mark Johnson. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proc. of ACL 2005.
Michael A. Covington. 2001. A Fundamental Algorithm for Dependency Parsing. In Proc. of the 39th Annual ACM Southeast Conference.
Dipanjan Das, Nathan Schneider, Desai Chen, and Noah A. Smith. 2010. Probabilistic Frame-Semantic Parsing. In Proc. of HLT-NAACL 2010.
Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The Stanford typed dependencies representation. In COLING Workshop on Cross-framework and Cross-domain Parser Evaluation.
Jason Eisner. 1996. Three New Probabilistic Models for Dependency Parsing: An Exploration. In Proc. of COLING 1996.
Christiane Fellbaum. 1998. WordNet: An Electronic Lexical Database. MIT Press.
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling of semantic roles. Computational Linguistics, 28(3):245–288.
Jesús Giménez and Lluís Màrquez. 2004. SVMTool: A General POS Tagger Generator Based on Support Vector Machines. In Proc. of LREC 2004.
Yoav Goldberg and Michael Elhadad. 2010. An Efficient Algorithm for Easy-First Non-Directional Dependency Parsing. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL.
Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, Pavel Straňák, Mihai Surdeanu, Nianwen Xue, and Yi Zhang. 2009. The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages. In Proc. of the Thirteenth Conference on Computational Natural Language Learning: Shared Task.
Dirk Hovy, Stephen Tratz, and Eduard Hovy. 2010. What’s in a Preposition? Dimensions of Sense Disambiguation for an Interesting Word Class. In Proc. of COLING 2010.
Liang Huang and Kenji Sagae. 2010. Dynamic Programming for Linear-Time Shift-Reduce Parsing. In Proc. of ACL 2010.
Richard Johansson and Pierre Nugues. 2007. Extended constituent-to-dependency conversion for English. In Proc. of NODALIDA.
Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple Semi-supervised Dependency Parsing. In Proc. of ACL 2008.
Terry Koo and Michael Collins. 2010. Efficient Third-order Dependency Parsers. In Proc. of ACL 2010.
Ken Litkowski and Orin Hargraves. 2007. SemEval-2007 Task 06: Word-Sense Disambiguation of Prepositions. In Proc. of the 4th International Workshop on Semantic Evaluations.
Christopher D. Manning. 2011. Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics? In Proc. of the 12th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2011).
Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330.
Ryan McDonald, Fernando Pereira, Kiril Ribarov, and Jan Hajič. 2005. Non-Projective Dependency Parsing Using Spanning Tree Algorithms. In Proc. of HLT-EMNLP 2005.
Ryan McDonald and Fernando Pereira. 2006. Online Learning of Approximate Dependency Parsing Algorithms. In Proc. of EACL 2006.
Adam Meyers, Ruth Reeves, Catherine Macleod, Rachel Szekely, Veronika Zielinska, Brian Young, and Ralph Grishman. 2004. The NomBank Project: An Interim Report. In Proc. of the NAACL/HLT Workshop on Frontiers in Corpus Annotation.
Joakim Nivre. 2009. Non-Projective Dependency Parsing in Expected Linear Time. In Proc. of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP.
Joakim Nivre. 2003. An Efficient Algorithm for Projective Dependency Parsing. In Proc. of the 8th International Workshop on Parsing Technologies (IWPT).
Joakim Nivre, Marco Kuhlmann, and Johan Hall. 2009. An Improved Oracle for Dependency Parsing with Online Reordering. In Proc. of the 11th International Conference on Parsing Technologies (IWPT).
Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proc. of EMNLP-CoNLL 2007.
Joakim Nivre, Johan Hall, and Jens Nilsson. 2006. MaltParser: A Data-Driven Parser-Generator for Dependency Parsing. In Proc. of LREC 2006.
Joakim Nivre and Jens Nilsson. 2005. Pseudo-projective dependency parsing. In Proc. of ACL 2005.
Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics, 31(1):71–106.
Slav Petrov and Dan Klein. 2007. Improved Inference for Unlexicalized Parsing. In Proc. of HLT-NAACL 2007.
Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning accurate, compact, and interpretable tree annotation. In Proc. of COLING-ACL 2006.
Brian Roark and Kristy Hollingshead. 2009. Linear complexity context-free parsing pipelines via chart constraints. In Proc. of HLT-NAACL.
Evan Sandhaus. 2008. The New York Times Annotated Corpus. Linguistic Data Consortium, Philadelphia.
Mihai Surdeanu, Richard Johansson, Adam Meyers, Lluís Màrquez, and Joakim Nivre. 2008. The CoNLL-2008 shared task on joint parsing of syntactic and semantic dependencies. In Proc. of the Twelfth Conference on Computational Natural Language Learning.
Jun Suzuki, Hideki Isozaki, Xavier Carreras, and Michael Collins. 2009. An Empirical Study of Semi-supervised Structured Conditional Models for Dependency Parsing. In Proc. of EMNLP.
Pasi Tapanainen and Timo Järvinen. 1997. A non-projective dependency parser. In Proc. of the Fifth Conference on Applied Natural Language Processing.
Stephen Tratz and Dirk Hovy. 2009. Disambiguation of Preposition Sense using Linguistically Motivated Features. In Proc. of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium.
Stephen Tratz and Eduard Hovy. 2010. A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation. In Proc. of ACL 2010.
David Vadas and James R. Curran. 2007. Adding Noun Phrase Structure to the Penn Treebank. In Proc. of ACL 2007.
Hiroyasu Yamada and Yuji Matsumoto. 2003. Statistical Dependency Analysis With Support Vector Machines. In Proc. of 8th International Workshop on Parsing Technologies (IWPT).