
230 acl-2011-Neutralizing Linguistically Problematic Annotations in Unsupervised Dependency Parsing Evaluation



Author: Roy Schwartz ; Omri Abend ; Roi Reichart ; Ari Rappoport

Abstract: Dependency parsing is a central NLP task. In this paper we show that the common evaluation for unsupervised dependency parsing is highly sensitive to problematic annotations. We show that for three leading unsupervised parsers (Klein and Manning, 2004; Cohen and Smith, 2009; Spitkovsky et al., 2010a), a small set of parameters can be found whose modification yields a significant improvement in standard evaluation measures. These parameters correspond to local cases where no linguistic consensus exists as to the proper gold annotation. Therefore, the standard evaluation does not provide a true indication of algorithm quality. We present a new measure, Neutral Edge Direction (NED), and show that it greatly reduces this undesired phenomenon.


reference text

Taylor Berg-Kirkpatrick, Alexandre Bouchard-Côté, John DeNero and Dan Klein, 2010. Painless unsupervised learning with features. In Proc. of NAACL.
Taylor Berg-Kirkpatrick and Dan Klein, 2010. Phylogenetic Grammar Induction. In Proc. of ACL.
Cristina Bosco and Vincenzo Lombardo, 2004. Dependency and relational structure in treebank annotation. In Proc. of the Workshop on Recent Advances in Dependency Grammar at COLING’04.
Phil Blunsom and Trevor Cohn, 2010. Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing. In Proc. of EMNLP.
Shay B. Cohen, Kevin Gimpel and Noah A. Smith, 2008. Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction. In Proc. of NIPS.
Shay B. Cohen and Noah A. Smith, 2009. Shared Logistic Normal Distributions for Soft Parameter Tying. In Proc. of HLT-NAACL.
Michael J. Collins, 1999. Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.
Alexander Clark, 2001. Unsupervised language acquisition: theory and practice. Ph.D. thesis, University of Sussex.
Hal Daumé III, 2009. Unsupervised search-based structured prediction. In Proc. of ICML.
Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, João V. Graça and Fernando Pereira, 2007. Frustratingly Hard Domain Adaptation for Dependency Parsing. In Proc. of the CoNLL 2007 Shared Task. EMNLP-CoNLL.
Gregory Druck, Gideon Mann and Andrew McCallum, 2009. Semi-supervised learning of dependency parsers using generalized expectation criteria. In Proc. of ACL.
Jennifer Gillenwater, Kuzman Ganchev, João V. Graça, Ben Taskar and Fernando Pereira, 2010. Sparsity in dependency grammar induction. In Proc. of ACL.
William P. Headden III, David McClosky, and Eugene Charniak, 2008. Evaluating Unsupervised Part-of-Speech Tagging for Grammar Induction. In Proc. of COLING.
William P. Headden III, Mark Johnson and David McClosky, 2009. Improving unsupervised dependency parsing with richer contexts and smoothing. In Proc. of HLT-NAACL.
Richard Johansson and Pierre Nugues, 2007. Extended Constituent-to-Dependency Conversion for English. In Proc. of NODALIDA.
Dan Klein, 2005. The unsupervised learning of natural language structure. Ph.D. thesis, Stanford University.
Dan Klein and Christopher Manning, 2004. Corpus-based induction of syntactic structure: Models of dependency and constituency. In Proc. of ACL.
Sandra Kübler, 2005. How Do Treebank Annotation Schemes Influence Parsing Results? Or How Not to Compare Apples And Oranges. In Proc. of RANLP.
Sandra Kübler, R. McDonald and Joakim Nivre, 2009. Dependency Parsing. Morgan And Claypool Publishers.
Mitchell Marcus, Beatrice Santorini and Mary Ann Marcinkiewicz, 1993. Building a large annotated corpus of English: The Penn treebank. Computational Linguistics 19:313-330.
Tahira Naseem, Harr Chen, Regina Barzilay and Mark Johnson, 2010. Using universal linguistic knowledge to guide grammar induction. In Proc. of EMNLP.
Joakim Nivre, 2006. Inductive Dependency Parsing. Springer.
Joakim Nivre, Johan Hall and Jens Nilsson, 2006. MaltParser: A data-driven parser-generator for dependency parsing. In Proc. of LREC-2006.
Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel and Deniz Yuret, 2007. The CoNLL 2007 shared task on dependency parsing. In Proc. of the CoNLL Shared Task, EMNLP-CoNLL, 2007.
Jens Nilsson, Joakim Nivre and Johan Hall, 2006. Graph transformations in data-driven dependency parsing. In Proc. of ACL.
Owen Rambow, Cassandre Creswell, Rachel Szekely, Harriet Tauber and Marilyn Walker, 2002. A dependency treebank for English. In Proc. of LREC.
Noah A. Smith and Jason Eisner, 2005. Guiding unsupervised grammar induction using contrastive estimation. In Proc. of IJCAI.
Noah A. Smith and Jason Eisner, 2006. Annealing structural bias in multilingual weighted grammar induction. In Proc. of ACL.
Valentin I. Spitkovsky, Hiyan Alshawi and Daniel Jurafsky, 2010a. From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing. In Proc. of NAACL-HLT.
Valentin I. Spitkovsky, Hiyan Alshawi and Daniel Jurafsky, 2010b. Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing. In Proc. of ACL.
Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jurafsky and Christopher D. Manning, 2010c. Viterbi training improves unsupervised dependency parsing. In Proc. of CoNLL.
Qin Iris Wang, Dale Schuurmans and Dekang Lin, 2005. Strictly Lexical Dependency Parsing. In IWPT.
Qin Iris Wang, Colin Cherry, Dan Lizotte and Dale Schuurmans, 2006. Improved Large Margin Dependency Parsing via Local Constraints and Laplacian Regularization. In Proc. of CoNLL.
Hiroyasu Yamada and Yuji Matsumoto, 2003. Statistical dependency analysis with support vector machines. In Proc. of the International Workshop on Parsing Technologies.