
368 acl-2013-Universal Dependency Annotation for Multilingual Parsing


Source: pdf

Author: Ryan McDonald ; Joakim Nivre ; Yvonne Quirmbach-Brundage ; Yoav Goldberg ; Dipanjan Das ; Kuzman Ganchev ; Keith Hall ; Slav Petrov ; Hao Zhang ; Oscar Täckström ; Claudia Bedini ; Núria Bertomeu Castelló ; Jungmee Lee

Abstract: We present a new collection of treebanks with homogeneous syntactic dependency annotation for six languages: German, English, Swedish, Spanish, French and Korean. To show the usefulness of such a resource, we present a case study of crosslingual transfer parsing with more reliable evaluation than has been possible before. This ‘universal’ treebank is made freely available in order to facilitate research on multilingual dependency parsing.


reference text

Alena Böhmová, Jan Hajič, Eva Hajičová, and Barbora Hladká. 2003. The Prague Dependency Treebank: A three-level annotation scenario. In Anne Abeillé, editor, Treebanks: Building and Using Parsed Corpora, pages 103–127. Kluwer.
Sabine Brants, Stefanie Dipper, Silvia Hansen, Wolfgang Lezius, and George Smith. 2002. The TIGER Treebank. In Proceedings of the Workshop on Treebanks and Linguistic Theories.
Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of CoNLL.
Miriam Butt, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The parallel grammar project. In Proceedings of the 2002 workshop on Grammar engineering and evaluation-Volume 15.
Pi-Chuan Chang, Huihsin Tseng, Dan Jurafsky, and Christopher D. Manning. 2009. Discriminative reordering with Chinese grammatical relations features. In Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009.
Dipanjan Das and Slav Petrov. 2011. Unsupervised part-of-speech tagging with bilingual graph-based projections. In Proceedings of ACL-HLT.
Marie-Catherine de Marneffe and Christopher D. Manning. 2008. The Stanford typed dependencies representation. In Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation.
Marie-Catherine de Marneffe, Bill MacCartney, and Chris D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC.
Tomaž Erjavec. 2012. MULTEXT-East: Morphosyntactic resources for Central and Eastern European languages. Language Resources and Evaluation, 46:131–142.
Kuzman Ganchev, Jennifer Gillenwater, and Ben Taskar. 2009. Dependency grammar induction via bitext projection constraints. In Proceedings of ACL-IJCNLP.
Douwe Gelling, Trevor Cohn, Phil Blunsom, and João Graça. 2012. The PASCAL challenge on grammar induction. In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure.
Jan Hajič, Barbora Vidová Hladká, Jarmila Panevová, Eva Hajičová, Petr Sgall, and Petr Pajas. 2001. Prague Dependency Treebank 1.0. LDC, 2001T10.
Katri Haverinen, Timo Viljanen, Veronika Laippala, Samuel Kohonen, Filip Ginter, and Tapio Salakoski. 2010. Treebanking Finnish. In Proceedings of The Ninth International Workshop on Treebanks and Linguistic Theories (TLT9).
Stephen Helmreich, David Farwell, Bonnie Dorr, Nizar Habash, Lori Levin, Teruko Mitamura, Florence Reeder, Keith Miller, Eduard Hovy, Owen Rambow, and Advaith Siddharthan. 2004. Interlingual annotation of multilingual text corpora. In Proceedings of the HLT-EACL Workshop on Frontiers in Corpus Annotation.
Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel. 2006. OntoNotes: the 90% solution. In Proceedings of NAACL.
Rebecca Hwa, Philip Resnik, Amy Weinberg, Clara Cabezas, and Okan Kolak. 2005. Bootstrapping parsers via syntactic projection across parallel texts. Natural Language Engineering, 11(03):311–325.
Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proceedings of ACL.
Dan Klein and Chris D. Manning. 2004. Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of ACL.
Sandra Kübler, Ryan McDonald, and Joakim Nivre. 2009. Dependency Parsing. Morgan and Claypool.
Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330.
Ryan McDonald, Slav Petrov, and Keith Hall. 2011. Multi-source transfer of delexicalized dependency parsers. In Proceedings of EMNLP.
Jens Nilsson, Joakim Nivre, and Johan Hall. 2007. Generalizing tree transformations for inductive dependency parsing. In Proceedings of ACL.
Joakim Nivre and Beáta Megyesi. 2007. Bootstrapping a Swedish treebank using cross-corpus harmonization and annotation projection. In Proceedings of the 6th International Workshop on Treebanks and Linguistic Theories.
Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proceedings of EMNLP-CoNLL.
Slav Petrov, Dipanjan Das, and Ryan McDonald. 2012. A universal part-of-speech tagset. In Proceedings of LREC.
Mojgan Seraji, Beáta Megyesi, and Joakim Nivre. 2012. Bootstrapping a Persian dependency treebank. Linguistic Issues in Language Technology, 7(18):1–10.
David A. Smith and Jason Eisner. 2009. Parser adaptation and projection with quasi-synchronous grammar features. In Proceedings of EMNLP.
Oscar Täckström, Dipanjan Das, Slav Petrov, Ryan McDonald, and Joakim Nivre. 2013. Token and type constraints for cross-lingual part-of-speech tagging. Transactions of the ACL.
Ulf Teleman. 1974. Manual för grammatisk beskrivning av talad och skriven svenska. Studentlitteratur.
Reut Tsarfaty. 2013. A unified morpho-syntactic scheme of Stanford dependencies. In Proceedings of ACL.
Daniel Zeman, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský, and Jan Hajič. 2012. HamleDT: To parse or not to parse. In Proceedings of LREC.
Yue Zhang and Joakim Nivre. 2011. Transition-based dependency parsing with rich non-local features. In Proceedings of ACL-HLT.
Yuan Zhang, Roi Reichart, Regina Barzilay, and Amir Globerson. 2012. Learning to map into a universal POS tagset. In Proceedings of EMNLP.