
270 acl-2013-ParGramBank: The ParGram Parallel Treebank



Author: Sebastian Sulger ; Miriam Butt ; Tracy Holloway King ; Paul Meurer ; Tibor Laczko ; Gyorgy Rakosi ; Cheikh Bamba Dione ; Helge Dyvik ; Victoria Rosen ; Koenraad De Smedt ; Agnieszka Patejuk ; Ozlem Cetinoglu ; I Wayan Arka ; Meladel Mistica

Abstract: This paper discusses the construction of a parallel treebank currently involving ten languages from six language families. The treebank is based on deep LFG (Lexical-Functional Grammar) grammars that were developed within the framework of the ParGram (Parallel Grammar) effort. The grammars produce output that is maximally parallelized across languages and language families. This output forms the basis of a parallel treebank covering a diverse set of phenomena. The treebank is publicly available via the INESS treebanking environment, which also allows for the alignment of language pairs. We thus present a unique, multilayered parallel treebank that represents more and different types of languages than are available in other treebanks, that represents deep linguistic knowledge and that allows for the alignment of sentences at several levels: dependency structures, constituency structures and POS information.


reference text

Mohammed Attia. 2008. A Unified Analysis of Copula Constructions. In Proceedings of the LFG ’08 Conference, pages 89–108. CSLI Publications.
Emily M. Bender, Dan Flickinger, and Stephan Oepen. 2011. Grammar Engineering and Linguistic Hypothesis Testing: Computational Support for Complexity in Syntactic Analysis. In Emily M. Bender and Jennifer E. Arnold, editors, Languages from a Cognitive Perspective: Grammar, Usage and Processing, pages 5–30. CSLI Publications.
Daniel G. Bobrow, Cleo Condoravdi, Dick Crouch, Valeria de Paiva, Lauri Karttunen, Tracy Holloway King, Rowan Nairn, Lottie Price, and Annie Zaenen. 2007. Precision-focused Textual Inference. In Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing.
Cristina Bosco, Manuela Sanguinetti, and Leonardo Lesmo. 2012. The Parallel-TUT: a multilingual and multiformat treebank. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 1932–1938, Istanbul, Turkey. European Language Resources Association (ELRA).
Joan Bresnan. 1982. The Passive in Lexical Theory. In Joan Bresnan, editor, The Mental Representation of Grammatical Relations, pages 3–86. The MIT Press.
Joan Bresnan. 2001. Lexical-Functional Syntax. Blackwell Publishing.
Miriam Butt, Stefanie Dipper, Anette Frank, and Tracy Holloway King. 1999a. Writing Large-Scale Parallel Grammars for English, French and German. In Proceedings of the LFG99 Conference. CSLI Publications.
Miriam Butt, Tracy Holloway King, María-Eugenia Niño, and Frédérique Segond. 1999b. A Grammar Writer’s Cookbook. CSLI Publications.
Miriam Butt, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar Project. In Proceedings of the COLING-2002 Workshop on Grammar Engineering and Evaluation, pages 1–7.
Miriam Butt. 1995. The Structure of Complex Predicates in Urdu. CSLI Publications.
Noam Chomsky. 1988. Lectures on Government and Binding: The Pisa Lectures. Foris Publications.
Noam Chomsky. 1995. The Minimalist Program. MIT Press.
Dick Crouch and Tracy Holloway King. 2006. Semantics via F-structure Rewriting. In Proceedings of the LFG06 Conference, pages 145–165. CSLI Publications.
Dick Crouch, Tracy Holloway King, John T. Maxwell III, Stefan Riezler, and Annie Zaenen. 2004. Exploiting F-structure Input for Sentence Condensation. In Proceedings of the LFG04 Conference, pages 167–187. CSLI Publications.
Dick Crouch, Mary Dalrymple, Ronald M. Kaplan, Tracy Holloway King, John T. Maxwell III, and Paula Newman. 2012. XLE Documentation. Palo Alto Research Center.
Mary Dalrymple, Helge Dyvik, and Tracy Holloway King. 2004. Copular Complements: Closed or Open? In Proceedings of the LFG ’04 Conference, pages 188–198. CSLI Publications.
Mary Dalrymple. 2001. Lexical Functional Grammar, volume 34 of Syntax and Semantics. Academic Press.
Helge Dyvik, Paul Meurer, Victoria Rosén, and Koenraad De Smedt. 2009. Linguistically Motivated Parallel Parsebanks. In Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT8), pages 71–82, Milan, Italy. EDUCatt.
Dan Flickinger, Valia Kordoni, Yi Zhang, António Branco, Kiril Simov, Petya Osenova, Catarina Carvalheiro, Francisco Costa, and Sérgio Castro. 2012. ParDeepBank: Multiple Parallel Deep Treebanking. In Proceedings of the 11th International Workshop on Treebanks and Linguistic Theories (TLT11), pages 97–107, Lisbon. Edições Colibri.
Tracy Holloway King, Richard Crouch, Stefan Riezler, Mary Dalrymple, and Ronald Kaplan. 2003. The PARC700 Dependency Bank. In Proceedings of the EACL03: 4th International Workshop on Linguistically Interpreted Corpora (LINC-03).
Tracy Holloway King, Martin Forst, Jonas Kuhn, and Miriam Butt. 2005. The Feature Space in Parallel Grammar Writing. In Emily M. Bender, Dan Flickinger, Frederik Fouvry, and Melanie Siegel, editors, Research on Language and Computation: Special Issue on Shared Representation in Multilingual Grammar Engineering, volume 3, pages 139–163. Springer.
Natalia Klyueva and David Mareček. 2010. Towards a Parallel Czech-Russian Dependency Treebank. In Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora, Tartu. Northern European Association for Language Technology (NEALT).
Jonas Kuhn and Michael Jellinghaus. 2006. Multilingual Parallel Treebanking: A Lean and Flexible Approach. In Proceedings of the LREC 2006, Genoa, Italy. ELRA/ELDA.
Tibor Laczkó. 2012. On the (Un)Bearable Lightness of Being an LFG Style Copula in Hungarian. In Proceedings of the LFG12 Conference, pages 341–361. CSLI Publications.
Sabine Lehmann, Stephan Oepen, Sylvie Regnier-Prost, Klaus Netter, Veronika Lux, Judith Klein, Kirsten Falkedal, Frederik Fouvry, Dominique Estival, Eva Dauphin, Hervé Compagnion, Judith Baur, Lorna Balkan, and Doug Arnold. 1996. TSNLP Test Suites for Natural Language Processing. In Proceedings of COLING, pages 711–716.
Muhammad Kamran Malik, Tafseer Ahmed, Sebastian Sulger, Tina Bögel, Atif Gulzar, Ghulam Raza, Sarmad Hussain, and Miriam Butt. 2010. Transliterating Urdu for a Broad-Coverage Urdu/Hindi LFG Grammar. In Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010), Valletta, Malta.
Inderjeet Mani and James Pustejovsky. 2004. Temporal Discourse Models for Narrative Structure. In Proceedings of the 2004 ACL Workshop on Discourse Annotation, pages 57–64.
Beáta Megyesi, Bengt Dahlqvist, Éva Á. Csató, and Joakim Nivre. 2010. The English-Swedish-Turkish Parallel Treebank. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10), Valletta, Malta. European Language Resources Association (ELRA).
Tahira Naseem, Regina Barzilay, and Amir Globerson. 2012. Selective Sharing for Multilingual Dependency Parsing. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 629–637, Jeju Island, Korea, July. Association for Computational Linguistics.
Joakim Nivre, Igor Boguslavsky, and Leonid Iomdin. 2008. Parsing the SynTagRus Treebank. In Proceedings of COLING08, pages 641–648.
Rachel Nordlinger and Louisa Sadler. 2007. Verbless Clauses: Revealing the Structure within. In Annie Zaenen, Jane Simpson, Tracy Holloway King, Jane Grimshaw, Joan Maling, and Chris Manning, editors, Architectures, Rules and Preferences: A Festschrift for Joan Bresnan, pages 139–160. CSLI Publications.
Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An Annotated Corpus of Semantic Roles. Computational Linguistics, 31(1):71–106.
Martin Popel and Zdeněk Žabokrtský. 2010. TectoMT: Modular NLP Framework. In Proceedings of the 7th International Conference on Advances in Natural Language Processing (IceTAL 2010), pages 293–304.
Victoria Rosén, Paul Meurer, and Koenraad de Smedt. 2009. LFG Parsebanker: A Toolkit for Building and Searching a Treebank as a Parsed Corpus. In Proceedings of the 7th International Workshop on Treebanks and Linguistic Theories (TLT7), pages 127–133, Utrecht. LOT.
Victoria Rosén, Koenraad De Smedt, Paul Meurer, and Helge Dyvik. 2012. An Open Infrastructure for Advanced Treebanking. In META-RESEARCH Workshop on Advanced Treebanking at LREC2012, pages 22–29, Istanbul, Turkey.
Manuela Sanguinetti and Cristina Bosco. 2011. Building the Multilingual TUT Parallel Treebank. In Proceedings of Recent Advances in Natural Language Processing, pages 19–28.
Sebastian Sulger. 2011. A Parallel Analysis of have-Type Copular Constructions in have-Less Indo-European Languages. In Proceedings of the LFG ’11 Conference. CSLI Publications.
Josef van Genabith and Dick Crouch. 1996. Direct and Underspecified Interpretations of LFG f-structures. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), volume 1, pages 262–267, Copenhagen, Denmark.
Martin Volk, Anne Göhring, Torsten Marek, and Yvonne Samuelsson. 2010. SMULTRON (version 3.0) The Stockholm MULtilingual parallel TReebank. http://www.cl.uzh.ch/research/paralleltreebanks_en.html.