
123 emnlp-2012-Syntactic Transfer Using a Bilingual Lexicon


Source: pdf

Author: Greg Durrett ; Adam Pauls ; Dan Klein

Abstract: We consider the problem of using a bilingual dictionary to transfer lexico-syntactic information from a resource-rich source language to a resource-poor target language. In contrast to past work that used bitexts to transfer analyses of specific sentences at the token level, we instead use features to transfer the behavior of words at a type level. In a discriminative dependency parsing framework, our approach produces gains across a range of target languages, using two different low-resource training methodologies (one weakly supervised and one indirectly supervised) and two different dictionary sources (one manually constructed and one automatically constructed).


reference text

Mohit Bansal and Dan Klein. 2011. Web-scale Features for Full-scale Parsing. In Proceedings of ACL, pages 693–702, Portland, Oregon, USA.

Taylor Berg-Kirkpatrick and Dan Klein. 2010. Phylogenetic Grammar Induction. In Proceedings of ACL, pages 1288–1297, Uppsala, Sweden.

Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X Shared Task on Multilingual Dependency Parsing. In Proceedings of CoNLL, pages 149–164.

Shay B. Cohen and Noah A. Smith. 2009. Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction. In Proceedings of NAACL, pages 74–82, Boulder, Colorado.

Shay B. Cohen, Dipanjan Das, and Noah A. Smith. 2011. Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance. In Proceedings of EMNLP, pages 50–61, Edinburgh, UK.

Michael Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania.

Koby Crammer and Yoram Singer. 2001. Ultraconservative Online Algorithms for Multiclass Problems. Journal of Machine Learning Research, 3:2003.

Dipanjan Das and Slav Petrov. 2011. Unsupervised Part-of-Speech Tagging with Bilingual Graph-Based Projections. In Proceedings of ACL, pages 600–609, Portland, Oregon, USA.

Kuzman Ganchev, Jennifer Gillenwater, and Ben Taskar. 2009. Dependency Grammar Induction via Bitext Projection Constraints. In Proceedings of ACL, pages 369–377, Suntec, Singapore.

David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. 2007. English Gigaword Third Edition. Linguistic Data Consortium, Catalog Number LDC2007T07.

Aria Haghighi and Dan Klein. 2006. Prototype-driven Grammar Induction. In Proceedings of COLING-ACL, pages 881–888, Sydney, Australia.

Rebecca Hwa, Philip Resnik, Amy Weinberg, Clara Cabezas, and Okan Kolak. 2005. Bootstrapping Parsers via Syntactic Projection Across Parallel Texts. Natural Language Engineering, 11:311–325, September.

Dan Klein and Christopher D. Manning. 2004. Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency. In Proceedings of ACL, pages 479–486.

Philipp Koehn. 2005. Europarl: A Parallel Corpus for Statistical Machine Translation. In MT Summit X, pages 79–86, Phuket, Thailand. AAMT.

Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple Semi-Supervised Dependency Parsing. In Proceedings of ACL.

Percy Liang, Ben Taskar, and Dan Klein. 2006. Alignment by Agreement. In Proceedings of NAACL, New York, New York.

Mitchell P. Marcus, Mary Ann Marcinkiewicz, and Beatrice Santorini. 1993. Building a Large Annotated Corpus of English: the Penn Treebank. Computational Linguistics, 19:313–330, June.

Ryan McDonald, Koby Crammer, and Fernando Pereira. 2005. Online Large-Margin Training of Dependency Parsers. In Proceedings of ACL, pages 91–98, Ann Arbor, Michigan.

Ryan McDonald, Slav Petrov, and Keith Hall. 2011. Multi-Source Transfer of Delexicalized Dependency Parsers. In Proceedings of EMNLP, pages 62–72, Edinburgh, Scotland, UK.

Tahira Naseem, Harr Chen, Regina Barzilay, and Mark Johnson. 2010. Using Universal Linguistic Knowledge to Guide Grammar Induction. In Proceedings of EMNLP, pages 1234–1244, Cambridge, Massachusetts.

Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 Shared Task on Dependency Parsing. In Proceedings of EMNLP-CoNLL, pages 915–932, Prague, Czech Republic.

Joakim Nivre. 2008. Algorithms for Deterministic Incremental Dependency Parsing. Computational Linguistics, 34:513–553, December.

Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning Accurate, Compact, and Interpretable Tree Annotation. In Proceedings of ACL, pages 433–440, Sydney, Australia.

Slav Petrov, Dipanjan Das, and Ryan McDonald. 2011. A Universal Part-of-Speech Tagset. In ArXiv, April.

David A. Smith and Jason Eisner. 2009. Parser Adaptation and Projection with Quasi-Synchronous Grammar Features. In Proceedings of EMNLP, pages 822–831, Suntec, Singapore.

Oscar Täckström, Ryan McDonald, and Jakob Uszkoreit. 2012. Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure. In Proceedings of NAACL, Montreal, Canada.

Wikimedia Foundation. 2012. Wiktionary. Online at http://www.wiktionary.org/.

Guangyou Zhou, Jun Zhao, Kang Liu, and Li Cai. 2011. Exploiting Web-Derived Selectional Preference to Improve Statistical Dependency Parsing. In Proceedings of ACL, pages 1556–1565, Portland, Oregon, USA.