
46 emnlp-2012-Exploiting Reducibility in Unsupervised Dependency Parsing

Source: pdf

Author: David Mareček ; Zdeněk Žabokrtský

Abstract: The possibility of deleting a word from a sentence without violating its syntactic correctness belongs to traditionally known manifestations of syntactic dependency. We introduce a novel unsupervised parsing approach that is based on a new n-gram reducibility measure. We perform experiments across 18 languages available in CoNLL data and we show that our approach achieves better accuracy for the majority of the languages than previously reported results.

reference text

Phil Blunsom and Trevor Cohn. 2010. Unsupervised induction of tree substitution grammars for dependency parsing. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP ’10, pages 1204–1213, Stroudsburg, PA, USA. Association for Computational Linguistics.
Thorsten Brants. 2000. TnT - A Statistical Part-of-Speech Tagger. Proceedings of the sixth conference on Applied natural language processing, page 8.
Sabine Buchholz and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning, CoNLL-X ’06, pages 149–164, Stroudsburg, PA, USA. Association for Computational Linguistics.
Y. J. Chu and T. H. Liu. 1965. On the Shortest Arborescence of a Directed Graph. Science Sinica, 14:1396–1400.
Shay B. Cohen, Kevin Gimpel, and Noah A. Smith. 2008. Logistic normal priors for unsupervised probabilistic grammar induction. In Neural Information Processing Systems, pages 321–328.
Jason Eisner. 1996. Three New Probabilistic Models for Dependency Parsing: An Exploration. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96), pages 340–345, Copenhagen, August.
Kim Gerdes and Sylvain Kahane. 2011. Defining dependencies (and constituents). In Proceedings of Dependency Linguistics 2011, Barcelona.
Walter R. Gilks, S. Richardson, and David J. Spiegelhalter. 1996. Markov chain Monte Carlo in practice. Interdisciplinary statistics. Chapman & Hall.
Jennifer Gillenwater, Kuzman Ganchev, João Graça, Fernando Pereira, and Ben Taskar. 2011. Posterior Sparsity in Unsupervised Dependency Parsing. The Journal of Machine Learning Research, 12:455–490, February.
Jiří Havelka. 2007. Beyond Projectivity: Multilingual Evaluation of Constraints and Measures on Non-Projective Structures. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pages 608–615.
William P. Headden III, Mark Johnson, and David McClosky. 2009. Improving unsupervised dependency parsing with richer contexts and smoothing. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL ’09, pages 101–109, Stroudsburg, PA, USA. Association for Computational Linguistics.
Dan Klein and Christopher D. Manning. 2004. Corpus-based induction of syntactic structure: models of dependency and constituency. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, ACL ’04, Stroudsburg, PA, USA. Association for Computational Linguistics.
Sandra Kübler, Ryan T. McDonald, and Joakim Nivre. 2009. Dependency Parsing. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.
Markéta Lopatková, Martin Plátek, and Vladislav Kuboň. 2005. Modeling syntax of free word-order languages: Dependency analysis by reduction. In Václav Matoušek, Pavel Mautner, and Tomáš Pavelka, editors, Lecture Notes in Artificial Intelligence, Proceedings of the 8th International Conference, TSD 2005, volume 3658 of Lecture Notes in Computer Science, pages 140–147, Berlin / Heidelberg. Springer.
Martin Majliš and Zdeněk Žabokrtský. 2012. Language richness of the web. In Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey, May. European Language Resources Association (ELRA).
David Mareček and Zdeněk Žabokrtský. 2011. Gibbs Sampling with Treeness constraint in Unsupervised Dependency Parsing. In Proceedings of RANLP Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing, pages 1–8, Hissar, Bulgaria.
Ryan McDonald, Slav Petrov, and Keith Hall. 2011. Multi-source transfer of delexicalized dependency parsers. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 62–72, Edinburgh, Scotland, UK, July. Association for Computational Linguistics.
Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 Shared Task on Dependency Parsing. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, pages 915–932, Prague, Czech Republic, June. Association for Computational Linguistics.
Martin Popel and Zdeněk Žabokrtský. 2010. TectoMT: modular NLP framework. In Proceedings of the 7th international conference on Advances in natural language processing, IceTAL ’10, pages 293–304, Berlin, Heidelberg. Springer-Verlag.
Noah Ashton Smith. 2007. Novel estimation methods for unsupervised discovery of latent structure in natural language text. Ph.D. thesis, Baltimore, MD, USA. AAI3240799.
Valentin I. Spitkovsky, Hiyan Alshawi, Angel X. Chang, and Daniel Jurafsky. 2011a. Unsupervised dependency parsing without gold part-of-speech tags. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011).
Valentin I. Spitkovsky, Hiyan Alshawi, and Daniel Jurafsky. 2011b. Lateen EM: Unsupervised training with multiple objectives, applied to dependency grammar induction. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011).
Valentin I. Spitkovsky, Hiyan Alshawi, and Daniel Jurafsky. 2011c. Punctuation: Making a point in unsupervised dependency parsing. In Proceedings of the Fifteenth Conference on Computational Natural Language Learning (CoNLL-2011).