146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance

Authors: Shay B. Cohen; Dipanjan Das; Noah A. Smith

Abstract: We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.
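The phrase "locally mixes between supervised models" can be illustrated with a small sketch. This is one possible reading, not the authors' implementation. It assumes each helper language supplies a probability distribution for the same context (here, tag-transition distributions), and that the target-language distribution for that context is a convex combination of them with its own set of mixture weights. The function name `mix_distributions`, the tag names, and all numbers are hypothetical. How the weights would be estimated from unlabeled data is not shown.

```python
def mix_distributions(helper_dists, weights):
    """Return the convex combination of several distributions over the same events.

    helper_dists: list of dicts mapping event -> probability (one per helper language).
    weights: list of non-negative mixture weights that sum to 1.
    """
    if len(helper_dists) != len(weights):
        raise ValueError("need exactly one weight per helper distribution")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Collect every event that any helper language assigns probability to.
    events = set().union(*helper_dists)
    return {
        e: sum(w * d.get(e, 0.0) for w, d in zip(weights, helper_dists))
        for e in events
    }


# Hypothetical transition distributions from two helper languages,
# indexed by the conditioning context (the previous tag).
helpers = {
    "DET": [{"NOUN": 0.7, "ADJ": 0.3}, {"NOUN": 0.5, "ADJ": 0.5}],
    "ADJ": [{"NOUN": 0.9, "ADJ": 0.1}, {"NOUN": 0.6, "ADJ": 0.4}],
}

# "Local" mixing in this sketch: each context gets its own weights,
# rather than one global weight per helper language.
local_weights = {"DET": [0.25, 0.75], "ADJ": [0.5, 0.5]}

target = {
    ctx: mix_distributions(helpers[ctx], local_weights[ctx]) for ctx in helpers
}

for ctx, dist in sorted(target.items()):
    print(ctx, {event: round(p, 3) for event, p in sorted(dist.items())})
```

Because each context's weights sum to 1 and every helper distribution sums to 1, each mixed distribution is itself a valid probability distribution, so it can be used in the same kind of model as the helper distributions.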
