
A Bayesian Model for Unsupervised Semantic Parsing (ACL 2011)

Source: pdf

Author: Ivan Titov ; Alexandre Klementiev

Abstract: We propose a non-parametric Bayesian model for unsupervised semantic parsing. Following Poon and Domingos (2009), we consider a semantic parsing setting where the goal is to (1) decompose the syntactic dependency tree of a sentence into fragments, (2) assign each of these fragments to a cluster of semantically equivalent syntactic structures, and (3) predict predicate-argument relations between the fragments. We use hierarchical Pitman-Yor processes to model statistical dependencies between meaning representations of predicates and those of their arguments, as well as the clusters of their syntactic realizations. We develop a modification of the Metropolis-Hastings split-merge sampler, resulting in an efficient inference algorithm for the model. The method is experimentally evaluated by using the induced semantic representation for the question answering task in the biomedical domain.
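
For readers unfamiliar with the Pitman-Yor process mentioned in the abstract, the following is a minimal sketch of its Chinese-restaurant predictive rule for a single, non-hierarchical process. It is not the authors' model or sampler: the function names `pitman_yor_seat_probs` and `sample_partition` and the parameter names `discount` and `concentration` are assumptions made for illustration only. With n items already clustered into K clusters, an existing cluster of size c is chosen with probability (c - discount) / (n + concentration), and a new cluster with probability (concentration + discount * K) / (n + concentration).

```python
import random
from typing import List, Optional


def pitman_yor_seat_probs(counts: List[int], discount: float,
                          concentration: float) -> List[float]:
    """Predictive cluster probabilities for the next item.

    counts[k] is the size of existing cluster k. The returned list has
    len(counts) + 1 entries; the last one is the probability of opening
    a new cluster. (Illustrative helper, not taken from the paper.)
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")
    n = sum(counts)
    if n == 0:
        # The first item always starts the first cluster.
        return [1.0]
    k = len(counts)
    denom = n + concentration
    probs = [(c - discount) / denom for c in counts]
    probs.append((concentration + discount * k) / denom)
    return probs


def sample_partition(n_items: int, discount: float, concentration: float,
                     seed: Optional[int] = None) -> List[int]:
    """Draw cluster sizes for n_items by seating items one at a time."""
    rng = random.Random(seed)
    counts: List[int] = []
    for _ in range(n_items):
        probs = pitman_yor_seat_probs(counts, discount, concentration)
        choice = rng.choices(range(len(probs)), weights=probs)[0]
        if choice == len(counts):
            counts.append(1)      # open a new cluster
        else:
            counts[choice] += 1   # join an existing cluster
    return counts


if __name__ == "__main__":
    print(pitman_yor_seat_probs([3, 1], discount=0.5, concentration=1.0))
    print(sample_partition(50, discount=0.5, concentration=1.0, seed=0))
```

Setting `discount` to 0 reduces the rule to the Dirichlet-process case; a positive discount shifts probability mass toward new clusters, which produces the heavier-tailed cluster-size distributions that Pitman-Yor priors are usually chosen for.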

reference text

O. Abend, R. Reichart, and A. Rappoport. 2009. Unsupervised argument identification for semantic role labeling. In Proceedings of ACL-IJCNLP, pages 28–36, Singapore.
Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), pages 2670–2676.
Matthew J. Beal, Zoubin Ghahramani, and Carl E. Rasmussen. 2002. The infinite hidden Markov model. In Machine Learning, pages 29–245. MIT Press.
David Blackwell and James B. MacQueen. 1973. Ferguson distributions via Polya urn schemes. The Annals of Statistics, 1(2):353–355.
Xavier Carreras and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. In Proceedings of the 9th Conference on Natural Language Learning, CoNLL-2005, Ann Arbor, MI, USA.
Nathanael Chambers and Dan Jurafsky. 2009. Unsupervised learning of narrative schemas and their participants. In Proc. of the Annual Meeting of the Association for Computational Linguistics and International Joint Conference on Natural Language Processing (ACL-IJCNLP).
James Clarke, Dan Goldwasser, Ming-Wei Chang, and Dan Roth. 2010. Driving semantic parsing from the world’s response. In Proc. of the Conference on Computational Natural Language Learning (CoNLL).
Trevor Cohn, Sharon Goldwater, and Phil Blunsom. 2009. Inducing compact but accurate tree-substitution grammars. In HLT-NAACL, pages 548–556.
David B. Dahl. 2003. An improved merge-split sampler for conjugate Dirichlet process mixture models. Technical Report 1086, Department of Statistics, University of Wisconsin - Madison, November.
Jacob Eisenstein, James Clarke, Dan Goldwasser, and Dan Roth. 2009. Reading to learn: Constructing features from semantic abstracts. In Proceedings of EMNLP.
Thomas S. Ferguson. 1973. A Bayesian analysis of some nonparametric problems. The Annals of Statistics, 1(2):209–230.
C. J. Fillmore, C. R. Johnson, and M. R. L. Petruck. 2003. Background to FrameNet. International Journal of Lexicography, 16:235–250.
Hagen Fürstenau and Mirella Lapata. 2009. Graph alignment for semi-supervised semantic role labeling. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP).
Ruifang Ge and Raymond J. Mooney. 2005. A statistical semantic parser that integrates syntax and semantics. In Proceedings of the Ninth Conference on Computational Natural Language Learning (CONLL-05), Ann Arbor, Michigan.
Daniel Gildea and Daniel Jurafsky. 2002. Automatic labelling of semantic roles. Computational Linguistics, 28(3):245–288.
Dan Goldwasser, Roi Reichart, James Clarke, and Dan Roth. 2011. Confidence driven unsupervised semantic parsing. In Proc. of the Meeting of Association for Computational Linguistics (ACL), Portland, OR, USA.
Trond Grenager and Christopher D. Manning. 2006. Unsupervised discovery of a statistical verb lexicon. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP).
Sonia Jain and Radford Neal. 2000. A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model. Journal of Computational and Graphical Statistics, 13:158–182.
Mark Johnson, Thomas L. Griffiths, and Sharon Goldwater. 2007. Bayesian inference for PCFGs via Markov chain Monte Carlo. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, USA.
Rohit J. Kate and Raymond J. Mooney. 2007. Learning language semantics from ambiguous supervision. In Association for the Advancement of Artificial Intelligence (AAAI), pages 895–900.
Jin-Dong Kim, Tomoko Ohta, Yuka Tateisi, and Jun’ichi Tsujii. 2003. Genia corpus - a semantically annotated corpus for bio-textmining. Bioinformatics, 19:i180–i182.
Joel Lang and Mirella Lapata. 2010. Unsupervised induction of semantic roles. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden.
Percy Liang, Slav Petrov, Michael Jordan, and Dan Klein. 2007. The infinite PCFG using hierarchical Dirichlet processes. In Joint Conf. on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 688–697, Prague, Czech Republic.
Percy Liang, Michael I. Jordan, and Dan Klein. 2009. Learning semantic correspondences with less supervision. In Proc. of the Annual Meeting of the Association for Computational Linguistics and International Joint Conference on Natural Language Processing (ACL-IJCNLP).
Dekang Lin and Patrick Pantel. 2001. DIRT - discovery of inference rules from text. In Proc. of International Conference on Knowledge Discovery and Data Mining, pages 323–328.
Raymond J. Mooney. 2007. Learning for semantic parsing. In Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing, pages 982–991.
Alexis Palmer and Caroline Sporleder. 2010. Evaluating FrameNet-style semantic parsing: the role of coverage gaps in FrameNet. In Proceedings of the Conference on Computational Linguistics (COLING-2010), Beijing.
Jim Pitman. 2002. Poisson-Dirichlet and GEM invariant distributions for split-and-merge transformations of an interval partition. Combinatorics, Probability and Computing, 11:501–514.
Hoifung Poon and Pedro Domingos. 2009. Unsupervised semantic parsing. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP-09).
Matt Richardson and Pedro Domingos. 2006. Markov logic networks. Machine Learning, 62:107–136.
Alan Ritter, Mausam, and Oren Etzioni. 2010. A latent Dirichlet allocation method for selectional preferences. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden.
Diarmuid Ó Séaghdha. 2010. Latent variable models of selectional preference. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden.
R. Swier and S. Stevenson. 2004. Unsupervised semantic role labelling. In Proceedings of EMNLP, pages 95–102, Barcelona, Spain.
Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. 2006. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566–1581.
Y. W. Teh. 2006. A hierarchical Bayesian language model based on Pitman-Yor processes. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 985–992.
Cynthia A. Thompson, Roger Levy, and Christopher D. Manning. 2003. A generative model for semantic role labeling. In Senseval-3, pages 397–408.
Ivan Titov and Mikhail Kozhevnikov. 2010. Bootstrapping semantic analyzers from non-contradictory texts. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden.
Alexander Yates and Oren Etzioni. 2009. Unsupervised methods for determining object and relation synonyms on the web. Journal of Artificial Intelligence Research, 34:255–296.
B. Zapirain, E. Agirre, L. Màrquez, and M. Surdeanu. 2010. Improving semantic role classification with selectional preferences. In Proceedings of the Meeting of the North American chapter of the Association for Computational Linguistics (NAACL 2010), Los Angeles.
Luke Zettlemoyer and Michael Collins. 2005. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammar. In Proceedings of the Twenty-first Conference on Uncertainty in Artificial Intelligence, Edinburgh, UK, August.