
37 emnlp-2011-Cross-Cutting Models of Lexical Semantics

Source: pdf

Author: Joseph Reisinger ; Raymond Mooney

Abstract: Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superficial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM finds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.
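The idea of cross-cutting similarity in the abstract can be illustrated with a minimal sketch. This is not the MVM inference procedure: the partition of features into views is fixed by hand here (MVM infers it), and every feature name, count, and function below is a hypothetical chosen only to mirror the lung/breath and authoritative/superficial example.

```python
import math

# Hypothetical feature-to-view assignment. In MVM this partition is inferred
# from data; here it is fixed by hand purely for illustration.
VIEWS = {
    "thematic": ["doc:medicine", "doc:politics"],
    "syntactic": ["dep:modifies_noun", "dep:subject_of_verb"],
}

# Invented toy co-occurrence counts: word -> {feature: count}.
COUNTS = {
    "lung": {"doc:medicine": 9, "doc:politics": 0,
             "dep:modifies_noun": 1, "dep:subject_of_verb": 6},
    "breath": {"doc:medicine": 7, "doc:politics": 1,
               "dep:modifies_noun": 0, "dep:subject_of_verb": 5},
    "authoritative": {"doc:medicine": 1, "doc:politics": 8,
                      "dep:modifies_noun": 9, "dep:subject_of_verb": 0},
    "superficial": {"doc:medicine": 6, "doc:politics": 1,
                    "dep:modifies_noun": 8, "dep:subject_of_verb": 0},
}


def cosine(u, v):
    """Cosine similarity of two equal-length count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def view_similarity(w1, w2, view):
    """Similarity of two words restricted to the features assigned to one view."""
    feats = VIEWS[view]
    u = [COUNTS[w1].get(f, 0) for f in feats]
    v = [COUNTS[w2].get(f, 0) for f in feats]
    return cosine(u, v)


if __name__ == "__main__":
    for pair in [("lung", "breath"), ("authoritative", "superficial")]:
        for view in VIEWS:
            print(pair, view, round(view_similarity(*pair, view), 3))
```

With these toy counts, authoritative and superficial come out close in the syntactic view but far apart in the thematic view, which is the kind of disagreement between views that a single clustering cannot represent.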

reference text

David Blei, Thomas Griffiths, Michael Jordan, and Joshua Tenenbaum. 2003. Hierarchical topic models and the nested Chinese restaurant process. In Proc. NIPS-2003.
Peter F. Brown, Peter V. deSouza, Robert L. Mercer, Vincent J. Della Pietra, and Jenifer C. Lai. 1992. Class-based n-gram models of natural language. Computational Linguistics, 18:467–479.
Jonathan Chang, Jordan Boyd-Graber, Chong Wang, Sean Gerrish, and David M. Blei. 2009. Reading tea leaves: How humans interpret topic models. In NIPS.
James Curran. 2004. From Distributional to Semantic Similarity. Ph.D. thesis, University of Edinburgh.
Katrin Erk. 2007. A simple, similarity-based model for selectional preferences. In Proc. of the ACL. Association for Computational Linguistics.
Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, and Eytan Ruppin. 2001. Placing search in context: the concept revisited. In Proc. of WWW 2001.
James Gorman and James R. Curran. 2006. Scaling distributional similarity to large corpora. In Proc. of ACL 2006.
Thomas L. Griffiths, Mark Steyvers, and Joshua B. Tenenbaum. 2007. Topics in semantic representation. Psychological Review, 114:2007.
Thomas Landauer and Susan Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2):211–240.
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press.
Vikash K. Mansinghka, Eric Jonas, Cap Petschulat, Beau Cronin, Patrick Shafto, and Joshua B. Tenenbaum. 2009. Cross-categorization: A method for discovering multiple overlapping clusterings. In Proc. of Nonparametric Bayes Workshop at NIPS 2009.
Diana McCarthy and Roberto Navigli. 2007. SemEval-2007 task 10: English lexical substitution task. In SemEval ’07: Proceedings of the 4th International Workshop on Semantic Evaluations. Association for Computational Linguistics.
George A. Miller and Walter G. Charles. 1991. Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1–28.
David Mimno, Wei Li, and Andrew McCallum. 2007. Mixtures of hierarchical topics with pachinko allocation. In ICML.
Gregory L. Murphy. 2002. The Big Book of Concepts. The MIT Press.
Donglin Niu, Jennifer G. Dy, and Michael I. Jordan. 2010. Multiple non-redundant spectral clustering views. In Johannes Fürnkranz and Thorsten Joachims, editors, Proceedings of the 27th International Conference on Machine Learning (ICML-10), pages 831–838.
Sebastian Padó and Mirella Lapata. 2007. Dependency-based construction of semantic space models. Computational Linguistics, 33(2):161–199.
Joseph Reisinger and Raymond J. Mooney. 2010. A mixture model with sharing for lexical semantics. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2010).
Philip Resnik. 1997. Selectional preference and sense disambiguation. In Proceedings of ACL SIGLEX Workshop on Tagging Text with Lexical Semantics, pages 52–57. ACL.
Hinrich Schütze. 1998. Automatic word sense discrimination. Computational Linguistics, 24(1):97–123.
Patrick Shafto, Charles Kemp, Vikash Mansinghka, Matthew Gordon, and Joshua B. Tenenbaum. 2006. Learning cross-cutting systems of categories. In Proc. CogSci 2006.
Rion Snow, Daniel Jurafsky, and Andrew Ng. 2006. Semantic taxonomy induction from heterogenous evidence. In Proc. of ACL 2006.
Joseph Turian, Lev Ratinov, and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proc. of the ACL.
Peter D. Turney. 2006. Similarity of semantic relations. Computational Linguistics, 32(3):379–416.
Amos Tversky and Itamar Gati. 1982. Similarity, separability, and the triangle inequality. Psychological Review, 89(2):123–154.
Benjamin Van Durme and Marius Paşca. 2008. Finding cars, goddesses and enzymes: Parametrizable acquisition of labeled instances for open-domain information extraction. In Proc. of AAAI 2008.