
116 emnlp-2012-Semantic Compositionality through Recursive Matrix-Vector Spaces


Source: pdf

Author: Richard Socher; Brody Huval; Christopher D. Manning; Andrew Y. Ng

Abstract: Single-word vector space models have been very successful at learning lexical information. However, they cannot capture the compositional meaning of longer phrases, which prevents them from reaching a deeper understanding of language. We introduce a recursive neural network (RNN) model that learns compositional vector representations for phrases and sentences of arbitrary syntactic type and length. Our model assigns a vector and a matrix to every node in a parse tree: the vector captures the inherent meaning of the constituent, while the matrix captures how it changes the meaning of neighboring words or phrases. This matrix-vector RNN can learn the meaning of operators in propositional logic and natural language. The model obtains state-of-the-art performance on three different experiments: predicting fine-grained sentiment distributions of adverb-adjective pairs; classifying sentiment labels of movie reviews; and classifying semantic relationships such as cause-effect or topic-message between nouns, using the syntactic path between them.

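The abstract describes a composition step in which each parse-tree node carries both a vector (its own meaning) and a matrix (how it modifies a neighbor's meaning). The snippet below is a minimal NumPy sketch of one such matrix-vector composition, assuming a tanh nonlinearity and toy dimensions; the parameter names `W` and `W_M`, the random initialization, and the `compose` helper are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

n = 4  # illustrative embedding size (assumption, not taken from the paper text)
rng = np.random.default_rng(0)

def compose(a, A, b, B, W, W_M):
    """Combine two constituents (a, A) and (b, B) into a parent (p, P).

    Each constituent carries a vector (its own meaning) and a matrix
    (how it transforms a neighbor's meaning), as described in the abstract.
    """
    # Each child's matrix first modifies the other child's vector,
    # then a shared weight matrix W mixes the two modified vectors.
    p = np.tanh(W @ np.concatenate([B @ a, A @ b]))
    # The parent's matrix is a linear map of the stacked child matrices.
    P = W_M @ np.vstack([A, B])
    return p, P

# Toy parameters and two toy word representations.
W = rng.standard_normal((n, 2 * n)) * 0.1
W_M = rng.standard_normal((n, 2 * n)) * 0.1
a, b = rng.standard_normal(n), rng.standard_normal(n)
A, B = np.eye(n), np.eye(n)  # identity matrices as neutral modifiers

p, P = compose(a, A, b, B, W, W_M)
print(p.shape, P.shape)  # (4,) and (4, 4)
```

Applied bottom-up over a parse tree, a step like this would assign a vector and a matrix to every internal node; a classifier on the resulting node vectors could then produce the sentiment or relation predictions mentioned in the abstract.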

reference text

M. Baroni and A. Lenci. 2010. Distributional memory: A general framework for corpus-based semantics. Computational Linguistics, 36(4):673–721.
M. Baroni and R. Zamparelli. 2010. Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. In EMNLP.
L. Bottou. 2011. From machine learning to machine reasoning. CoRR, abs/1102.1808.
M. Ciaramita and Y. Altun. 2006. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In EMNLP.
S. Clark and S. Pulman. 2007. Combining symbolic and distributional models of meaning. In Proceedings of the AAAI Spring Symposium on Quantum Interaction, pages 52–55.
R. Collobert and J. Weston. 2008. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML.
J. Curran. 2004. From Distributional to Semantic Similarity. Ph.D. thesis, University of Edinburgh.
J. L. Elman. 1991. Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7(2-3).
G. Frege. 1892. Über Sinn und Bedeutung. In Zeitschrift für Philosophie und philosophische Kritik, 100.
D. Garrette, K. Erk, and R. Mooney. 2011. Integrating logical representations with probabilistic information using Markov logic. In Proceedings of the International Conference on Computational Semantics.
C. Goller and A. Küchler. 1996. Learning task-dependent distributed representations by backpropagation through structure. In Proceedings of the International Conference on Neural Networks (ICNN-96).
E. Grefenstette and M. Sadrzadeh. 2011. Experimental support for a categorical compositional distributional model of meaning. In EMNLP.
T. L. Griffiths, J. B. Tenenbaum, and M. Steyvers. 2007. Topics in semantic representation. Psychological Review, 114.
I. Hendrickx, S. N. Kim, Z. Kozareva, P. Nakov, D. Ó Séaghdha, S. Padó, M. Pennacchiotti, L. Romano, and S. Szpakowicz. 2010. SemEval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals. In Proceedings of the 5th International Workshop on Semantic Evaluation.
G. E. Hinton. 1990. Mapping part-whole hierarchies into connectionist networks. Artificial Intelligence, 46(1-2).
R. Jones, B. Rey, O. Madani, and W. Greiner. 2006. Generating query substitutions. In Proceedings of the 15th International Conference on World Wide Web.
D. Klein and C. D. Manning. 2003. Accurate unlexicalized parsing. In ACL.
D. Lin. 1998. Automatic retrieval and clustering of similar words. In Proceedings of COLING-ACL, pages 768–774.
E. J. Metcalfe. 1990. A compositive holographic associative recall model. Psychological Review, 88:627–661.
J. Mitchell and M. Lapata. 2010. Composition in distributional models of semantics. Cognitive Science, 34(8):1388–1429.
R. Montague. 1974. English as a formal language. Linguaggi nella Società e nella Tecnica, pages 189–224.
T. Nakagawa, K. Inui, and S. Kurohashi. 2010. Dependency tree-based sentiment classification using CRFs with hidden variables. In NAACL-HLT.
M. Paşca, D. Lin, J. Bigham, A. Lifchits, and A. Jain. 2006. Names and similarities on the web: fact extraction in the fast lane. In ACL.
S. Padó and M. Lapata. 2007. Dependency-based construction of semantic space models. Computational Linguistics, 33(2):161–199.
B. Pang and L. Lee. 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In ACL, pages 115–124.
T. A. Plate. 1995. Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3):623–641.
J. B. Pollack. 1990. Recursive distributed representations. Artificial Intelligence, 46, November.
C. Potts. 2010. On the negativity of negation. In David Lutz and Nan Li, editors, Proceedings of Semantics and Linguistic Theory 20. CLC Publications, Ithaca, NY.
L. Ratinov, D. Roth, D. Downey, and M. Anderson. 2011. Local and global algorithms for disambiguation to Wikipedia. In ACL.
B. Rink and S. Harabagiu. 2010. UTD: Classifying semantic relations by combining lexical and semantic resources. In Proceedings of the 5th International Workshop on Semantic Evaluation.
S. Rudolph and E. Giesbrecht. 2010. Compositional matrix-space models of language. In ACL.
H. Schütze. 1998. Automatic word sense discrimination. Computational Linguistics, 24:97–124.
R. Socher, C. D. Manning, and A. Y. Ng. 2010. Learning continuous phrase representations and syntactic parsing with recursive neural networks. In Proceedings of the NIPS-2010 Deep Learning and Unsupervised Feature Learning Workshop.
R. Socher, E. H. Huang, J. Pennington, A. Y. Ng, and C. D. Manning. 2011a. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In NIPS. MIT Press.
R. Socher, C. Lin, A. Y. Ng, and C. D. Manning. 2011b. Parsing natural scenes and natural language with recursive neural networks. In ICML.
R. Socher, J. Pennington, E. H. Huang, A. Y. Ng, and C. D. Manning. 2011c. Semi-supervised recursive autoencoders for predicting sentiment distributions. In EMNLP.
P. D. Turney and P. Pantel. 2010. From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37:141–188.
D. Widdows. 2008. Semantic vector products: Some initial investigations. In Proceedings of the Second AAAI Symposium on Quantum Interaction.
A. Yessenalina and C. Cardie. 2011. Compositional matrix-space models for sentiment analysis. In EMNLP.
F. M. Zanzotto, I. Korkontzelos, F. Fallucchi, and S. Manandhar. 2010. Estimating linear models for compositional distributional semantics. In COLING.