acl acl2013 acl2013-103 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Georgiana Dinu ; Nghia The Pham ; Marco Baroni
Abstract: We introduce DISSECT, a toolkit to build and explore computational models of word, phrase and sentence meaning based on the principles of distributional semantics. The toolkit focuses in particular on compositional meaning, and implements a number of composition methods that have been proposed in the literature. Furthermore, DISSECT can be useful to researchers and practitioners who need models of word meaning (without composition) as well, as it supports various methods to construct distributional semantic spaces, assessing similarity and even evaluating against benchmarks, that are independent of the composition infrastructure.
Reference: text
sentIndex sentText sentNum sentScore
1 DISSECT - DIStributional SEmantics Composition Toolkit Georgiana Dinu and Nghia The Pham and Marco Baroni Center for Mind/Brain Sciences (University of Trento, Italy) [sent-1, score-0.062]
2 Abstract We introduce DISSECT, a toolkit to build and explore computational models of word, phrase and sentence meaning based on the principles of distributional semantics. [sent-3, score-0.18]
3 The toolkit focuses in particular on compositional meaning, and implements a number of composition methods that have been proposed in the literature. [sent-4, score-0.438]
4 1 Introduction Distributional methods for meaning similarity are based on the observation that similar words occur in similar contexts and measure similarity based on patterns of word occurrence in large corpora (Clark, 2012; Erk, 2012; Turney and Pantel, 2010). [sent-6, score-0.154]
5 Semantic relatedness is assessed by comparing vectors, leading, for example, to the conclusion that car and vehicle are very similar in meaning, since they have similar contextual distributions. [sent-8, score-0.102]
6 Despite the appeal of these methods, modeling words in isolation has limited applications and ideally we want to model semantics beyond word level by representing the meaning of phrases or sentences. [sent-9, score-0.052]
7 These combinations are infinite and compositional methods are called for to derive the meaning of a larger construction from the meaning of its parts. [sent-10, score-0.22]
8 For this reason, the question of compositionality within the distributional [sent-11, score-0.176]
9 paradigm has received a lot of attention in recent years and a number of compositional frameworks have been proposed in the distributional semantic literature, see, e.g., ... [sent-14, score-0.312]
10 For example, in such frameworks, the distributional representations of red and car may be combined, through various operations, in order to obtain a vector for red car. [sent-18, score-0.401]
11 DISSECT (http://clic.cimec.unitn.it/composes/toolkit) is, to the best of our knowledge, the first toolkit to provide an easy-to-use implementation of many compositional methods proposed in the literature. [sent-22, score-0.509]
12 As such, we hope that it will foster further work on compositional distributional semantics, as well as make the relevant techniques easily available to those interested in their many potential applications, e.g., ... [sent-23, score-0.244]
13 Moreover, the DISSECT tools to construct distributional semantic spaces from raw co-occurrence counts, to measure similarity and to evaluate these spaces might also be of use to researchers who are not interested in the compositional framework. [sent-26, score-0.55]
14 2 Building and composing distributional semantic representations The pipeline from corpora to compositional models of meaning can be roughly summarized as consisting of three stages:1 1. [sent-28, score-0.404]
15 Extraction of co-occurrence counts from corpora. In this stage, an input corpus is used to extract counts of target elements co-occurring with some contextual features. [sent-29, score-0.131]
16 The target elements can vary from words (for lexical similarity) to pairs of words [sent-30, score-0.045]
17 (e.g., for relation categorization). 1See Turney and Pantel (2010) for a technical overview of distributional methods for semantics. [sent-32, score-0.128]
18 2. Transformation of the raw counts. This stage may involve the application of weighting schemes such as Pointwise Mutual Information, feature selection, dimensionality reduction methods such as Singular Value Decomposition, etc. [sent-38, score-0.179]
19 The goal is to eliminate the biases that typically affect raw counts and to produce vectors which better approximate similarity in meaning. [sent-39, score-0.165]
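As an illustration of this stage, here is a minimal numpy sketch of positive Pointwise Mutual Information weighting, written independently of the DISSECT API (the function name and the in-memory matrix layout are assumptions made for this example):

    import numpy as np

    def ppmi(counts):
        # counts: raw co-occurrence matrix (rows: targets, columns: contexts)
        counts = np.asarray(counts, dtype=float)
        total = counts.sum()
        row = counts.sum(axis=1, keepdims=True)  # target marginals
        col = counts.sum(axis=0, keepdims=True)  # context marginals
        with np.errstate(divide="ignore", invalid="ignore"):
            pmi = np.log(counts * total / (row * col))
        pmi[~np.isfinite(pmi)] = 0.0             # zero counts map to 0, not -inf
        return np.maximum(pmi, 0.0)              # keep only positive associations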
20 DISSECT can be used for the second and third stages of this pipeline, as well as to measure similarity among the resulting word or phrase vectors. [sent-42, score-0.051]
21 We do not attempt to implement all the corpus pre-processing and co-occurrence extraction routines that would be required for the toolkit to be of general use, and instead expect as input a matrix of raw target-context co-occurrence counts.2 [sent-44, score-0.106]
22 DISSECT provides various methods to re-weight the counts with association measures and dimensionality reduction methods, as well as the composition functions proposed by Mitchell and Lapata (2010) (Additive, Multiplicative and Dilation) and Baroni and Zamparelli (2010)/Coecke et al. [sent-45, score-0.477]
23 The focus of DISSECT is to provide an intuitive interface for researchers and to allow easy extension by adding other composition methods. [sent-49, score-0.322]
24 We provide many standard functionalities through a set of powerful command-line tools. 2These counts can be read from a text file containing two strings (the target and context items) and a number (the corresponding count) on each line [sent-51, score-0.043]
25 (e.g., maggot food 15) or from a matrix in the format word freq1 freq2 ... [sent-53, score-0.218]
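To make the two input formats concrete, here is a sketch of what such files could look like (every entry except maggot food 15 is invented for illustration):

    sparse format, one "target context count" triple per line:
        maggot food 15
        maggot slimy 5
        car wheel 20
    dense "dm" format, target followed by its full count vector:
        maggot 15 5 0
        car 0 0 20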
26 # create a semantic space from counts in dense format ("dm"): word freq1 freq2 ... ss = Space.build(data="<counts>.txt", format="dm") [sent-56, score-0.136]
27 # apply transformations ss = ss.apply(Svd(300)) [sent-60, score-0.041]
28 # retrieve the vector of a target element print ss.get_row("car") [sent-62, score-0.138]
29 Figure 1: Creating a semantic space. [sent-63, score-0.139]
30 This section focuses on this interface (see the online documentation on how to perform the same operations with the command-line tools), which consists of the following top-level packages: #DISSECT packages [sent-65, score-0.137]
31 composes.composition composes.matrix composes.semantic_space composes.similarity composes.transformation composes.utils Semantic spaces and transformations The concept of a semantic space (composes.semantic_space) is central to DISSECT. [sent-71, score-2.671]
32 A semantic space consists of co-occurrence values, stored as a matrix, together with strings associated to the rows of this matrix (by design, the target linguistic elements) and a (potentially empty) list of strings associated to the columns (the context features). [sent-73, score-0.177]
33 Transformations (composes.transformation) can be applied to semantic spaces. [sent-75, score-0.078]
34 We implement weighting schemes such as positive Pointwise Mutual Information (ppmi) and Local Mutual Information, feature selection methods, dimensionality reduction (Singular Value Decomposition (SVD) and Nonnegative Matrix Factorization (NMF)), and new methods can be easily added. [sent-76, score-0.106]
35 Going from raw counts to a transformed space can be accomplished in just a few lines of code (Figure 1).3 [sent-77, score-0.137]
36 3The complete list of transformations currently supported can be found at http://clic.cimec.unitn.it/composes/toolkit/. [sent-78, score-0.089]
37 # load a previously saved space ss = io_utils.load("<saved space>.pkl") [sent-83, score-0.1]
38 # compute cosine similarity print ss.get_sim("car", "book", CosSimilarity()) [sent-85, score-0.318]
39 # the two nearest neighbours of "car" print ss.get_neighbours("car", 2, CosSimilarity()) [sent-86, score-0.62]
40 Figure 2: Similarity queries in a semantic space. [sent-87, score-0.411]
41 Furthermore, DISSECT allows adding new data to a semantic space in an online manner (using “peripheral” rows; see below). [sent-88, score-0.165]
42 This can be used as a way to efficiently expand a co-occurrence matrix with new rows, without re-applying the transformations to the entire space. [sent-90, score-0.135]
43 In some other cases, the user may want to represent phrases that are specializations of words already existing in the space [sent-91, score-0.064]
44 (e.g., slimy maggot and maggot), without distorting the computation of association measures by counting the same context twice. [sent-93, score-0.141]
45 In this case, adding slimy maggot as a “peripheral” row to a semantic space that already contains maggot implements the desired behaviour. [sent-94, score-0.343]
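A minimal numpy sketch of the idea behind such peripheral rows (ours, not the DISSECT API): the new row is weighted with the frozen marginals of the core space, so its counts cannot distort the association measures already computed for existing rows.

    import numpy as np

    def ppmi_peripheral(new_row, core_counts):
        # weight a new row re-using the core space's totals,
        # instead of re-estimating them over core + new rows
        core_counts = np.asarray(core_counts, dtype=float)
        new_row = np.asarray(new_row, dtype=float)
        total = core_counts.sum()        # frozen grand total
        col = core_counts.sum(axis=0)    # frozen context marginals
        row = new_row.sum()
        with np.errstate(divide="ignore", invalid="ignore"):
            pmi = np.log(new_row * total / (row * col))
        pmi[~np.isfinite(pmi)] = 0.0
        return np.maximum(pmi, 0.0)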
46 Similarity queries Semantic spaces are used for the computation of similarity scores. [sent-95, score-0.178]
47 DISSECT provides a series of similarity measures such as cosine, inverse Euclidean distance and Lin similarity, implemented in the composes.similarity package. [sent-96, score-0.503]
48 Similarity of two elements can be computed within one semantic space or across two spaces that have the same dimensionality. [sent-98, score-0.24]
49 Figure 2 exemplifies (word) similarity computations with DISSECT. [sent-99, score-0.129]
50 Composition functions Composition functions in DISSECT (composes.composition) [sent-100, score-0.074]
51 take as arguments a list of element pairs to be composed, and one or two spaces where the elements to be composed are represented. [sent-101, score-0.562]
52 They return a semantic space containing the distributional representations of the composed items, which can be further transformed, used for similarity queries, or used as inputs to another round of composition, thus scaling up beyond binary composition. [sent-102, score-0.381]
53 See Table 1 for the currently available composition models, their definitions and parameters. [sent-104, score-0.322]
54 Dilation: ||u||^2 v + (λ − 1)⟨u, v⟩ u, with parameter λ (= 2); Fulladd: W1 u + W2 v, with W1, W2 ∈ R^{m×m}; Lexfunc: A_u v, with A_u ∈ R^{m×m}. Table 1: Currently implemented composition functions of inputs (u, v) together with parameters and their default values in parentheses, where defined. [sent-109, score-0.359]
55 Note that in Lexfunc each functor word corresponds to a separate matrix or tensor A_u (Baroni and Zamparelli, 2010). [sent-110, score-0.076]
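The Table 1 definitions translate almost directly into code; a minimal numpy sketch of three of the models (our own illustration, not DISSECT internals), with u and v as 1-d arrays:

    import numpy as np

    def dilation(u, v, lam=2.0):
        # ||u||^2 v + (lam - 1) <u, v> u
        return (u @ u) * v + (lam - 1.0) * (u @ v) * u

    def fulladd(u, v, W1, W2):
        # W1 u + W2 v, with W1, W2 square m x m matrices
        return W1 @ u + W2 @ v

    def lexfunc(A_u, v):
        # the functor word acts as a matrix A_u on its argument vector v
        return A_u @ v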
56 Parameter estimation All composition models except Multiplicative have parameters to be estimated. [sent-111, score-0.363]
57 However, DISSECT supports automated parameter estimation from training examples. [sent-113, score-0.041]
58 The problem can be generally stated as: θ* = argmin_θ ||P − fcomp_θ(U, V)||_F, where U, V and P are matrices containing input and output vectors, respectively. [sent-115, score-0.041]
59 For example, U may contain adjective vectors such as red and blue, V noun vectors such as car and sky, and P corpus-extracted vectors for the corresponding phrases red car and blue sky. [sent-116, score-0.173]
60 fcomp_θ is a composition function and θ stands for the list of parameters associated with this composition function. [sent-117, score-0.644]
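For the Fulladd model, for instance, this objective is an ordinary least-squares problem with a closed-form solution; a sketch follows (the ridge term lam and the row-wise layout of U, V and P are assumptions of this example, not necessarily DISSECT's exact estimator):

    import numpy as np

    def train_fulladd(U, V, P, lam=1e-3):
        # rows of U, V, P hold input vectors u, v and phrase vectors p;
        # solve min over W1, W2 of ||P - (U W1' + V W2')||_F
        X = np.hstack([U, V])                              # shape (n, 2m)
        m = U.shape[1]
        B = np.linalg.solve(X.T @ X + lam * np.eye(2 * m), X.T @ P)
        W1, W2 = B[:m].T, B[m:].T                          # back to (m, m) maps
        return W1, W2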
61 Composition output examples DISSECT provides functions to evaluate (compositional) distributional semantic spaces against benchmarks in the composes.utils package. [sent-120, score-0.296]
62 # instantiate a multiplicative model mult_model = Multiplicative() # use the model to compose words from the input space input_space comp_space = mult_model.compose([("red", ...)], input_space) [sent-124, score-1.179]
63 # training data for learning an adjective-noun phrase model train_data = [("red", "book", "red_book"), ("blue", "car", "blue_car")] # train a fulladd model fa_model = FullAdditive() fa_model.train(train_data, ...) [sent-128, score-0.38]
64 # use the model to compose a phrase from new words and retrieve its nearest neighbours comp_space = fa_model.compose([("yellow", "table", "my_yellow_table")], input_space) print comp_space.get_neighbours("my_yellow_table", 10, CosSimilarity()) [sent-130, score-1.363]
65 Table 3: Top 3 neighbours of florist using its (low-frequency) corpus-extracted vector, and when the vector is obtained through composition of flora and -ist with Fulladd, Lexfunc and Additive. [sent-134, score-0.46]
66 Table 2: Neighbours of false belief obtained through composition with the Fulladd, Lexfunc and Additive models. [sent-135, score-0.413]
67 In Table 3, we exemplify a less typical application of compositional models to derivational morphology, namely obtaining a representation of florist compositionally from distributional representations of flora and -ist (Lazaridou et al., 2013). [sent-136, score-0.391]
68 4 Main features Support for dense and sparse representations Co-occurrence matrices, as extracted from text, tend to be very sparse structures, especially when using detailed context features which include syntactic information, for example. [sent-138, score-0.206]
69 On the other hand, dimensionality reduction operations, which are often used in distributional models, lead to smaller, dense structures, for which sparse representations are not optimal. [sent-139, score-0.368]
70 This is our motivation for supporting both dense and sparse representations. [sent-140, score-0.094]
71 Whether a matrix is stored as dense or sparse is initially determined by the input format, if a space is created from co-occurrence counts. [sent-142, score-0.105]
72 By default, DISSECT switches to dense representations after dimensionality reduction; however, the user can freely switch from one representation to the other, in order to optimize computations. [sent-143, score-0.168]
73 For this purpose DISSECT provides wrappers around matrix operations, as well as around common linear algebra operations, in the composes.matrix package. [sent-144, score-0.076]
74 5For SVD on sparse structures, we use sparsesvd (https://pypi. ...). [sent-151, score-0.132]
75 Table 5: Running times (in seconds) for 1) application of ppmi weighting and 2) querying for the top neighbours of a word (cosine similarity) for different matrix sizes (nnz: number of non-zero entries, in millions). [sent-166, score-0.23]
76 The price to pay for fast computations is that data must be stored in main memory. [sent-168, score-0.042]
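A small scipy sketch of the trade-off (matrix size and density are made up for illustration): the sparse representation stores only the non-zero counts, while a dense array pays for every cell but supports the fast dense linear algebra needed after dimensionality reduction.

    import numpy as np
    from scipy.sparse import random as sparse_random

    X = sparse_random(10000, 50000, density=0.001, format="csr")
    print(X.data.nbytes)             # memory for the ~500k non-zeros only
    dense_part = X[:100].toarray()   # densify a slice, e.g. a reduced space
    print(dense_part.nbytes)         # 100 x 50000 doubles, stored in full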
77 Simple design We have opted for a very simple and intuitive design, as the classes interact in very natural ways: a semantic space stores the actual data matrix and structures to index its rows and columns, and supports similarity queries and transformations. [sent-171, score-0.261]
78 Transformations take one semantic space as input to return another, transformed, space. [sent-172, score-0.101]
79 Composition functions take one or more input spaces and yield a composed-elements space, which can further undergo transformations and be used for similarity queries. [sent-173, score-0.241]
80 In fact, DISSECT semantic spaces also support higher-order tensor representations, not just vectors. [sent-174, score-0.131]
81 Higher-order representations are used, for example, to represent transitive verbs and other multi-argument functors by Coecke et al. [sent-175, score-0.071]
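In the spirit of that framework, a transitive verb can be represented as a third-order tensor contracted with its subject and object vectors; the following is only an illustrative numpy contraction with placeholder dimensionality and random data, not DISSECT's internal representation:

    import numpy as np

    m = 50                                    # toy dimensionality
    verb = np.random.randn(m, m, m)           # 3rd-order tensor for the verb
    subj, obj = np.random.randn(m), np.random.randn(m)
    # sentence vector: contract the verb tensor with both of its arguments
    sent = np.einsum("ijk,j,k->i", verb, subj, obj)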
82 Extensive documentation The DISSECT documentation can be found at http://clic.cimec.unitn.it/composes/toolkit/. [sent-183, score-0.1]
83 We provide a tutorial which guides the user through the creation of some toy semantic spaces, estimation of the parameters of composition models and similarity computations in semantic spaces. [sent-187, score-0.53]
84 We show how to go from co-occurrence counts to composed representations and make the data used in the examples available for download. [sent-189, score-0.144]
85 The S-Space package of Jurgens and Stevens (2010) also overlaps to some degree with DISSECT in terms of its intended use; however, like Gensim, it does not support compositional operations, which, as far as we know, are a unique feature of DISSECT. [sent-192, score-0.105]
86 5 Future extensions We implemented and are currently testing DISSECT functions supporting other composition methods, including the one proposed by Socher et al. [sent-193, score-0.359]
87 In particular, several distributional models of word meaning in context share important similarities with composition models, and we plan to add them to DISSECT. [sent-196, score-0.502]
88 The models of Thater et al. (2011) and Erk and Padó (2008) can be reduced to relatively simple matrix operations, making them particularly suitable for a DISSECT implementation. [sent-200, score-0.076]
89 DISSECT is currently optimized for the composition of many phrases of the same type. [sent-201, score-0.322]
90 In the future we plan to provide a module for composing entire sentences, taking syntactic trees as input and returning composed representations for each node in the input trees. [sent-203, score-0.101]
91 Mathematical foundations for a compositional distributional model of meaning. [sent-220, score-0.244]
92 A comparison of models of word meaning in context. [sent-224, score-0.052]
93 A general framework for the estimation of distributional composition functions. [sent-228, score-0.491]
94 A structured vector space model for word meaning in context. [sent-233, score-0.116]
95 Vector space models of word meaning and phrase meaning: A survey. [sent-237, score-0.116]
96 A regression model of adjective-noun compositionality in distributional semantics. [sent-245, score-0.176]
97 The S-Space package: an open source package for word space models. [sent-249, score-0.094]
98 Compositional-ly derived representations of morphologically complex words in distributional semantics. [sent-253, score-0.199]
99 Word meaning in context: A simple and effective vector model. [sent-278, score-0.052]
100 From frequency to meaning: Vector space models of semantics. [sent-282, score-0.064]
wordName wordTfidf (topN-words)
[('dissect', 0.607), ('compo', 0.393), ('composition', 0.322), ('imi', 0.133), ('fulladd', 0.132), ('distributional', 0.128), ('compositional', 0.116), ('lexfunc', 0.115), ('print', 0.107), ('car', 0.102), ('maggot', 0.101), ('spaces', 0.094), ('matrix', 0.076), ('operations', 0.075), ('dinu', 0.074), ('representations', 0.071), ('arity', 0.066), ('space', 0.064), ('multiplicative', 0.062), ('neighbours', 0.062), ('georgiana', 0.062), ('thater', 0.062), ('ppmi', 0.061), ('transformations', 0.059), ('se', 0.059), ('zamparelli', 0.059), ('python', 0.058), ('dense', 0.053), ('meaning', 0.052), ('marco', 0.052), ('similarity', 0.051), ('baroni', 0.051), ('red', 0.05), ('svd', 0.05), ('cimec', 0.049), ('lazaridou', 0.049), ('gensim', 0.049), ('compositionality', 0.048), ('additive', 0.048), ('nmf', 0.047), ('elements', 0.045), ('coecke', 0.044), ('dimensionality', 0.044), ('counts', 0.043), ('computations', 0.042), ('ive', 0.042), ('mode', 0.042), ('vectors', 0.041), ('pham', 0.041), ('sparse', 0.041), ('estimation', 0.041), ('format', 0.041), ('arit', 0.04), ('fcomp', 0.04), ('florist', 0.04), ('ighbours', 0.04), ('ipl', 0.04), ('lkit', 0.04), ('neare', 0.04), ('phra', 0.04), ('slimy', 0.04), ('spacet', 0.04), ('book', 0.04), ('co', 0.038), ('functions', 0.037), ('semantic', 0.037), ('nghia', 0.036), ('oolkit', 0.036), ('clic', 0.036), ('exemplifies', 0.036), ('flora', 0.036), ('oad', 0.036), ('queries', 0.033), ('rek', 0.033), ('dilation', 0.033), ('cipy', 0.033), ('numpy', 0.033), ('rix', 0.033), ('documentation', 0.032), ('wo', 0.031), ('ion', 0.031), ('angeliki', 0.031), ('ret', 0.031), ('rain', 0.031), ('au', 0.031), ('frameworks', 0.031), ('reduction', 0.031), ('weighting', 0.031), ('raw', 0.03), ('package', 0.03), ('composed', 0.03), ('im', 0.03), ('guevara', 0.03), ('cli', 0.03), ('intransitive', 0.03), ('erk', 0.029), ('belief', 0.029), ('mehrnoosh', 0.028), ('composes', 0.028), ('jurgens', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 103 acl-2013-DISSECT - DIStributional SEmantics Composition Toolkit
Author: Georgiana Dinu ; Nghia The Pham ; Marco Baroni
Abstract: We introduce DISSECT, a toolkit to build and explore computational models of word, phrase and sentence meaning based on the principles of distributional semantics. The toolkit focuses in particular on compositional meaning, and implements a number of composition methods that have been proposed in the literature. Furthermore, DISSECT can be useful to researchers and practitioners who need models of word meaning (without composition) as well, as it supports various methods to construct distributional semantic spaces, assessing similarity and even evaluating against benchmarks, that are independent of the composition infrastructure.
2 0.30283716 87 acl-2013-Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics
Author: Angeliki Lazaridou ; Marco Marelli ; Roberto Zamparelli ; Marco Baroni
Abstract: Speakers of a language can construct an unlimited number of new words through morphological derivation. This is a major cause of data sparseness for corpus-based approaches to lexical semantics, such as distributional semantic models of word meaning. We adapt compositional methods originally developed for phrases to the task of deriving the distributional meaning of morphologically complex words from their parts. Semantic representations constructed in this way beat a strong baseline and can be of higher quality than representations directly constructed from corpus data. Our results constitute a novel evaluation of the proposed composition methods, in which the full additive model achieves the best performance, and demonstrate the usefulness of a compositional morphology component in distributional semantics.
Author: Raffaella Bernardi ; Georgiana Dinu ; Marco Marelli ; Marco Baroni
Abstract: Distributional models of semantics capture word meaning very effectively, and they have been recently extended to account for compositionally-obtained representations of phrases made of content words. We explore whether compositional distributional semantic models can also handle a construction in which grammatical terms play a crucial role, namely determiner phrases (DPs). We introduce a new publicly available dataset to test distributional representations of DPs, and we evaluate state-of-the-art models on this set.
4 0.16686393 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
Author: Kartik Goyal ; Sujay Kumar Jauhar ; Huiying Li ; Mrinmaya Sachan ; Shashank Srivastava ; Eduard Hovy
Abstract: In this paper we present a novel approach to modelling distributional semantics that represents meaning as distributions over relations in syntactic neighborhoods. We argue that our model approximates meaning in compositional configurations more effectively than standard distributional vectors or bag-of-words models. We test our hypothesis on the problem of judging event coreferentiality, which involves compositional interactions in the predicate-argument structure of sentences, and demonstrate that our model outperforms both state-of-the-art window-based word embeddings as well as simple approaches to compositional semantics pre- viously employed in the literature.
5 0.14093083 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
Author: Karl Moritz Hermann ; Phil Blunsom
Abstract: Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is a fundamental task of Natural Language Processing. In this paper we draw upon recent advances in the learning of vector space representations of sentential semantics and the transparent interface between syntax and semantics provided by Combinatory Categorial Grammar to introduce Combinatory Categorial Autoencoders. This model leverages the CCG combinatory operators to guide a non-linear transformation of meaning within a sentence. We use this model to learn high dimensional embeddings for sentences and evaluate them in a range of tasks, demonstrating that the incorporation of syntax allows a concise model to learn representations that are both effective and general.
6 0.10693494 238 acl-2013-Measuring semantic content in distributional vectors
7 0.096314259 275 acl-2013-Parsing with Compositional Vector Grammars
8 0.081414409 104 acl-2013-DKPro Similarity: An Open Source Framework for Text Similarity
9 0.072950803 27 acl-2013-A Two Level Model for Context Sensitive Inference Rules
10 0.069039159 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
11 0.067733034 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics
12 0.062013898 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
13 0.061043967 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data
14 0.059673402 31 acl-2013-A corpus-based evaluation method for Distributional Semantic Models
15 0.057628315 380 acl-2013-VSEM: An open library for visual semantics representation
16 0.055053297 76 acl-2013-Building and Evaluating a Distributional Memory for Croatian
17 0.051240545 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
18 0.049750324 263 acl-2013-On the Predictability of Human Assessment: when Matrix Completion Meets NLP Evaluation
19 0.048924074 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
20 0.047883097 218 acl-2013-Latent Semantic Tensor Indexing for Community-based Question Answering
topicId topicWeight
[(0, 0.127), (1, 0.024), (2, 0.02), (3, -0.119), (4, -0.099), (5, -0.077), (6, -0.002), (7, 0.023), (8, -0.026), (9, 0.088), (10, -0.074), (11, 0.012), (12, 0.227), (13, -0.129), (14, -0.041), (15, 0.104), (16, -0.099), (17, -0.154), (18, -0.007), (19, -0.02), (20, -0.026), (21, 0.066), (22, -0.025), (23, -0.024), (24, -0.107), (25, -0.02), (26, 0.007), (27, 0.08), (28, -0.069), (29, 0.052), (30, 0.036), (31, 0.086), (32, 0.051), (33, -0.035), (34, 0.022), (35, 0.107), (36, 0.051), (37, 0.031), (38, 0.021), (39, 0.067), (40, 0.063), (41, -0.133), (42, 0.059), (43, 0.036), (44, -0.022), (45, 0.039), (46, 0.027), (47, -0.052), (48, 0.036), (49, 0.109)]
simIndex simValue paperId paperTitle
same-paper 1 0.94199759 103 acl-2013-DISSECT - DIStributional SEmantics Composition Toolkit
Author: Georgiana Dinu ; Nghia The Pham ; Marco Baroni
Abstract: We introduce DISSECT, a toolkit to build and explore computational models of word, phrase and sentence meaning based on the principles of distributional semantics. The toolkit focuses in particular on compositional meaning, and implements a number of composition methods that have been proposed in the literature. Furthermore, DISSECT can be useful to researchers and practitioners who need models of word meaning (without composition) as well, as it supports various methods to construct distributional semantic spaces, assessing similarity and even evaluating against benchmarks, that are independent of the composition infrastructure.
2 0.88439178 32 acl-2013-A relatedness benchmark to test the role of determiners in compositional distributional semantics
Author: Raffaella Bernardi ; Georgiana Dinu ; Marco Marelli ; Marco Baroni
Abstract: Distributional models of semantics capture word meaning very effectively, and they have been recently extended to account for compositionally-obtained representations of phrases made of content words. We explore whether compositional distributional semantic models can also handle a construction in which grammatical terms play a crucial role, namely determiner phrases (DPs). We introduce a new publicly available dataset to test distributional representations of DPs, and we evaluate state-of-the-art models on this set.
Author: Angeliki Lazaridou ; Marco Marelli ; Roberto Zamparelli ; Marco Baroni
Abstract: Speakers of a language can construct an unlimited number of new words through morphological derivation. This is a major cause of data sparseness for corpus-based approaches to lexical semantics, such as distributional semantic models of word meaning. We adapt compositional methods originally developed for phrases to the task of deriving the distributional meaning of morphologically complex words from their parts. Semantic representations constructed in this way beat a strong baseline and can be of higher quality than representations directly constructed from corpus data. Our results constitute a novel evaluation of the proposed composition methods, in which the full additive model achieves the best performance, and demonstrate the usefulness of a compositional morphology component in distributional semantics.
4 0.65142566 238 acl-2013-Measuring semantic content in distributional vectors
Author: Aurelie Herbelot ; Mohan Ganesalingam
Abstract: Some words are more contentful than others: for instance, make is intuitively more general than produce and fifteen is more ‘precise’ than a group. In this paper, we propose to measure the ‘semantic content’ of lexical items, as modelled by distributional representations. We investigate the hypothesis that semantic content can be computed using the KullbackLeibler (KL) divergence, an informationtheoretic measure of the relative entropy of two distributions. In a task focusing on retrieving the correct ordering of hyponym-hypernym pairs, the KL diver- gence achieves close to 80% precision but does not outperform a simpler (linguistically unmotivated) frequency measure. We suggest that this result illustrates the rather ‘intensional’ aspect of distributions.
5 0.63413048 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
Author: Kartik Goyal ; Sujay Kumar Jauhar ; Huiying Li ; Mrinmaya Sachan ; Shashank Srivastava ; Eduard Hovy
Abstract: In this paper we present a novel approach to modelling distributional semantics that represents meaning as distributions over relations in syntactic neighborhoods. We argue that our model approximates meaning in compositional configurations more effectively than standard distributional vectors or bag-of-words models. We test our hypothesis on the problem of judging event coreferentiality, which involves compositional interactions in the predicate-argument structure of sentences, and demonstrate that our model outperforms both state-of-the-art window-based word embeddings as well as simple approaches to compositional semantics pre- viously employed in the literature.
6 0.59094381 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
7 0.51078749 31 acl-2013-A corpus-based evaluation method for Distributional Semantic Models
8 0.47888529 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
9 0.4695282 275 acl-2013-Parsing with Compositional Vector Grammars
10 0.46680716 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
11 0.45267564 76 acl-2013-Building and Evaluating a Distributional Memory for Croatian
12 0.41213387 313 acl-2013-Semantic Parsing with Combinatory Categorial Grammars
14 0.36033368 294 acl-2013-Re-embedding words
15 0.3597891 349 acl-2013-The mathematics of language learning
16 0.35049087 302 acl-2013-Robust Automated Natural Language Processing with Multiword Expressions and Collocations
17 0.3503598 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics
18 0.34987298 304 acl-2013-SEMILAR: The Semantic Similarity Toolkit
19 0.34459007 381 acl-2013-Variable Bit Quantisation for LSH
20 0.34294349 12 acl-2013-A New Set of Norms for Semantic Relatedness Measures
topicId topicWeight
[(0, 0.075), (6, 0.018), (11, 0.064), (14, 0.021), (24, 0.035), (26, 0.051), (28, 0.017), (35, 0.133), (42, 0.043), (45, 0.207), (48, 0.124), (64, 0.011), (70, 0.043), (88, 0.025), (90, 0.011), (95, 0.045)]
simIndex simValue paperId paperTitle
1 0.87208486 371 acl-2013-Unsupervised joke generation from big data
Author: Sasa Petrovic ; David Matthews
Abstract: Humor generation is a very hard problem. It is difficult to say exactly what makes a joke funny, and solving this problem algorithmically is assumed to require deep semantic understanding, as well as cultural and other contextual cues. We depart from previous work that tries to model this knowledge using ad-hoc manually created databases and labeled training examples. Instead we present a model that uses large amounts of unannotated data to generate I like my X like I like my Y, Z jokes, where X, Y, and Z are variables to be filled in. This is, to the best of our knowledge, the first fully unsupervised humor generation system. Our model significantly outperforms a competitive baseline and generates funny jokes 16% of the time, compared to 33% for human-generated jokes.
same-paper 2 0.86292654 103 acl-2013-DISSECT - DIStributional SEmantics Composition Toolkit
Author: Georgiana Dinu ; Nghia The Pham ; Marco Baroni
Abstract: We introduce DISSECT, a toolkit to build and explore computational models of word, phrase and sentence meaning based on the principles of distributional semantics. The toolkit focuses in particular on compositional meaning, and implements a number of composition methods that have been proposed in the literature. Furthermore, DISSECT can be useful to researchers and practitioners who need models of word meaning (without composition) as well, as it supports various methods to construct distributional semantic spaces, assessing similarity and even evaluating against benchmarks, that are independent of the composition infrastructure.
3 0.82509118 102 acl-2013-DErivBase: Inducing and Evaluating a Derivational Morphology Resource for German
Author: Britta Zeller ; Jan Snajder ; Sebastian Pado
Abstract: Derivational models are still an underresearched area in computational morphology. Even for German, a rather resourcerich language, there is a lack of largecoverage derivational knowledge. This paper describes a rule-based framework for inducing derivational families (i.e., clusters of lemmas in derivational relationships) and its application to create a highcoverage German resource, DERIVBASE, mapping over 280k lemmas into more than 17k non-singleton clusters. We focus on the rule component and a qualitative and quantitative evaluation. Our approach achieves up to 93% precision and 71% recall. We attribute the high precision to the fact that our rules are based on information from grammar books.
4 0.74093968 171 acl-2013-Grammatical Error Correction Using Integer Linear Programming
Author: Yuanbin Wu ; Hwee Tou Ng
Abstract: unkown-abstract
5 0.69777131 87 acl-2013-Compositional-ly Derived Representations of Morphologically Complex Words in Distributional Semantics
Author: Angeliki Lazaridou ; Marco Marelli ; Roberto Zamparelli ; Marco Baroni
Abstract: Speakers of a language can construct an unlimited number of new words through morphological derivation. This is a major cause of data sparseness for corpus-based approaches to lexical semantics, such as distributional semantic models of word meaning. We adapt compositional methods originally developed for phrases to the task of deriving the distributional meaning of morphologically complex words from their parts. Semantic representations constructed in this way beat a strong baseline and can be of higher quality than representations directly constructed from corpus data. Our results constitute a novel evaluation of the proposed composition methods, in which the full additive model achieves the best performance, and demonstrate the usefulness of a compositional morphology component in distributional semantics.
6 0.69705486 275 acl-2013-Parsing with Compositional Vector Grammars
7 0.6949088 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
8 0.69407564 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics
9 0.6845454 231 acl-2013-Linggle: a Web-scale Linguistic Search Engine for Words in Context
10 0.68281537 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
11 0.6827116 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
12 0.68103135 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
13 0.68055499 354 acl-2013-Training Nondeficient Variants of IBM-3 and IBM-4 for Word Alignment
14 0.67980486 238 acl-2013-Measuring semantic content in distributional vectors
15 0.67937887 62 acl-2013-Automatic Term Ambiguity Detection
16 0.67819238 291 acl-2013-Question Answering Using Enhanced Lexical Semantic Models
18 0.67777288 237 acl-2013-Margin-based Decomposed Amortized Inference
19 0.67760324 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics
20 0.67748797 191 acl-2013-Improved Bayesian Logistic Supervised Topic Models with Data Augmentation