acl acl2010 acl2010-66 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sebastian Rudolph ; Eugenie Giesbrecht
Abstract: We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. [sent-2, score-0.502]
2 We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms. [sent-3, score-0.473]
3 They embody the distributional hypothesis of meaning (Firth, 1957), according to which the meaning of words is defined by contexts in which they (co-)occur. [sent-8, score-0.151]
4 Until recently, little attention has been paid to the task of modeling more complex conceptual structures with such models, which constitutes a crucial barrier for semantic vector models. [sent-11, score-0.148]
5 An emerging area of research receiving more and more attention among the advocates of distributional models addresses the methods, algorithms, and evaluation strategies for representing compositional aspects of language within a VSM framework. [sent-13, score-0.216]
6 There are approaches under way to work out a combined framework for meaning representation using both the advantages of symbolic and distributional methods. [sent-15, score-0.276]
7 Clark and Pulman (2007) suggest a conceptual model which unites symbolic and distributional representations by means of traversing the parse tree of a sentence and applying a tensor product for combining vectors of the meanings of words with the vectors of their roles. [sent-16, score-0.67]
8 Section 4 shows how common VSM approaches to compositionality can be captured by CMSMs, while Section 5 illustrates the capabilities of our model to likewise cover symbolic approaches. [sent-22, score-0.311]
9 Given two natural numbers n and m, an n×m matrix over the reals is an array of real numbers with n rows and m columns. [sent-42, score-0.321]
10 We will use capital letters to denote matrices and, given a matrix M, we will write M(i, j) to refer to the entry in the ith row and the jth column, so that M = (M(i, j)) with 1 ≤ i ≤ n and 1 ≤ j ≤ m. [sent-43, score-0.617]
11 The set of all n×m matrices with real number entries is denoted by R^{n×m}. [sent-45, score-0.261]
12 A matrix can be transposed by exchanging columns and rows: given the n×m matrix M, its transposed version M^T is an m×n matrix defined by M^T(i, j) = M(j, i). [sent-47, score-0.397]
13 Beyond being merely array-like data structures, matrices correspond to a certain type of function, so-called linear mappings, having vectors as input and output. [sent-49, score-0.423]
14 Formally, the matrix product of the n×l matrix M1 and the l×m matrix M2 is the n×m matrix M = M1M2 defined by M(i, j) = Σ_{k=1}^{l} M1(i, k) · M2(k, j). Note that the matrix product is associative (i.e., (M1M2)M3 = M1(M2M3)) but, in general, not commutative. [sent-51, score-1.124]
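As a quick sanity check of these two properties, here is a minimal sketch (NumPy and the 3×3 size are our own illustrative choices; the paper contains no code) verifying that the matrix product is associative but, for generic matrices, not commutative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# Associativity: (AB)C equals A(BC) up to floating-point rounding.
assert np.allclose((A @ B) @ C, A @ (B @ C))

# Non-commutativity: AB and BA differ for random matrices.
print(np.allclose(A @ B, B @ A))  # almost surely False
```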
15 Likewise, a permutation can be applied to a vector resulting in a rearrangement of the entries. [sent-76, score-0.252]
16 We write Φn to denote the permutation corresponding to the n-fold application of Φ and Φ−1 to denote the permutation that “undoes” Φ. [sent-77, score-0.311]
17 Given a permutation Φ, the corresponding permutation matrix MΦ is defined by MΦ(i, j) = 1 if Φ(j) = i, and 0 otherwise. [sent-78, score-0.597]
18 Then, obviously, permuting a vector according to Φ can be expressed in terms of matrix multiplication as well, since we obtain for any vector v ∈ R^n: Φ(v) = vMΦ. Likewise, iterated application (Φ^n) and the inverses Φ^{-n} carry over naturally to the corresponding notions for matrices. [sent-79, score-0.731]
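A small sketch of this permutation-matrix construction under the row-vector convention Φ(v) = vMΦ used above (NumPy and the concrete permutation are our own illustrative assumptions):

```python
import numpy as np

n = 4
phi = [2, 0, 3, 1]                 # a sample permutation Φ on {0, ..., 3}
M_phi = np.zeros((n, n))
for j in range(n):
    M_phi[phi[j], j] = 1.0         # M_Φ(i, j) = 1 exactly if Φ(j) = i

v = np.array([10.0, 20.0, 30.0, 40.0])
print(v @ M_phi)                   # entry j of the result is v[Φ(j)] -> [30. 10. 40. 20.]

# Iterated application Φ^2 corresponds to the matrix power M_Φ @ M_Φ.
print(v @ M_phi @ M_phi)
```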
19 3 Compositionality and Matrices. The underlying principle of compositional semantics is that the meaning of a sentence (or a word phrase) can be derived from the meaning of its constituent tokens by applying a composition operation. [sent-80, score-0.504]
20 ./ : S∗ → S mapping sequences of meanings to meanings such that the meaning of a sequence of tokens σ1σ2 . [sent-82, score-0.249]
21 A great variety of linguistic models is subsumed by this general idea, ranging from purely symbolic approaches (like type systems and categorial grammars) to rather statistical models (like vector space and word space models). [sent-101, score-0.371]
22 At first glance, the underlying encodings of word semantics as well as the composition operations differ significantly. [sent-102, score-0.288]
23 However, we argue that a great variety of them can be incorporated and even freely inter-combined into a unified model where the semantics of simple tokens and complex phrases is expressed by matrices and the composition operation is standard matrix multiplication. [sent-103, score-0.907]
24 the semantic space consists of quadratic matrices, and the composition operator ./ [sent-106, score-0.229]
25 coincides with matrix multiplication as introduced in Section 2. [sent-107, score-0.461]
26 3.1 Algebraic Plausibility – Structural Operation Properties. Most linear-algebra-based operations that have been proposed to model composition in language models are associative and commutative. [sent-110, score-0.295]
27 As mentioned before, matrix multiplication is associative but non-commutative, whence we propose it as more adequate for modeling compositional semantics of language. [sent-113, score-0.687]
28 Suppose the mental state of a person at one specific moment in time can be encoded by a vector v of numerical values; one might, e.g. [sent-116, score-0.312]
29 Then, an external stimulus or signal, such as a perceived word, will result in a change of the mental state. [sent-119, score-0.2]
30 Thus, the external stimulus can be seen as a function being applied to v, yielding as result the vector v′ that corresponds to the person's mental state after receiving the signal. [sent-120, score-0.347]
31 Therefore, it seems sensible to associate with every signal (in our case: token σ) a respective function (a linear mapping, represented by a matrix M = [[σ]]) that maps mental states to mental states (i.e., vectors to vectors). [sent-121, score-0.784]
32 Consequently, the subsequent reception of inputs σ, σ′ associated with matrices M and M′ will transform a mental vector v into the vector (vM)M′, which by associativity equals v(MM′). [sent-124, score-0.697]
33 Therefore, MM′ represents the mental state transition triggered by the signal sequence σσ′. [sent-125, score-0.232]
34 This way, abstracting from specific initial mental state vectors, our semantic space S can be seen as a function space of mental transformations represented by matrices, whereby matrix multiplication realizes subsequent execution of those transformations triggered by the input token sequence. [sent-127, score-1.028]
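A sketch of this "mental state transition" reading (the random matrices, the 5-dimensional state, and the token list are our own illustrative assumptions): applying token matrices one after another to the state vector gives the same result as first multiplying the matrices and applying the product once.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5

# Hypothetical assignment [[σ]] of a matrix to each token.
lexicon = {tok: rng.standard_normal((dim, dim)) for tok in ("a", "red", "car")}

v = rng.standard_normal(dim)         # initial mental state (row vector)

# Receive the tokens one by one: ((v M_a) M_red) M_car.
state = v.copy()
for tok in ("a", "red", "car"):
    state = state @ lexicon[tok]

# Abstracting from v: multiply the token matrices first, then apply once.
M_phrase = lexicon["a"] @ lexicon["red"] @ lexicon["car"]
assert np.allclose(state, v @ M_phrase)
```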
35 The mental state vector can be seen as a representation of a person's working memory which gets transformed by external input. [sent-131, score-0.381]
36 Note that matrices can perform standard memory operations such as storing, deleting, copying etc. [sent-132, score-0.365]
37 For instance, the matrix Mcopy(k,l) defined by Mcopy(k,l)(i, j) = 1 if i = j ≠ l or (i, j) = (k, l), and 0 otherwise, copies the kth entry of a vector into its lth position. [sent-133, score-0.321]
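Under our reconstruction of Mcopy(k,l) above (the exact case distinction is garbled in the extraction, so this is an assumption), the matrix acts as a "copy" memory operation: applied as v · M it overwrites entry l with entry k and leaves the rest unchanged. A hedged NumPy sketch:

```python
import numpy as np

def m_copy(n, k, l):
    """Our reconstruction of Mcopy(k,l): the identity matrix, except that
    column l takes its single 1 from row k instead of row l."""
    M = np.eye(n)
    M[l, l] = 0.0   # forget the old l-th entry
    M[k, l] = 1.0   # route the k-th entry into position l
    return M

v = np.array([1.0, 2.0, 3.0, 4.0])
print(v @ m_copy(4, k=0, l=2))   # -> [1. 2. 1. 4.]
```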
38 4 CMSMs Encode Vector Space Models In VSMs numerous vector operations have been used to model composition (Widdows, 2008), some of the more advanced ones being related to quantum mechanics. [sent-138, score-0.476]
39 We show how these common composition operators can be modeled by CMSMs. [sent-139, score-0.183]
40 ψ : R^n → R^{n′×n′} that translates the vector representation into a matrix representation in a way such that for all v1, . [sent-142, score-0.471]
41 ψ(vi)ψ(vj) denotes matrix multiplication of the matrices assigned to vi and vj. [sent-158, score-0.758]
42 Thereby, tokens σ get assigned (usually high-dimensional) vectors vσ and, to obtain a representation of the meaning of a phrase or a sentence w = σ1 . [sent-161, score-0.242]
43 σk, the vector sum of the vectors associated with the constituent tokens is calculated: vw = Σ_{i=1}^{k} vσi. [sent-164, score-0.279]
44 In our investigations we will focus on VSM composition operations which preserve the format (i.e., [sent-165, score-0.254]
45 which yield a vector of the same dimensionality), as our notion of compositionality requires models that allow for iterated composition. [sent-167, score-0.279]
46 In particular, this rules out dot product and tensor product. [sent-168, score-0.216]
47 However the convolution product can be seen as a condensed version of the tensor product. [sent-169, score-0.289]
48 This kind of composition operation is subsumed by CMSMs; suppose in the original model a token σ gets assigned the vector vσ, then by defining ψ+(vσ) = . [sent-170, score-0.494]
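The exact matrix for ψ+ is lost in the extraction above, but one standard construction with the advertised behaviour (matrix multiplication of the images realizes vector addition of the originals) pads each vector into an (n+1)×(n+1) matrix with an identity block and the vector in the extra row. The following is a sketch of that generic construction, not necessarily the paper's exact definition:

```python
import numpy as np

def psi_plus(v):
    """Embed v in an (n+1)x(n+1) matrix such that
    psi_plus(v1) @ psi_plus(v2) == psi_plus(v1 + v2)."""
    n = v.shape[0]
    M = np.eye(n + 1)
    M[n, :n] = v          # last row carries the vector, diagonal stays 1
    return M

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([10.0, 20.0, 30.0])
assert np.allclose(psi_plus(v1) @ psi_plus(v2), psi_plus(v1 + v2))
```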
49 By using a different encoding into matrices, CMSMs can simulate this type of composition operation as well. [sent-211, score-0.183]
50 By assigning ψ(vσ) = diag(vσ(1), . . . , vσ(n)), we obtain an n×n matrix representation for which ψ [sent-215, score-0.357]
51 of convolution products with the benefit of preserving dimensionality: given two vectors v1, v2 ∈ R^n, their circular convolution product v1 ⊛ v2 is again an n-dimensional vector v3 defined by v3(i + 1) = Σ_{k=0}^{n−1} v1(k + 1) · v2((i − k mod n) + 1) for 0 ≤ i ≤ n − 1. [sent-231, score-0.427]
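A direct NumPy transcription of this circular convolution formula (shifted to 0-based indexing here; the FFT identity in the last line is a well-known equivalent, not something claimed by the paper):

```python
import numpy as np

def circular_convolution(v1, v2):
    """v3(i) = sum_k v1(k) * v2((i - k) mod n), i.e. the formula above
    with 0-based indices."""
    n = len(v1)
    return np.array([sum(v1[k] * v2[(i - k) % n] for k in range(n))
                     for i in range(n)])

v1 = np.array([1.0, 2.0, 3.0, 4.0])
v2 = np.array([0.0, 1.0, 0.0, 0.0])
print(circular_convolution(v1, v2))                       # [4. 1. 2. 3.]
print(np.fft.ifft(np.fft.fft(v1) * np.fft.fft(v2)).real)  # same result via FFT
```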
52 , a vector representation of a token is endowed with contextual representations of surrounding tokens. [sent-265, score-0.228]
53 Now, by assigning every vσ the block matrix ψΦ(vσ) = ( MΦ 0 ; vσ 1 ), . [sent-267, score-0.321]
54 From the perspective of our compositionality framework, those approaches employ a group (or pre-group) (G, ·) as semantic space S, where the group operation (often written as multiplication) is used as the composition operation. [sent-284, score-0.52]
55 Hence, assuming finiteness of G and consequently of S, we can encode group-based grammar formalisms into CMSMs in a straightforward way by using permutation matrices of size |S| × |S|. [sent-287, score-0.435]
56 5.2 Regular Languages. Regular languages constitute a basic type of language characterized by a symbolic formalism. [sent-289, score-0.208]
57 We will show how to select the assignment [[ · ]] for a CMSM such that the matrix associated to a token sequence exhibits whether this sequence belongs to a given regular language, that is, if it is accepted by a given finite state automaton. [sent-290, score-0.218]
58 (qi, σ, qj) ∈ ∆. Hence, essentially, the matrix M encodes all state transitions which can be caused by the input σ. [sent-299, score-0.354]
59 Finally, if we define vectors vI and vF by vI(i) = 1 if qi ∈ QI and 0 otherwise, and vF(i) = 1 if qi ∈ QF and 0 otherwise, then we find that w is accepted by A exactly if vI Mw vF^T ≥ 1. [sent-307, score-0.193]
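A small sketch of this acceptance criterion for a toy automaton of our own devising (two states over {a, b}, accepting exactly the words ending in b): each symbol gets the 0/1 transition matrix described above, and a word w is accepted iff vI · Mw · vF^T ≥ 1.

```python
import numpy as np

M = {  # M_sym(i, j) = 1 iff the automaton can move from state i to state j on sym
    "a": np.array([[1.0, 0.0],
                   [1.0, 0.0]]),
    "b": np.array([[0.0, 1.0],
                   [0.0, 1.0]]),
}
v_I = np.array([1.0, 0.0])   # characteristic vector of the initial states
v_F = np.array([0.0, 1.0])   # characteristic vector of the final states

def accepted(word):
    M_w = np.eye(2)
    for sym in word:
        M_w = M_w @ M[sym]   # matrix of the whole token sequence
    return v_I @ M_w @ v_F >= 1

print(accepted("aab"), accepted("aba"))   # True False
```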
60 5.3 The General Case: Matrix Grammars. Motivated by the above findings, we now define a general notion of matrix grammars as follows: Definition 1. Let Σ be an alphabet. [sent-309, score-0.374]
61 A matrix grammar M of degree n is defined as the pair ⟨[[ · ]], AC⟩ where [[ · ]] is a mapping from Σ to n×n matrices and AC = {⟨v′1, v1, r1⟩, . [sent-310, score-0.391]
62 The following properties of matrix grammars and matricible languages are straightforward. [sent-348, score-0.546]
63 Proposition 3 The intersection of two matricible languages is again a matricible language. [sent-380, score-0.383]
64 We proceed by giving another account of the expressivity of matrix grammars by showing undecidability of the emptiness problem. [sent-387, score-0.456]
65 Proposition 4 The problem whether there is a word which is accepted by a given matrix grammar is undecidable. [sent-388, score-0.357]
66 We now reduce this problem to the emptiness problem of a matrix grammar. [sent-403, score-0.359]
67 These results demonstrate that matrix grammars cover a wide range of formal languages. [sent-451, score-0.374]
68 Note that this question is directly related to the question of whether the Lambek calculus can be modeled by matrix grammars. [sent-454, score-0.321]
69 That is: given two arbitrary matricible languages L1, L2, is the language L = {w1w2 | w1 ∈ L1, w2 ∈ L2} again a matricible language? [sent-456, score-0.211]
70 Although straightforward for the language classes from the Chomsky hierarchy, answering this question is surprisingly non-trivial for matrix grammars. [sent-458, score-0.321]
71 For example, allowing for some nondeterminism by associating several matrices to one token would ensure closure under concatenation. [sent-460, score-0.339]
72 How do the theoretical properties of matrix grammars depend on the underlying algebraic structure? [sent-461, score-0.454]
73 Remember that we considered matrices containing real numbers as entries. [sent-462, score-0.261]
74 In general, matrices can be defined on top of any mathematical structure that is (at least) a semiring (Golan, 1992). [sent-463, score-0.295]
75 Therefore, it would be interesting to investigate the influence of the choice of the underlying semiring on the properties of the matrix grammars; possibly, nonstandard structures turn out to be more appropriate for capturing certain compositional language properties. [sent-465, score-0.555]
76 6 Combination of Different Approaches. Another central advantage of the proposed matrix-based models for word meaning is that several matrix models can be easily combined into one. [sent-466, score-0.362]
77 For instance, we have not been able to find a matrix grammar that recognizes the language of all well-formed parenthesis expressions. [sent-467, score-0.357]
78 , [[σk]] according to one specific model and matrices ([σ1]), . [sent-474, score-0.261]
79 Then we can combine the two models into one {[ · ]} by assigning to σi the block-diagonal matrix {[σi]} = ( [[σi]] 0 ; 0 ([σi]) ). [sent-478, score-0.321]
80 Note that by providing non-zero entries for the upper right and lower left matrix part, information exchange between the two models can be easily realized. [sent-490, score-0.321]
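A sketch of this combination scheme under the natural reading of the truncated definition above: the two matrices assigned to a token are placed on the block diagonal, so multiplying combined matrices runs both models in parallel; non-zero off-diagonal blocks would add information exchange. (NumPy and the toy dimensions are our assumptions.)

```python
import numpy as np

def combine(M_a, M_b):
    """Block-diagonal combination {[σ]} of the assignments [[σ]] and ([σ])."""
    n, m = M_a.shape[0], M_b.shape[0]
    C = np.zeros((n + m, n + m))
    C[:n, :n] = M_a        # model 1 in the upper-left block
    C[n:, n:] = M_b        # model 2 in the lower-right block
    return C

rng = np.random.default_rng(2)
A1, A2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))  # model 1
B1, B2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))  # model 2

# Multiplying the combined matrices multiplies each model's matrices separately.
assert np.allclose(combine(A1, B1) @ combine(A2, B2),
                   combine(A1 @ A2, B1 @ B2))
```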
81 Distributional models based on matrices or even higher-dimensional arrays have been proposed in information retrieval (Gao et al. [sent-492, score-0.261]
82 However, to the best of our knowledge, the approach of realizing compositionality via matrix multiplication seems to be entirely original. [sent-494, score-0.584]
83 Mitchell and Lapata (2008) formulate semantic composition as a function m = f(w1, w2, R, K) where R is a relation between w1 and w2 and K is additional knowledge. [sent-496, score-0.217]
84 They evaluate the model with a number of addition and multiplication operations for vector combination on a sentence similarity task proposed by Kintsch (2001). [sent-497, score-0.325]
85 Widdows (2008) proposes a number of more advanced vector operations well-known from quantum mechanics, such as tensor product and convolution, to model composition in vector spaces. [sent-498, score-0.806]
86 Giesbrecht (2009) evaluates four vector composition operations (+, ⊙, [sent-500, score-0.185]
87 tensor product, convolution) on the tasks of identifying multi-word units. [sent-501, score-0.156]
88 The evaluation results of the three studies are not conclusive in terms of which vector operation performs best; the different outcomes might be attributed to the underlying word space models; e. [sent-502, score-0.278]
89 In the light of these findings, our CMSMs provide the benefit of just one composition operation that is able to mimic all the others as well as combinations thereof. [sent-505, score-0.267]
90 8 Conclusion and Future Work We have introduced a generic model for compositionality in language where matrices are associated with tokens and the matrix representation of a token sequence is obtained by iterated matrix multiplication. [sent-506, score-1.274]
91 We have shown that the proposed model is expressive enough to cover and combine a variety of distributional and symbolic aspects of natural language. [sent-508, score-0.199]
92 This nourishes the hope that matrix models can serve as a kind of lingua franca for compositional models. [sent-509, score-0.468]
93 Presumably, hybrid approaches have to be considered, where parts of the matrix representation are learned whereas others are stipulated in advance, guided by external sources (such as lexical information). [sent-515, score-0.357]
94 Then, tensor decomposition techniques can be applied in order to find a compact representation, reduce noise, and cluster together similar tokens (Tucker, 1966; Rendle et al. [sent-517, score-0.214]
95 In Section 3, we justified our model by taking the perspective of tokens being functions which realize mental state transitions. [sent-521, score-0.297]
96 Yet, using matrices to represent those functions restricts them to linear mappings. [sent-522, score-0.316]
97 Instead, we might need some in-between application of simple nonlinear functions in the spirit of quantum-collapsing of a "superposed" mental state (such as the winner takes it all, survival of the top-k vector entries, and so forth). [sent-525, score-0.312]
98 Exploring term-document matrices from matrix models in text mining. [sent-530, score-0.582]
99 On the theory of groups as depending on the symbolic equation θn = 1. [sent-539, score-0.164]
100 Learning optimal ranking with tensor factorization for tag recommendation. [sent-646, score-0.156]
wordName wordTfidf (topN-words)
[('cmsms', 0.388), ('matrix', 0.321), ('matrices', 0.261), ('vsms', 0.194), ('composition', 0.183), ('matricible', 0.172), ('mental', 0.165), ('tensor', 0.156), ('compositional', 0.147), ('rn', 0.145), ('multiplication', 0.14), ('permutation', 0.138), ('symbolic', 0.13), ('compositionality', 0.123), ('vector', 0.114), ('vm', 0.113), ('quantum', 0.108), ('vectors', 0.107), ('operation', 0.084), ('giesbrecht', 0.081), ('plausibility', 0.078), ('token', 0.078), ('convolution', 0.073), ('operations', 0.071), ('distributional', 0.069), ('bijection', 0.069), ('widdows', 0.069), ('acceptance', 0.065), ('aci', 0.065), ('antonellis', 0.065), ('rendle', 0.065), ('rudolph', 0.065), ('uhm', 0.065), ('product', 0.06), ('likewise', 0.058), ('tokens', 0.058), ('eugenie', 0.057), ('holographic', 0.057), ('hopcroft', 0.057), ('linear', 0.055), ('ik', 0.054), ('grammars', 0.053), ('proposition', 0.053), ('sahlgren', 0.052), ('vk', 0.049), ('deerwester', 0.049), ('hh', 0.049), ('landauer', 0.048), ('space', 0.046), ('algebraic', 0.046), ('vsm', 0.046), ('lund', 0.046), ('clark', 0.046), ('expressivity', 0.044), ('ambmcm', 0.043), ('associativity', 0.043), ('cayley', 0.043), ('cmsm', 0.043), ('fzi', 0.043), ('hadamard', 0.043), ('iftqh', 0.043), ('karlsruhe', 0.043), ('mcopy', 0.043), ('vhm', 0.043), ('algebra', 0.042), ('iterated', 0.042), ('meanings', 0.041), ('meaning', 0.041), ('justified', 0.041), ('salton', 0.041), ('associative', 0.041), ('regular', 0.039), ('languages', 0.039), ('lambek', 0.038), ('emptiness', 0.038), ('neurological', 0.038), ('qf', 0.038), ('transposed', 0.038), ('whence', 0.038), ('xb', 0.038), ('vi', 0.036), ('grammar', 0.036), ('cognitive', 0.036), ('representation', 0.036), ('write', 0.035), ('pik', 0.035), ('stimulus', 0.035), ('subsumed', 0.035), ('ier', 0.035), ('semirings', 0.035), ('mapping', 0.034), ('theory', 0.034), ('semantic', 0.034), ('mathematical', 0.034), ('dimensionality', 0.034), ('underlying', 0.034), ('sequence', 0.034), ('pad', 0.033), ('state', 0.033), ('memory', 0.033)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 66 acl-2010-Compositional Matrix-Space Models of Language
Author: Sebastian Rudolph ; Eugenie Giesbrecht
Abstract: We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
2 0.14928181 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
Author: Michael Lamar ; Yariv Maron ; Mark Johnson ; Elie Bienenstock
Abstract: We revisit the algorithm of Schütze (1995) for unsupervised part-of-speech tagging. The algorithm uses reduced-rank singular value decomposition followed by clustering to extract latent features from context distributions. As implemented here, it achieves state-of-the-art tagging accuracy at considerably less cost than more recent methods. It can also produce a range of finer-grained taggings, with potential applications to various tasks. 1
3 0.1477582 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal
Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a wordsense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.
4 0.13663986 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller
Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.
5 0.12557639 232 acl-2010-The S-Space Package: An Open Source Package for Word Space Models
Author: David Jurgens ; Keith Stevens
Abstract: We present the S-Space Package, an open source framework for developing and evaluating word space algorithms. The package implements well-known word space algorithms, such as LSA, and provides a comprehensive set of matrix utilities and data structures for extending new or existing models. The package also includes word space benchmarks for evaluation. Both algorithms and libraries are designed for high concurrency and scalability. We demonstrate the efficiency of the reference implementations and also provide their results on six benchmarks.
6 0.086056903 161 acl-2010-Learning Better Data Representation Using Inference-Driven Metric Learning
7 0.082205251 128 acl-2010-Grammar Prototyping and Testing with the LinGO Grammar Matrix Customization System
8 0.072370499 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
9 0.066208087 158 acl-2010-Latent Variable Models of Selectional Preference
10 0.064262919 51 acl-2010-Bilingual Sense Similarity for Statistical Machine Translation
11 0.06269972 238 acl-2010-Towards Open-Domain Semantic Role Labeling
12 0.062321499 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
13 0.060509022 228 acl-2010-The Importance of Rule Restrictions in CCG
14 0.059380382 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
15 0.058830019 217 acl-2010-String Extension Learning
16 0.05698619 87 acl-2010-Discriminative Modeling of Extraction Sets for Machine Translation
17 0.055743612 263 acl-2010-Word Representations: A Simple and General Method for Semi-Supervised Learning
18 0.05489086 59 acl-2010-Cognitively Plausible Models of Human Language Processing
19 0.054507267 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms
20 0.054424915 264 acl-2010-Wrapping up a Summary: From Representation to Generation
topicId topicWeight
[(0, -0.172), (1, 0.04), (2, -0.001), (3, -0.026), (4, 0.011), (5, -0.065), (6, 0.063), (7, -0.002), (8, 0.107), (9, -0.032), (10, -0.058), (11, 0.068), (12, 0.097), (13, -0.022), (14, -0.079), (15, 0.063), (16, 0.011), (17, -0.02), (18, -0.031), (19, 0.03), (20, 0.142), (21, 0.028), (22, -0.062), (23, 0.024), (24, 0.083), (25, 0.048), (26, -0.112), (27, -0.02), (28, 0.069), (29, -0.069), (30, -0.049), (31, -0.09), (32, 0.029), (33, 0.13), (34, 0.015), (35, 0.068), (36, -0.041), (37, -0.224), (38, -0.01), (39, -0.005), (40, -0.126), (41, 0.049), (42, 0.029), (43, -0.044), (44, 0.075), (45, 0.038), (46, 0.025), (47, -0.128), (48, -0.027), (49, -0.096)]
simIndex simValue paperId paperTitle
same-paper 1 0.95304906 66 acl-2010-Compositional Matrix-Space Models of Language
Author: Sebastian Rudolph ; Eugenie Giesbrecht
Abstract: We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
2 0.72693515 232 acl-2010-The S-Space Package: An Open Source Package for Word Space Models
Author: David Jurgens ; Keith Stevens
Abstract: We present the S-Space Package, an open source framework for developing and evaluating word space algorithms. The package implements well-known word space algorithms, such as LSA, and provides a comprehensive set of matrix utilities and data structures for extending new or existing models. The package also includes word space benchmarks for evaluation. Both algorithms and libraries are designed for high concurrency and scalability. We demonstrate the efficiency of the reference implementations and also provide their results on six benchmarks.
3 0.57637298 263 acl-2010-Word Representations: A Simple and General Method for Semi-Supervised Learning
Author: Joseph Turian ; Lev-Arie Ratinov ; Yoshua Bengio
Abstract: If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features, for off-the-shelf use in existing NLP systems, as well as our code, here: http ://metaoptimize com/proj ects/wordreprs/ .
4 0.57431424 205 acl-2010-SVD and Clustering for Unsupervised POS Tagging
Author: Michael Lamar ; Yariv Maron ; Mark Johnson ; Elie Bienenstock
Abstract: We revisit the algorithm of Schütze (1995) for unsupervised part-of-speech tagging. The algorithm uses reduced-rank singular value decomposition followed by clustering to extract latent features from context distributions. As implemented here, it achieves state-of-the-art tagging accuracy at considerably less cost than more recent methods. It can also produce a range of finer-grained taggings, with potential applications to various tasks. 1
5 0.55945402 182 acl-2010-On the Computational Complexity of Dominance Links in Grammatical Formalisms
Author: Sylvain Schmitz
Abstract: Dominance links were introduced in grammars to model long distance scrambling phenomena, motivating the definition of multiset-valued linear indexed grammars (MLIGs) by Rambow (1994b), and inspiring quite a few recent formalisms. It turns out that MLIGs have since been rediscovered and reused in a variety of contexts, and that the complexity of their emptiness problem has become the key to several open questions in computer science. We survey complexity results and open issues on MLIGs and related formalisms, and provide new complexity bounds for some linguistically motivated restrictions.
6 0.53714138 183 acl-2010-Online Generation of Locality Sensitive Hash Signatures
7 0.53037512 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
8 0.47494647 107 acl-2010-Exemplar-Based Models for Word Meaning in Context
9 0.4733839 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two
10 0.46746278 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
11 0.46353722 95 acl-2010-Efficient Inference through Cascades of Weighted Tree Transducers
12 0.46216288 7 acl-2010-A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices
13 0.45950544 117 acl-2010-Fine-Grained Genre Classification Using Structural Learning Algorithms
14 0.43266562 222 acl-2010-SystemT: An Algebraic Approach to Declarative Information Extraction
15 0.42798746 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
16 0.41858143 67 acl-2010-Computing Weakest Readings
17 0.41417244 161 acl-2010-Learning Better Data Representation Using Inference-Driven Metric Learning
18 0.40713739 248 acl-2010-Unsupervised Ontology Induction from Text
19 0.39437407 144 acl-2010-Improved Unsupervised POS Induction through Prototype Discovery
20 0.39013851 238 acl-2010-Towards Open-Domain Semantic Role Labeling
topicId topicWeight
[(7, 0.013), (14, 0.018), (25, 0.051), (33, 0.016), (39, 0.013), (42, 0.024), (44, 0.01), (59, 0.116), (65, 0.3), (71, 0.011), (73, 0.034), (76, 0.012), (78, 0.072), (83, 0.064), (84, 0.058), (98, 0.105)]
simIndex simValue paperId paperTitle
same-paper 1 0.77209055 66 acl-2010-Compositional Matrix-Space Models of Language
Author: Sebastian Rudolph ; Eugenie Giesbrecht
Abstract: We propose CMSMs, a novel type of generic compositional models for syntactic and semantic aspects of natural language, based on matrix multiplication. We argue for the structural and cognitive plausibility of this model and show that it is able to cover and combine various common compositional NLP approaches ranging from statistical word space models to symbolic grammar formalisms.
2 0.64192581 93 acl-2010-Dynamic Programming for Linear-Time Incremental Parsing
Author: Liang Huang ; Kenji Sagae
Abstract: Incremental parsing techniques such as shift-reduce have gained popularity thanks to their efficiency, but there remains a major problem: the search is greedy and only explores a tiny fraction of the whole space (even with beam search) as opposed to dynamic programming. We show that, surprisingly, dynamic programming is in fact possible for many shift-reduce parsers, by merging “equivalent” stacks based on feature values. Empirically, our algorithm yields up to a five-fold speedup over a state-of-the-art shift-reduce depen- dency parser with no loss in accuracy. Better search also leads to better learning, and our final parser outperforms all previously reported dependency parsers for English and Chinese, yet is much faster.
3 0.5576967 1 acl-2010-"Ask Not What Textual Entailment Can Do for You..."
Author: Mark Sammons ; V.G.Vinod Vydiswaran ; Dan Roth
Abstract: We challenge the NLP community to participate in a large-scale, distributed effort to design and build resources for developing and evaluating solutions to new and existing NLP tasks in the context of Recognizing Textual Entailment. We argue that the single global label with which RTE examples are annotated is insufficient to effectively evaluate RTE system performance; to promote research on smaller, related NLP tasks, we believe more detailed annotation and evaluation are needed, and that this effort will benefit not just RTE researchers, but the NLP community as a whole. We use insights from successful RTE systems to propose a model for identifying and annotating textual infer- ence phenomena in textual entailment examples, and we present the results of a pilot annotation study that show this model is feasible and the results immediately useful.
4 0.53926861 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
Author: Stefan Thater ; Hagen Furstenau ; Manfred Pinkal
Abstract: We present a syntactically enriched vector model that supports the computation of contextualized semantic representations in a quasi compositional fashion. It employs a systematic combination of first- and second-order context vectors. We apply our model to two different tasks and show that (i) it substantially outperforms previous work on a paraphrase ranking task, and (ii) achieves promising results on a wordsense similarity task; to our knowledge, it is the first time that an unsupervised method has been applied to this task.
5 0.53911912 158 acl-2010-Latent Variable Models of Selectional Preference
Author: Diarmuid O Seaghdha
Abstract: This paper describes the application of so-called topic models to selectional preference induction. Three models related to Latent Dirichlet Allocation, a proven method for modelling document-word cooccurrences, are presented and evaluated on datasets of human plausibility judgements. Compared to previously proposed techniques, these models perform very competitively, especially for infrequent predicate-argument combinations where they exceed the quality of Web-scale predictions while using relatively little data.
6 0.53285408 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
7 0.52983928 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
8 0.52773893 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
9 0.52733552 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
10 0.52715892 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences
11 0.52643317 130 acl-2010-Hard Constraints for Grammatical Function Labelling
12 0.52536696 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
13 0.52454293 169 acl-2010-Learning to Translate with Source and Target Syntax
14 0.52430397 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences
15 0.52422196 59 acl-2010-Cognitively Plausible Models of Human Language Processing
16 0.52277499 216 acl-2010-Starting from Scratch in Semantic Role Labeling
17 0.52214754 162 acl-2010-Learning Common Grammar from Multilingual Corpus
18 0.52074313 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
19 0.51959306 17 acl-2010-A Structured Model for Joint Learning of Argument Roles and Predicate Senses
20 0.51953071 248 acl-2010-Unsupervised Ontology Induction from Text