
80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context


Source: pdf

Author: Tim Van de Cruys ; Thierry Poibeau ; Anna Korhonen

Abstract: This paper presents a novel method for the computation of word meaning in context. We make use of a factorization model in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. The evaluation on a lexical substitution task, carried out for both English and French, indicates that our approach is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We make use of a factorization model in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. [sent-8, score-0.629]

2 The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. [sent-9, score-0.632]

3 The evaluation on a lexical substitution task, carried out for both English and French, indicates that our approach is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations. [sent-10, score-0.519]

4 Up till now, the majority of computational approaches to semantic similarity represent the meaning of a word as the aggregate of the word’s contexts, and hence do not differentiate between the different senses of a word. [sent-13, score-0.322]

5 , 2006) and graded sense assignment (Erk and McCarthy, 2009), over word sense induction (Schütze, 1998; Pantel and Lin, 2002; Agirre et al. [sent-26, score-0.288]

6 , 2006), to the computation of individual word meaning in context (Erk and Padó, 2008; Thater et al. [sent-27, score-0.297]

7 To be able to do so, we build a factorization model in which words, together with their window-based context words and their dependency [...] (page footer: Proceedings of the 2011 Conference on Empirical M[...], Edinburgh, Scotland, UK, July 27–31, 2011.) [sent-30, score-0.504]

8 The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. [sent-33, score-0.632]

9 The evaluation on a lexical substitution task carried out for both English and French indicates that our method is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations. [sent-34, score-0.519]

10 Section 3 describes the methodology of our method, focusing on the factorization model, and the computation of meaning in context. [sent-37, score-0.502]

11 Section 4 presents a thorough evaluation on a lexical substitution task, both for English and French. [sent-38, score-0.261]

12 2 Related work: One of the best known computational models of semantic similarity is latent semantic analysis, LSA (Landauer and Dumais, 1997; Landauer et al. [sent-40, score-0.28]

13 This matrix is then decomposed into three other matrices with a mathematical factorization technique called singular value decomposition (SVD). [sent-43, score-0.519]

14 The most important dimensions that come out of the SVD are said to represent latent semantic dimensions, according to which nouns and documents can be represented more efficiently. [sent-44, score-0.351]

15 Our model also applies a factorization technique (albeit a different one) in order to find a reduced semantic space. [sent-45, score-0.366]

16 A number of researchers have exploited the notion of context to differentiate between the different senses of a word in an unsupervised way (a task labeled word sense induction or WSI). [sent-54, score-0.388]

17 Van de Cruys (2008) proposed a model for sense induction based on latent semantic dimensions. [sent-59, score-0.348]

18 Using a factorization technique based on non-negative matrix factorization, the model induces a latent semantic space according to which both dependency features and broad contextual features are classified. [sent-60, score-0.723]

19 Using the latent space, the model is able to discriminate between different word senses. [sent-61, score-0.243]

20 Our approach makes use of a similar factorization model, but we extend the approach with a probabilistic framework that is able to adapt the original vector according to the context of the instance. [sent-62, score-0.567]

21 Erk and Padó (2008, 2009) make use of selectional preferences to express the meaning of a word in context; the meaning of a word in the presence of an argument is computed by multiplying the word’s vector with a vector that captures the inverse selectional preferences of the argument. [sent-64, score-0.526]

22 (2010) extend the approach based on selectional preferences by incorporating second-order co-occurrences in their model; their model allows first-order co-occurrences to act as a filter upon the second-order vector space, which allows for the computation of meaning in context. [sent-67, score-0.34]

23 Erk and Padó (2010) propose an exemplar-based approach, in which the meaning of a word in context is represented by the activated exemplars that are most similar to it. [sent-68, score-0.248]

24 Finally, Dinu and Lapata (2010) propose a probabilistic framework that models the meaning of words as a probability distribution over latent dimensions (‘senses’). [sent-71, score-0.411]

25 This allows for a more precise and more distinct computation of word meaning in context. [sent-74, score-0.246]

26 Secondly, Dinu and Lapata use window-based context features to build their latent model, while our approach combines both window-based and dependency-based features. [sent-75, score-0.321]

27 1 Non-negative Matrix Factorization: Our model uses non-negative matrix factorization (Lee and Seung, 2000) in order to find latent dimensions. [sent-77, score-0.596]

28 Secondly, the non-negative nature of the factorization ensures that only additive and no subtractive relations are allowed. [sent-82, score-0.327]

29 Non-negative matrix factorization enforces the constraint that all three matrices must be non-negative, so all elements must be greater than or equal to zero. [sent-86, score-0.519]

30 This factorization is carried out through the iterative application of update rules. [sent-88, score-0.327]

31 2 Combining syntax and context words: Using an extension of non-negative matrix factorization (Van de Cruys, 2008), it is possible to jointly induce latent factors for three different modes: words, their window-based context words, and their dependency-based context features. [sent-92, score-0.92]

32 the results of the former factorization are used to initialize the factorization of the next matrix). [sent-97, score-0.654]

33 A graphical representation of the interleaved factorization algorithm is given in figure 1. [sent-98, score-0.417]

34 When the factorization is finished, the three different modes (words, window-based context words and dependency-based features) are all represented according to a limited number of latent factors. [sent-99, score-0.625]

35 Figure 1: A graphical representation of the interleaved NMF. The factorization that comes out of the NMF model can be interpreted probabilistically (Gaussier and Goutte, 2005; Ding et al. [sent-100, score-0.417]

36 More specifically, we can transform the factorization into a standard latent variable model of the form $p(w_i, d_j) = \sum_{z=1}^{K} p(z)\, p(w_i|z)\, p(d_j|z)$ (4) by introducing two $K \times K$ diagonal scaling matrices X and Y, such that $X_{kk} = \sum_i W_{ik}$ and $Y_{kk} = \sum_j H_{kj}$. [sent-102, score-0.58]

37 1 Overview: Using the results of the factorization model described above, we can now adapt a word’s feature vector according to the context in which it appears. [sent-108, score-0.465]

38 the window-based context words or dependency-based context features) pinpoint the important semantic dimensions of the particular instance, creating a probability distribution over latent factors. [sent-111, score-0.492]

39 For a number of context words of a particular instance, we determine the probability distribution over latent factors given the context, p(z|C), as the average of the context words’ probability distributions over latent factors (equation 8). [sent-112, score-0.742]

40 $p(z|C) = \frac{\sum_{w_i \in C} p(z|w_i)}{|C|}$ (8) The probability distribution over latent factors given a number of dependency-based context features can be computed in a similar fashion, replacing $w_i$ with $d_j$. [sent-113, score-0.416]

41 Additionally, this step allows us to combine both window-based context words and dependency-based context features in order to determine the latent probability distribution (e. [sent-114, score-0.474]

42 The resulting probability distribution over latent factors can be interpreted as a semantic fingerprint of the passage in which the target word appears. [sent-117, score-0.435]

43 p(d|C) = p(z|C)p(d|z) (9) The last step is to weight the original probability vector of the word according to the probability vector of the dependency features given the word’s context, by taking the pointwise multiplication of probability vectors p(d|wi) and p(d|C). [sent-119, score-0.357]

44 We do not just build a model based on latent factors, but we use the latent factors to determine which of the features in the original word vector are the salient ones given a particular context. [sent-121, score-0.534]

45 First of all, it allows us to take multiple context features into account, each of which contributes to the probability distribution over latent factors. [sent-125, score-0.316]

46 1 we compute p(z|C1) and p(z|C2), the probability distributions over latent factors given the context, by averaging over the latent probability distributions of the individual context features. [sent-136, score-0.448]

47 2 Using these probability distributions over latent factors, we can now determine the probability of each dependency feature given the different contexts, p(d|C1) and p(d|C2). [sent-137, score-0.359]

48 The former step yields a general probability distribution over dependency features that tells us how likely a particular dependency feature is given the context that our target word appears in. [sent-138, score-0.359]

49 Our last step is now to weight the original probability vector of the target word (the aggregate of dependency-based context features over all contexts of the target word) according to the new distribution given the context in which the target word appears. [sent-139, score-0.516]

50 2 In this case, the sets of context features contain only one item, so the average probability distribution of the sets is just the latent probability distribution of their respective item. [sent-141, score-0.317]

51 features associated with unrelated latent factors are leveled out. [sent-142, score-0.237]

52 For the second sentence, features that are associated with the administrative sense of record (dependency features associated with latent factors that are related to the feature [...]) are emphasized, while unrelated features are played down. [sent-143, score-0.342]

53 The task’s goal is to find suitable substitutes for a target word in a particular context. [sent-151, score-0.27]

54 Five annotators provided suitable substitutes for each target word in the different contexts. [sent-155, score-0.27]

55 For French, we developed a small-scale lexical substitution task ourselves, closely following the guidelines of the original English task. [sent-156, score-0.261]

56 Four different native French speakers were then asked to provide suitable substitutes for the nouns in context. [sent-159, score-0.248]

57 The sentences of the English lexical substitution task have been tagged, lemmatized and parsed in the same way. [sent-167, score-0.261]

58 For each model, the matrices needed for our interleaved NMF factorization are extracted from the corpus. [sent-170, score-0.505]

59 For French, we only constructed a model for nouns, as our lexical substitution task for French is limited to this part of speech. [sent-173, score-0.261]

60 The interleaved NMF model was carried out using K = 600 (the number of factorized dimensions in the model), and applying 100 iterations. [sent-174, score-0.225]

61 6 The interleaved NMF algorithm was implemented in Matlab; the preprocessing scripts and scripts for vector computation in context were written in Python. [sent-175, score-0.285]

62 This means that all possible substitutes for a given target word (extracted from the gold standard) are lumped together, and the system then has to produce a ranking for the complete set of substitutes. [sent-183, score-0.352]

63 We also adopt this approach in our evaluation framework, but we complement it with the original evaluation measures of the lexical substitution task, in which the system is not given a list of possible substitutes beforehand, but has to come up with the suitable candidates itself. [sent-184, score-0.448]

64 We coin the former approach paraphrase ranking, and the latter one paraphrase induction. [sent-186, score-0.458]

65 Paraphrase ranking Following Dinu and Lapata (2010), we compare the ranking produced by our model with the gold standard ranking using Kendall’s τb (which is adjusted for ties). [sent-188, score-0.246]

66 We compare the results for paraphrase ranking to two different baselines. [sent-192, score-0.311]

67 The second baseline is a dependency-based vector space model that does not take the context of the particular instance into account (and thus returns the same ranking for each instance of the target word). [sent-194, score-0.308]

68 Paraphrase induction To evaluate the system’s ability to come up with suitable substitutes from scratch, we use the measures designed to evaluate systems that took part in the original English lexical substitution task (McCarthy and Navigli, 2007). [sent-197, score-0.528]

69 The strict best measure allows the system to give as many candidate substitutes as it considers appropriate, but the credit for each correct substitute is divided by the total number of guesses. [sent-199, score-0.22]

70 The more liberal measure was introduced to account for the fact that the lexical substitution task’s gold standard is susceptible to a considerable amount of variation, and there is only a limited number of annotators. [sent-203, score-0.261]

71 1 English: Table 1 presents the paraphrase ranking results of our approach, comparing them to the two baselines and to a number of previous approaches to meaning computation in context. [sent-209, score-0.486]

72 Table 1: Kendall’s τb and GAP paraphrase ranking scores for the English lexical substitution task. [sent-239, score-0.572]

73 The second baseline is a standard dependency-based vector space model, which yields the same ranking for all instances of a target word. [sent-246, score-0.276]

74 8 Note that the reproduced results (EP09, EP10 and TFP) are not entirely comparable, because the authors only use a subset of the lexical substitution task. [sent-252, score-0.261]

75 Table 2: Kendall’s τb paraphrase ranking scores for the English lexical substitution task across different parts of speech. Table 2 shows the performance of the three model instantiations on paraphrase ranking across different parts of speech. [sent-278, score-0.932]

76 Table 3: Rbest and P10 paraphrase induction scores for the English lexical substitution task. Table 3 shows the performance of the different models on the paraphrase induction task. [sent-298, score-0.879]

77 Note once again that our baseline vectordep, a simple dependency-based vector space model, is a highly competitive one. [sent-299, score-0.277]

78 The results indicate that our approach, after vector adaptation in context, is still able to provide accurate similarity calculations across the complete word space. [sent-302, score-0.234]

79 While other algorithms are able to rank candidate substitutes at the expense of accurate similarity calculations, our approach is able to do both. [sent-303, score-0.304]

80 For reasons of comparison, we also included the scores of the best performing models that participated in the SEMEVAL 2007 lexical substitution task (KU (Yuret, 2007) and IRST2 (Giuliano et al. [sent-305, score-0.261]

81 Note, however, that all participants of the SEMEVAL 2007 lexical substitution task relied on a predefined sense inventory (i. [sent-308, score-0.325]

82 To our knowledge, this is the first time a fully unsupervised system is tested on the paraphrase induction task. [sent-312, score-0.309]

83 Table 4: P10 paraphrase induction scores for the English lexical substitution task across different parts of speech. [sent-337, score-0.57]

84 Table 4 presents the results for paraphrase induction (oot) across the different parts of speech. [sent-344, score-0.309]

85 The results indicate that paraphrase induction works best for nouns and verbs, with statistically significant improvements over the baseline. [sent-345, score-0.37]

86 Note, however, that the NMFcontext model is still quite apt for meaning computation, yielding results that are only slightly lower than the dependency-based vector space model. [sent-348, score-0.223]

87 2 French: This section presents the results on the French lexical substitution task. [sent-351, score-0.261]

88 Table 5 presents the results for paraphrase ranking, while table 6 shows the models’ performance on the paraphrase induction task. [sent-352, score-0.538]

89 Finally, the results for paraphrase induction in French (table 6) interestingly show a significant and large improvement over the baseline. [sent-368, score-0.309]

90 The improvements indicate once again that the models are able to carry out precise similarity computations over the whole word space, while at the same time providing an adequately adapted contextualized meaning vector. [sent-369, score-0.322]

91 Dinu and Lapata’s model, which performs similarity calculations in the latent space, is not able to provide accurate word vectors, and thus performs worse at the paraphrase induction task. [sent-370, score-0.646]

92 Table 6: Rbest and P10 paraphrase induction scores for the French lexical substitution task. 5 Conclusion: In this paper, we presented a novel method for the modeling of word meaning in context. [sent-381, score-0.734]

93 We make use of a factorization model based on non-negative matrix factorization, in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. [sent-382, score-0.733]

94 The factorization model allows us to determine which particular dimensions are important for a target word in a particular context. [sent-383, score-0.561]

95 A key feature of the algorithm is that we adapt the original dependency-based feature vector of the target word through the latent semantic space. [sent-384, score-0.403]

96 By doing so, our model is able to make accurate similarity calculations for word meaning in context across the whole word space. [sent-385, score-0.42]

97 Our evaluation shows that the approach presented here is able to improve upon the state-of-the-art performance on paraphrase ranking. [sent-386, score-0.269]

98 Moreover, our approach scores well for both paraphrase ranking and paraphrase induction, whereas previous approaches only seem capable of improving performance on the former task at the expense of the latter. [sent-387, score-0.54]

99 On the equivalence between non-negative matrix factorization and probabilistic latent semantic indexing. [sent-405, score-0.635]

100 A structured vector space model for word meaning in context. [sent-417, score-0.261]
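
The weighting procedure described in sentences 39 to 43 above (equations 8 and 9, followed by the pointwise multiplication) can be illustrated with a small NumPy sketch. This is only a sketch under stated assumptions, not the authors' implementation: the function name `weight_in_context`, the array layout, and all toy numbers are invented for illustration, and the result is deliberately left unnormalized because the summary only mentions a pointwise multiplication.

```python
import numpy as np

def weight_in_context(p_d_given_w, p_z_given_items, p_d_given_z):
    """Adapt a word's dependency-feature vector to one context (hypothetical helper).

    p_d_given_w     : (D,)   p(d|w_i), the word's original feature distribution
    p_z_given_items : (n, K) p(z|c) for each of the n context items in C
    p_d_given_z     : (K, D) p(d|z), one row per latent factor
    """
    # Equation 8: average the latent distributions of the context items.
    p_z_given_C = p_z_given_items.mean(axis=0)
    # Equation 9: map the latent distribution onto dependency features.
    p_d_given_C = p_z_given_C @ p_d_given_z
    # Last step: pointwise multiplication of p(d|w_i) and p(d|C).
    adapted = p_d_given_w * p_d_given_C
    return p_z_given_C, p_d_given_C, adapted

# Toy values (assumed, not taken from the paper): K = 2 factors, D = 3 features.
p_d_given_z = np.array([[0.8, 0.2, 0.0],
                        [0.0, 0.2, 0.8]])
p_z_given_items = np.array([[1.0, 0.0],   # first context item
                            [0.5, 0.5]])  # second context item
p_d_given_w = np.array([0.4, 0.2, 0.4])

p_z_C, p_d_C, adapted = weight_in_context(p_d_given_w, p_z_given_items, p_d_given_z)
print(p_z_C, p_d_C, adapted)
```

In this toy setting the context leans towards the first latent factor, so the feature tied to that factor keeps most of its weight while the feature tied to the second factor is played down, which mirrors the behaviour described in sentences 51 and 52.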


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('factorization', 0.327), ('dinu', 0.234), ('paraphrase', 0.229), ('substitution', 0.217), ('nmfcontext', 0.209), ('erk', 0.201), ('nmfc', 0.188), ('substitutes', 0.187), ('pado', 0.179), ('nmfdep', 0.167), ('latent', 0.165), ('nmf', 0.163), ('french', 0.152), ('rbest', 0.146), ('vectordep', 0.146), ('thater', 0.13), ('meaning', 0.126), ('lapata', 0.117), ('matrix', 0.104), ('interleaved', 0.09), ('matrices', 0.088), ('dimensions', 0.086), ('mccarthy', 0.085), ('context', 0.084), ('ranking', 0.082), ('contextualized', 0.081), ('induction', 0.08), ('kendall', 0.075), ('factors', 0.072), ('windowbased', 0.072), ('svd', 0.071), ('dl', 0.071), ('katrin', 0.07), ('sense', 0.064), ('schu', 0.063), ('tfp', 0.063), ('vector', 0.062), ('wi', 0.061), ('nouns', 0.061), ('calculations', 0.057), ('adapt', 0.054), ('cruys', 0.054), ('dependency', 0.053), ('yields', 0.052), ('dj', 0.051), ('landauer', 0.051), ('senses', 0.049), ('modes', 0.049), ('factorized', 0.049), ('computation', 0.049), ('diana', 0.049), ('instantiations', 0.049), ('reach', 0.048), ('target', 0.045), ('ku', 0.045), ('lexical', 0.044), ('dependencybased', 0.042), ('divergence', 0.042), ('crossclassified', 0.042), ('fingerprint', 0.042), ('graded', 0.042), ('listenp', 0.042), ('oot', 0.042), ('rceal', 0.042), ('recordn', 0.042), ('tendencies', 0.042), ('updateo', 0.042), ('villemonte', 0.042), ('contexts', 0.041), ('record', 0.041), ('adverbs', 0.041), ('multiplication', 0.04), ('able', 0.04), ('sebastian', 0.04), ('semantic', 0.039), ('word', 0.038), ('semeval', 0.038), ('navigli', 0.038), ('gap', 0.038), ('english', 0.038), ('selectional', 0.037), ('similarity', 0.037), ('baroni', 0.037), ('ens', 0.036), ('ide', 0.036), ('georgiana', 0.036), ('researchers', 0.035), ('secondly', 0.035), ('space', 0.035), ('probability', 0.034), ('competitive', 0.034), ('allows', 0.033), ('toutanova', 0.033), ('thierry', 0.033), ('till', 0.033), ('beforehand', 0.033), ('emphasized', 0.033), ('gaussier', 0.033), ('giuliano', 
0.033), ('determine', 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000031 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context

Author: Tim Van de Cruys ; Thierry Poibeau ; Anna Korhonen

Abstract: This paper presents a novel method for the computation of word meaning in context. We make use of a factorization model in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. The evaluation on a lexical substitution task, carried out for both English and French, indicates that our approach is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations.

2 0.33207965 107 emnlp-2011-Probabilistic models of similarity in syntactic context

Author: Diarmuid O Seaghdha ; Anna Korhonen

Abstract: This paper investigates novel methods for incorporating syntactic information in probabilistic latent variable models of lexical choice and contextual similarity. The resulting models capture the effects of context on the interpretation of a word and in particular its effect on the appropriateness of replacing that word with a potentially related one. Evaluating our techniques on two datasets, we report performance above the prior state of the art for estimating sentence similarity and ranking lexical substitutes.

3 0.13426964 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing

Author: Prodromos Malakasiotis ; Ion Androutsopoulos

Abstract: We present a method that paraphrases a given sentence by first generating candidate paraphrases and then ranking (or classifying) them. The candidates are generated by applying existing paraphrasing rules extracted from parallel corpora. The ranking component considers not only the overall quality of the rules that produced each candidate, but also the extent to which they preserve grammaticality and meaning in the particular context of the input sentence, as well as the degree to which the candidate differs from the input. We experimented with both a Maximum Entropy classifier and an SVR ranker. Experimental results show that incorporating features from an existing paraphrase recognizer in the ranking component improves performance, and that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the input sentences. We also propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. The paper is accompanied by a paraphrasing dataset we constructed for evaluations of this kind.

4 0.13382365 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

Author: Edward Grefenstette ; Mehrnoosh Sadrzadeh

Abstract: Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (2010) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model.

5 0.12447324 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels

Author: Chao Shen ; Tao Li

Abstract: In active dual supervision, not only informative examples but also features are selected for labeling to build a high quality classifier with low cost. However, how to measure the informativeness for both examples and feature on the same scale has not been well solved. In this paper, we propose a non-negative matrix factorization based approach to address this issue. We first extend the matrix factorization framework to explicitly model the corresponding relationships between feature classes and examples classes. Then by making use of the reconstruction error, we propose a unified scheme to determine which feature or example a classifier is most likely to benefit from having labeled. Empirical results demonstrate the effectiveness of our proposed methods.

6 0.11587283 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

7 0.077128381 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions

8 0.071368068 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases

9 0.068779141 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis

10 0.066404767 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning

11 0.062314931 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

12 0.061044555 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

13 0.060956001 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

14 0.058468122 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification

15 0.058036111 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation

16 0.056728397 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

17 0.0565589 10 emnlp-2011-A Probabilistic Forest-to-String Model for Language Generation from Typed Lambda Calculus Expressions

18 0.056335557 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

19 0.056109618 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models

20 0.055718567 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.23), (1, -0.055), (2, -0.122), (3, -0.106), (4, 0.076), (5, 0.049), (6, -0.118), (7, 0.187), (8, 0.115), (9, 0.144), (10, -0.05), (11, -0.138), (12, 0.255), (13, -0.139), (14, -0.163), (15, -0.027), (16, 0.145), (17, 0.253), (18, -0.001), (19, 0.074), (20, 0.067), (21, -0.1), (22, -0.014), (23, -0.018), (24, 0.043), (25, 0.022), (26, -0.03), (27, 0.005), (28, -0.134), (29, 0.061), (30, 0.21), (31, -0.194), (32, 0.026), (33, 0.059), (34, -0.011), (35, 0.098), (36, 0.058), (37, -0.04), (38, -0.067), (39, 0.016), (40, 0.045), (41, 0.035), (42, -0.106), (43, -0.022), (44, 0.033), (45, -0.004), (46, -0.062), (47, 0.067), (48, 0.0), (49, 0.023)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.93675888 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context

Author: Tim Van de Cruys ; Thierry Poibeau ; Anna Korhonen

Abstract: This paper presents a novel method for the computation of word meaning in context. We make use of a factorization model in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. The evaluation on a lexical substitution task, carried out for both English and French, indicates that our approach is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations.

2 0.85652661 107 emnlp-2011-Probabilistic models of similarity in syntactic context

Author: Diarmuid O Seaghdha ; Anna Korhonen

Abstract: This paper investigates novel methods for incorporating syntactic information in probabilistic latent variable models of lexical choice and contextual similarity. The resulting models capture the effects of context on the interpretation of a word and in particular its effect on the appropriateness of replacing that word with a potentially related one. Evaluating our techniques on two datasets, we report performance above the prior state of the art for estimating sentence similarity and ranking lexical substitutes.

3 0.67236471 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

Author: Edward Grefenstette ; Mehrnoosh Sadrzadeh

Abstract: Modelling compositional meaning for sentences using empirical distributional methods has been a challenge for computational linguists. We implement the abstract categorical model of Coecke et al. (2010) using data from the BNC and evaluate it. The implementation is based on unsupervised learning of matrices for relational words and applying them to the vectors of their arguments. The evaluation is based on the word disambiguation task developed by Mitchell and Lapata (2008) for intransitive sentences, and on a similar new experiment designed for transitive sentences. Our model matches the results of its competitors in the first experiment, and betters them in the second. The general improvement in results with increase in syntactic complexity showcases the compositional power of our model.

4 0.40541923 6 emnlp-2011-A Generate and Rank Approach to Sentence Paraphrasing

Author: Prodromos Malakasiotis ; Ion Androutsopoulos

Abstract: We present a method that paraphrases a given sentence by first generating candidate paraphrases and then ranking (or classifying) them. The candidates are generated by applying existing paraphrasing rules extracted from parallel corpora. The ranking component considers not only the overall quality of the rules that produced each candidate, but also the extent to which they preserve grammaticality and meaning in the particular context of the input sentence, as well as the degree to which the candidate differs from the input. We experimented with both a Maximum Entropy classifier and an SVR ranker. Experimental results show that incorporating features from an existing paraphrase recognizer in the ranking component improves performance, and that our overall method compares well against a state of the art paraphrase generator, when paraphrasing rules apply to the input sentences. We also propose a new methodology to evaluate the ranking components of generate-and-rank paraphrase generators, which evaluates them across different combinations of weights for grammaticality, meaning preservation, and diversity. The paper is accompanied by a paraphrasing dataset we constructed for evaluations of this kind.

5 0.33077428 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels

Author: Chao Shen ; Tao Li

Abstract: In active dual supervision, not only informative examples but also features are selected for labeling to build a high-quality classifier at low cost. However, how to measure the informativeness of both examples and features on the same scale has not been well solved. In this paper, we propose a non-negative matrix factorization based approach to address this issue. We first extend the matrix factorization framework to explicitly model the corresponding relationships between feature classes and example classes. Then, by making use of the reconstruction error, we propose a unified scheme to determine which feature or example a classifier is most likely to benefit from having labeled. Empirical results demonstrate the effectiveness of our proposed methods.

6 0.32477677 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

7 0.31785157 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming

8 0.31461322 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

9 0.29519165 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases

10 0.2901184 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics

11 0.2898798 30 emnlp-2011-Compositional Matrix-Space Models for Sentiment Analysis

12 0.28746647 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition

13 0.28082705 73 emnlp-2011-Improving Bilingual Projections via Sparse Covariance Matrices

14 0.26653469 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP

15 0.25164622 91 emnlp-2011-Literal and Metaphorical Sense Identification through Concrete and Abstract Context

16 0.24653412 55 emnlp-2011-Exploiting Syntactic and Distributional Information for Spelling Correction with Web-Scale N-gram Models

17 0.24565093 102 emnlp-2011-Parse Correction with Specialized Models for Difficult Attachment Types

18 0.24333625 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

19 0.242283 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models

20 0.23311017 127 emnlp-2011-Structured Lexical Similarity via Convolution Kernels on Dependency Trees


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(23, 0.071), (36, 0.015), (37, 0.019), (45, 0.056), (53, 0.019), (54, 0.024), (57, 0.018), (62, 0.014), (64, 0.016), (66, 0.534), (69, 0.013), (79, 0.041), (82, 0.017), (87, 0.011), (96, 0.036), (98, 0.016)]
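
The page does not say how simValue is derived from these topic weights. As a rough sketch only, assuming cosine similarity (a common choice, not confirmed by the source), two sparse (topicId, topicWeight) lists could be compared as follows. The function name cosine_sparse and the second paper's weights are hypothetical, introduced purely for illustration.

```python
import math


def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors given as (topicId, weight) pairs."""
    da, db = dict(a), dict(b)
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(w * db[t] for t, w in da.items() if t in db)
    norm_a = math.sqrt(sum(w * w for w in da.values()))
    norm_b = math.sqrt(sum(w * w for w in db.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Topic weights listed above for this paper.
this_paper = [
    (23, 0.071), (36, 0.015), (37, 0.019), (45, 0.056), (53, 0.019),
    (54, 0.024), (57, 0.018), (62, 0.014), (64, 0.016), (66, 0.534),
    (69, 0.013), (79, 0.041), (82, 0.017), (87, 0.011), (96, 0.036),
    (98, 0.016),
]

# Hypothetical weights for another paper (made up, not taken from the page).
other_paper = [(5, 0.2), (23, 0.1), (66, 0.4)]

if __name__ == "__main__":
    print(cosine_sparse(this_paper, other_paper))
```

Because topic 66 dominates this paper's vector, any paper that also puts most of its weight on topic 66 would score high under this measure, whatever its actual subject.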

similar papers list:

simIndex simValue paperId paperTitle

1 0.98549968 60 emnlp-2011-Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation

Author: Jason Riesa ; Ann Irvine ; Daniel Marcu

Abstract: unknown-abstract

same-paper 2 0.88601571 80 emnlp-2011-Latent Vector Weighting for Word Meaning in Context

Author: Tim Van de Cruys ; Thierry Poibeau ; Anna Korhonen

Abstract: This paper presents a novel method for the computation of word meaning in context. We make use of a factorization model in which words, together with their window-based context words and their dependency relations, are linked to latent dimensions. The factorization model allows us to determine which dimensions are important for a particular context, and adapt the dependency-based feature vector of the word accordingly. The evaluation on a lexical substitution task, carried out for both English and French, indicates that our approach is able to reach better results than state-of-the-art methods in lexical substitution, while at the same time providing more accurate meaning representations.

3 0.86850309 141 emnlp-2011-Unsupervised Dependency Parsing without Gold Part-of-Speech Tags

Author: Valentin I. Spitkovsky ; Hiyan Alshawi ; Angel X. Chang ; Daniel Jurafsky

Abstract: We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering algorithms, our method allows a word to have different tags in different contexts. In an ablative analysis, we first demonstrate that this context-dependence is crucial to the superior performance of gold tags: requiring a word to always have the same part-of-speech significantly degrades the performance of manual tags in grammar induction, eliminating the advantage that human annotation has over unsupervised tags. We then introduce a sequence modeling technique that combines the output of a word clustering algorithm with context-colored noise, to allow words to be tagged differently in different contexts. With these new induced tags as input, our state-of-the-art dependency grammar inducer achieves 59.1% directed accuracy on Section 23 (all sentences) of the Wall Street Journal (WSJ) corpus, 0.7% higher than using gold tags.

4 0.76028854 140 emnlp-2011-Universal Morphological Analysis using Structured Nearest Neighbor Prediction

Author: Young-Bum Kim ; Joao Graca ; Benjamin Snyder

Abstract: In this paper, we consider the problem of unsupervised morphological analysis from a new angle. Past work has endeavored to design unsupervised learning methods which explicitly or implicitly encode inductive biases appropriate to the task at hand. We propose instead to treat morphological analysis as a structured prediction problem, where languages with labeled data serve as training examples for unlabeled languages, without the assumption of parallel data. We define a universal morphological feature space in which every language and its morphological analysis reside. We develop a novel structured nearest neighbor prediction method which seeks to find the morphological analysis for each unlabeled language which lies as close as possible in the feature space to a training language. We apply our model to eight inflecting languages, and induce nominal morphology with substantially higher accuracy than a traditional, MDL-based approach. Our analysis indicates that accuracy continues to improve substantially as the number of training languages increases.

5 0.49381205 107 emnlp-2011-Probabilistic models of similarity in syntactic context

Author: Diarmuid O Seaghdha ; Anna Korhonen

Abstract: This paper investigates novel methods for incorporating syntactic information in probabilistic latent variable models of lexical choice and contextual similarity. The resulting models capture the effects of context on the interpretation of a word and in particular its effect on the appropriateness of replacing that word with a potentially related one. Evaluating our techniques on two datasets, we report performance above the prior state of the art for estimating sentence similarity and ranking lexical substitutes.

6 0.47773409 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features

7 0.46247938 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases

8 0.44935566 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing

9 0.43181342 39 emnlp-2011-Discovering Morphological Paradigms from Plain Text Using a Dirichlet Process Mixture Model

10 0.43087459 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning

11 0.42459229 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics

12 0.4077425 138 emnlp-2011-Tuning as Ranking

13 0.40387723 97 emnlp-2011-Multiword Expression Identification with Tree Substitution Grammars: A Parsing tour de force with French

14 0.40326512 83 emnlp-2011-Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

15 0.38936183 67 emnlp-2011-Hierarchical Verb Clustering Using Graph Factorization

16 0.38906014 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax

17 0.38411602 54 emnlp-2011-Exploiting Parse Structures for Native Language Identification

18 0.38373277 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction

19 0.37486592 78 emnlp-2011-Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

20 0.37339956 16 emnlp-2011-Accurate Parsing with Compact Tree-Substitution Grammars: Double-DOP