acl acl2011 acl2011-198 knowledge-graph by maker-knowledge-mining

198 acl-2011-Latent Semantic Word Sense Induction and Disambiguation


Source: pdf

Author: Tim Van de Cruys ; Marianna Apidianaki

Abstract: In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. The intuition is that a particular sense is associated with a particular topic, so that different senses can be discriminated through their association with particular topical dimensions; in a similar vein, a particular instance of a word can be disambiguated by determining its most important topical dimensions. The model is evaluated on the SEMEVAL-2010 word sense induction and disambiguation task, on which it reaches state-of-the-art results.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. [sent-3, score-1.226]

2 The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. [sent-4, score-0.981]

3 The model is evaluated on the SEMEVAL-2010 word sense induction and disambiguation task, on which it reaches state-of-the-art results. [sent-6, score-0.725]

4 1 Introduction Word sense induction (WSI) is the task of automatically identifying the senses of words in texts, without the need for handcrafted resources or manually annotated data. [sent-7, score-0.857]

5 The manual construction of a sense inventory is a tedious and time-consuming job, and the result is highly dependent on the annotators and the domain at hand. [sent-8, score-0.344]

6 By applying an automatic procedure, we are able to only extract the senses that are objectively present in a particular corpus, and it allows for the sense inventory to be straightforwardly adapted to a new domain. [sent-9, score-0.796]

7 Word sense disambiguation (WSD), on the other hand, is the closely related task of assigning a sense [sent-10, score-0.703]

8 label to a particular instance of a word in context, using an existing sense inventory. [sent-12, score-0.415]

9 The bulk of WSD algorithms up till now use pre-defined sense inventories (such as WordNet) that often contain fine-grained sense distinctions, which poses serious problems for computational semantic processing (Ide and Wilks, 2007). [sent-13, score-0.699]

10 The model presented here induces the senses of words in a fully unsupervised way, and subsequently uses the induced sense inventory for the unsupervised disambiguation of particular occurrences of words. [sent-15, score-1.061]

11 The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. [sent-16, score-0.981]

12 The key idea is that the model combines tight, synonym-like similarity (based on dependency relations) with broad, topical similarity (based on a large ‘bag of words’ context window). [sent-17, score-0.354]

13 Section 2 presents some previous research on distributional similarity and word sense induction. [sent-20, score-0.392]

14 Section 3 gives an overview of our method for word sense induction and disambiguation. [sent-21, score-0.53]

15 Section 4 provides a quantitative evaluation and comparison to other algorithms in the framework of the SEMEVAL-2010 word sense [sent-22, score-0.384]

16 induction and disambiguation (WSI/WSD) task. [sent-24, score-0.355]

17 This matrix is then decomposed into three other matrices with a mathematical factorization technique called singular value decomposition (SVD). [sent-32, score-0.566]

18 The most important dimensions that come out of the SVD are said to represent latent semantic dimensions, according to which nouns and documents can be represented more efficiently. [sent-33, score-0.392]
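
To make the idea of keeping only the most important dimensions concrete, here is a minimal NumPy sketch of a truncated SVD. This is illustrative only, not the authors' implementation; the matrix shape and the number of retained dimensions k are arbitrary placeholders.

```python
import numpy as np

# Hypothetical noun-by-document co-occurrence matrix.
X = np.random.rand(100, 50)

# Full SVD: X = U @ diag(s) @ Vt, singular values sorted in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k most important dimensions: a rank-k latent
# semantic approximation of X.
k = 10
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
```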

19 Our model also applies a factorization technique (albeit a different one) in order to find a reduced semantic space. [sent-34, score-0.322]

20 A large context (e.g. a paragraph or document) yields broad, topical similarity, whereas a small context yields tight, synonym-like similarity. [sent-38, score-0.315]

21 2 Word sense induction The following paragraphs provide a succinct overview of word sense induction research. [sent-43, score-1.019]

22 A thorough survey on word sense disambiguation (including unsupervised induction algorithms) is presented in Navigli (2009). [sent-44, score-0.712]

23 Algorithms for word sense induction can roughly be divided into local and global ones. [sent-45, score-0.578]

24 Local WSI algorithms extract the different senses of a word on a per-word basis, i.e. [sent-46, score-0.473]

25 the different senses for each word are determined separately. [sent-48, score-0.409]

26 In the context-clustering approach, context vectors are created for the different instances of a particular word, and those contexts are grouped into a number of clusters, representing the different senses of the word. [sent-50, score-0.636]

27 The first one to propose this idea of context-group discrimination was Schütze (1998), and many researchers followed a similar approach to sense induction (Purandare and Pedersen, 2004). [sent-54, score-0.528]

28 The senses of a word may then be discovered using graph clustering techniques (Widdows and Dorow, 2002), or algorithms such as HyperLex (Véronis, 2004) or Pagerank (Agirre et al. [sent-56, score-0.569]

29 Finally, Bordag (2006) recently proposed an approach that uses word triplets to perform word sense induction. [sent-58, score-0.402]

30 The underlying idea is the ‘one sense per collocation’ assumption, and co-occurrence triplets are clustered based on the words they have in common. [sent-59, score-0.32]

31 Global algorithms take an approach in which the different senses of a particular word are determined by comparing them to, and demarcating them from, the senses of other words in a full-blown word space model. [sent-60, score-0.934]

32 They present a global clustering algorithm coined clustering by committee (CBC) that automatically discovers word senses from text. [sent-62, score-0.649]

33 This way, less frequent senses of the word may be discovered. [sent-65, score-0.409]

34 Van de Cruys (2008) proposes a model for sense induction based on latent semantic dimensions. [sent-66, score-0.655]

35 Using an extension of non-negative matrix factorization, the model induces a latent semantic space according to which both dependency features and broad contextual features are classified. [sent-67, score-0.489]

36 The model presented below is an extension of this approach: whereas the model described in Van de Cruys (2008) is only able to perform word sense induction, our model is capable of performing both word sense induction and disambiguation. [sent-69, score-0.882]

37 1 Non-negative Matrix Factorization Our model uses non-negative matrix factorization (NMF; Lee and Seung, 2000) in order to find latent dimensions. [sent-71, score-0.528]

38 Secondly, the non-negative nature of the factorization ensures that only additive and no subtractive relations are allowed. [sent-76, score-0.296]

39 Non-negative matrix factorization enforces the constraint that all three matrices must be non-negative, so all elements must be greater than or equal to zero. [sent-80, score-0.566]
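
To make the non-negativity constraint concrete, here is a minimal sketch of the multiplicative update rules of Lee and Seung (2000) for a two-factor NMF V ≈ WH. The dimensions, iteration count, and random initialization are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def nmf(V, k, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (m x n) into non-negative
    W (m x k) and H (k x n), minimizing ||V - W H||^2."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], k))
    H = rng.random((k, V.shape[1]))
    for _ in range(n_iter):
        # Multiplicative updates: all entries stay non-negative, so only
        # additive (never subtractive) combinations of factors are possible.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```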

40 2 Word sense induction Using an extension of non-negative matrix factorization, we are able to jointly induce latent factors for three different modes: words, their window-based (‘bag of words’) context words, and their dependency relations. [sent-86, score-0.786]

41 The three matrices are factorized in an interleaved fashion (i.e. the results of the former factorization are used to initialize the factorization of the next matrix). [sent-91, score-0.49]

42 A graphical representation of the interleaved factorization algorithm is given in figure 1. [sent-92, score-0.329]

43 Matrix W is then copied to matrix V, and the update of matrix G is computed (using equation 2). [sent-96, score-0.525]

44 The transpose of matrix G is again copied to matrix U, and the update of F is computed (again using equation 2). [sent-97, score-0.525]

45 As a last step, matrix F is copied to matrix H, and we restart the iteration loop until a stopping criterion is reached (e.g. convergence or a maximum number of iterations). [sent-98, score-0.441]
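
The loop just described can be sketched as follows, assuming three co-occurrence matrices A (words × dependency relations), B (words × context words), and Ct (context words × dependency relations, i.e. the transpose of the dependency-by-context matrix mentioned in section 4). The exact matrix layouts and update schedule are our reading of the description above, not the paper's code.

```python
import numpy as np

EPS = 1e-9

def upd_right(X, L, R):
    """Multiplicative update of the right factor R in X ~= L R."""
    return R * (L.T @ X) / (L.T @ (L @ R) + EPS)

def upd_left(X, L, R):
    """Multiplicative update of the left factor L in X ~= L R."""
    return L * (X @ R.T) / (L @ (R @ R.T) + EPS)

def interleaved_nmf(A, B, Ct, k, n_iter=100, seed=0):
    """Sketch of the interleaved factorization loop (illustrative only).

    Assumed layouts:
      A  : words x dependency relations,          A  ~= W H
      B  : words x context words,                 B  ~= V G
      Ct : context words x dependency relations,  Ct ~= U F
    Copying W -> V, G^T -> U and F -> H ties the three factorizations
    to the same k latent dimensions.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], k))
    H = rng.random((k, A.shape[1]))
    G = rng.random((k, B.shape[1]))
    F = rng.random((k, Ct.shape[1]))
    for _ in range(n_iter):
        W = upd_left(A, W, H)    # update W for A ~= W H
        V = W                    # copy W to V
        G = upd_right(B, V, G)   # update G for B ~= V G
        U = G.T                  # copy the transpose of G to U
        F = upd_right(Ct, U, F)  # update F for Ct ~= U F
        H = F                    # copy F to H, then restart the loop
    return W, G, H
```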

46 When the factorization is finished, the three different modes (words, window-based context words and dependency relations) are all represented according to a limited number of latent factors. [sent-101, score-0.469]

47 Next, the factorization that is thus created is used for word sense induction. [sent-102, score-0.602]

48 The intuition is that a particular, dominant dimension of an ambiguous word is ‘switched off’, in order to reveal other possible senses of the word. [sent-103, score-0.618]

49 With this knowledge, the dependency relations that are responsible for a certain dimension can be subtracted from the original noun vector. [sent-106, score-0.433]

50 t = v ⊗ (1 − h_k) (4) Equation 4 multiplies each dependency feature of the original noun vector v with a scaling factor, according to the load of the feature on the subtracted dimension (h_k is the vector of matrix H that corresponds to the dimension we want to subtract, and ⊗ denotes element-wise multiplication). [sent-108, score-0.835]

51 The result is an adapted noun vector in which the dependency features relevant to the particular topical dimension have been scaled down. [sent-111, score-0.328]
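
A minimal sketch of the subtraction step in equation 4; the rescaling of h_k so that its entries lie in [0, 1] is our assumption, needed for (1 − h_k) to act as a proper scaling factor.

```python
import numpy as np

def subtract_dimension(v, h_k):
    """Scale down the dependency features of noun vector v according to
    their load h_k on the dimension being 'switched off' (equation 4).
    Assumes h_k has been rescaled to [0, 1]."""
    return v * (1.0 - h_k)

# Hypothetical usage: switch off latent dimension k of matrix H for vector v.
# t = subtract_dimension(v, H[k] / H[k].max())
```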

52 In order to determine which dimension(s) are responsible for a particular sense of the word, the method is embedded in a clustering approach. [sent-112, score-0.427]

53 First, a specific word is assigned to its predominant sense (i.e. the cluster centroid it is most similar to). [sent-113, score-0.32]

54 Next, the dominant semantic dimension(s) for this cluster are subtracted from the word vector, and the resulting vector is fed to the clustering algorithm again, to see if other word senses emerge. [sent-116, score-0.935]

55 The dominant semantic dimension(s) can be identified by folding vector c representing the cluster centroid into the factorization (equation 5). [sent-117, score-0.675]

56 The centroid of the cluster is computed by averaging the frequencies of all cluster elements except for the target word we want to reassign. [sent-123, score-0.443]

57 After subtracting the salient dimensions from the noun vector, we check whether the vector is reassigned to another cluster centroid. [sent-124, score-0.435]

58 The target element is removed from the centroid to make sure that only the dimensions associated with the sense of the cluster are subtracted. [sent-127, score-0.707]
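
The loop just described might be sketched schematically as follows; the cosine assignment, the single-dimension subtraction per iteration, the rescaling of H's rows, and the stopping condition are all our assumptions rather than the paper's exact procedure (in particular, the removal of the target word from the centroid is omitted for brevity).

```python
import numpy as np

def cosine(a, b, eps=1e-12):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)

def induce_senses(v, centroids, H, max_senses=5):
    """Schematic induction loop: assign the noun vector v to its most
    similar centroid, fold that centroid into the latent space (cf.
    equation 5), subtract the dominant dimension (equation 4), and
    repeat until no new sense emerges."""
    senses = []
    while len(senses) < max_senses:
        best = max(range(len(centroids)), key=lambda i: cosine(v, centroids[i]))
        if best in senses:
            break                       # vector was not reassigned: stop
        senses.append(best)
        b = centroids[best] @ H.T       # fold centroid into the latent space
        dim = int(np.argmax(b))         # dominant latent dimension
        # Scale down features loading on that dimension (row rescaled to [0, 1]).
        v = v * (1.0 - H[dim] / (H[dim].max() + 1e-12))
    return senses
```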

59 The first method, NMFcon, takes a conservative approach, and only selects candidate senses if, after the subtraction of salient dimensions, another sense is found that is more similar to the adapted noun vector than the original sense (we use the cosine measure for our similarity calculations). [sent-130, score-1.065]

60 The second method, NMFlib, is more liberal, and also selects the next best cluster centroid as candidate sense until a certain similarity threshold φ is reached. [sent-132, score-0.582]

61 3 Word sense disambiguation The sense inventory that results from the induction step can now be used for the disambiguation of individual instances as follows. [sent-134, score-1.247]

62 Using matrix G from our factorization model (which represents context words by semantic dimensions), this vector can be folded into the semantic space, thus representing a probability vector over latent factors for the particular instance of the target noun (equation 6). [sent-138, score-1.188]

63 d = f G^T (6) Likewise, the candidate senses of the noun (represented as centroids) can be folded into our semantic space using matrix H (equation 5). [sent-139, score-0.88]

64 As a last step, we compute the Kullback-Leibler divergence between the context vector and the candidate centroids, and select the candidate centroid that yields the lowest divergence as the correct sense. [sent-141, score-0.506]
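
Combining equations 5 and 6, the disambiguation step can be sketched as follows. The normalization of folded vectors into probability distributions and the matrix layouts (G: k × context words, H: k × dependency relations) are assumptions carried over from the sketches above.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two probability vectors over latent dimensions."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def disambiguate(f, centroids, G, H):
    """Select the candidate sense whose folded centroid is closest
    (in KL divergence) to the folded context vector of the instance.

    f         : context-word frequency vector of the target instance
    centroids : candidate-sense centroids over dependency features
    """
    d = f @ G.T                                # equation 6: d = f G^T
    senses = [c @ H.T for c in centroids]      # equation 5: fold the centroids
    divs = [kl_divergence(d, s) for s in senses]
    return int(np.argmin(divs))                # lowest divergence wins
```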

65 The sense induction algorithm finds the following candidate senses. [sent-145, score-0.54]

66 Note that we do not use the word sense to hint at a lexicographic meaning distinction; rather, sense in this case should be regarded as a more coarse-grained and topic-related entity. [sent-149, score-0.599]

67 Figure 2: Graphical representation of the disambiguation process. Each candidate sense is associated with a centroid (the average frequency vector of the cluster’s members), which is folded into the semantic space, yielding a ‘semantic fingerprint’, i.e. a distribution over the semantic dimensions. [sent-150, score-0.857]

68 Likewise, for the second and the third sense the ‘food’ dimension and the ‘manufacturing’ dimension will be the most important. [sent-154, score-0.545]

69 Looking at the context of the particular instance of chip, a context vector is created which represents the semantic content words that appear in the same paragraph (the extracted content words are printed in boldface). [sent-161, score-0.412]

70 This context vector is again folded into the semantic space, yielding a distribution over the semantic dimensions. [sent-162, score-0.36]

71 In the majority of cases, the induced dimensions indeed contain such clear-cut semantics, so that the dimensions can be rightfully labeled as above. [sent-163, score-0.416]

72 By selecting the lowest Kullback-Leibler divergence between the semantic probability distribution of the target instance and the semantic probability distributions of the candidate senses, the algorithm is able to assign the ‘computer’ sense of the target noun chip. [sent-164, score-0.817]

73 1 Dataset Our word sense induction and disambiguation model is trained and tested on the dataset of the SEMEVAL-2010 WSI/WSD task (Manandhar et al., 2010). [sent-166, score-0.675]

74 For each target word, a training set is provided from which the senses of the word have to be induced without using any other resources. [sent-169, score-0.532]

75 The training set for a target word consists of a set of target word instances in context (sentences or paragraphs). [sent-170, score-0.312]

76 The senses induced during training are used for disambiguation in the testing phase. [sent-173, score-0.591]

77 The instances in the test set are tagged with OntoNotes senses (Hovy et al., 2006). [sent-176, score-0.458]

78 The system needs to disambiguate these instances using the senses acquired during training. [sent-178, score-0.458]

79 The SEMEVAL test set has only been tagged and lemmatized, as our disambiguation model does not use dependency triples as features (contrary to the induction model). [sent-183, score-0.44]

80 For each model, the matrices needed for our interleaved NMF factorization are extracted from the corpus. [sent-187, score-0.456]

81 The noun model was built using 5K nouns, 80K dependency relations, and 2K context words (excluding stop words) with highest frequency in the training set, which yields matrices of 5K nouns × 80K dependency relations, 5K nouns × 2K context words, and 80K dependency relations × 2K context words. [sent-188, score-0.785]
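
For concreteness, the three co-occurrence matrices of the noun model would have roughly the following shapes; this is a hypothetical skeleton using the counts from the text and the layout assumed in the interleaved sketch above.

```python
import numpy as np

n_nouns, n_deps, n_ctx = 5000, 80000, 2000   # 5K / 80K / 2K from the text

A = np.zeros((n_nouns, n_deps))   # nouns x dependency relations
B = np.zeros((n_nouns, n_ctx))    # nouns x context words
C = np.zeros((n_deps, n_ctx))     # dependency relations x context words
```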

82 The mapping between clusters and gold standard senses is used to tag the evaluation corpus with gold standard tags. [sent-196, score-0.632]

83 In the unsupervised evaluation, the induced senses are evaluated as clusters of instances which are compared to the sets of instances tagged with the gold standard senses (corresponding to classes). [sent-198, score-1.218]

84 Homogeneity refers to the degree that each cluster consists of data points primarily belonging to a single gold standard class, while completeness refers to the degree that each gold standard class consists of data points primarily assigned to a single cluster. [sent-205, score-0.301]
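
Both measures (and the V-Measure that combines them as a harmonic mean) can be computed with scikit-learn's homogeneity_completeness_v_measure; the toy labels below are purely illustrative.

```python
from sklearn.metrics import homogeneity_completeness_v_measure

gold = [0, 0, 1, 1, 2, 2]      # gold standard sense classes (toy data)
induced = [0, 0, 1, 1, 1, 2]   # induced cluster labels (toy data)

h, c, v = homogeneity_completeness_v_measure(gold, induced)
print(f"homogeneity={h:.3f} completeness={c:.3f} V-measure={v:.3f}")
```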

85 The most frequent sense (MFS) baseline groups all testing instances of a target word into one cluster. [sent-219, score-0.455]

86 The number of clusters in Random was chosen to be roughly equal to the average number of senses in the gold standard. [sent-222, score-0.555]

87 NMFcon, our model that takes a conservative approach in the induction of candidate senses, does not beat the random baseline. [sent-230, score-0.661]

88 NMFlib, our model that is more liberal in inducing senses, reaches better results. [sent-231, score-0.452]

89 This means that the random systems used for testing were ranked low when a high number of random senses was used. [sent-273, score-0.432]

90 The paired F-Score penalizes systems when they produce a higher number of clusters (low recall) or a lower number of clusters (low precision) than the gold standard number of senses. [sent-275, score-0.354]
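
A minimal sketch of this pairwise computation, assuming simple lists of gold and induced labels as input: with many clusters the system produces few same-cluster pairs (recall drops), with few clusters it produces many (precision drops), matching the description above.

```python
from itertools import combinations

def paired_fscore(gold, induced):
    """Paired F-Score over instance pairs (sketch, toy label-list inputs).

    A pair of instances counts as 'positive' under a labeling if that
    labeling puts the two instances in the same cluster/class."""
    def same_label_pairs(labels):
        return {(i, j) for i, j in combinations(range(len(labels)), 2)
                if labels[i] == labels[j]}

    sys_pairs = same_label_pairs(induced)
    gold_pairs = same_label_pairs(gold)
    if not sys_pairs or not gold_pairs:
        return 0.0
    overlap = len(sys_pairs & gold_pairs)
    p = overlap / len(sys_pairs)    # low when the system under-clusters
    r = overlap / len(gold_pairs)   # low when the system over-clusters
    return 2 * p * r / (p + r) if (p + r) else 0.0
```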

91 2 Supervised evaluation In the supervised evaluation, the automatically induced clusters are mapped to gold standard senses, using the mapping corpus (i.e. the portion of the test data used to derive the mapping). [sent-308, score-0.308]

92 According to Pedersen (2010), the supervised learning algorithm that underlies this evaluation method tends to converge to the Most Frequent Sense (MFS) baseline, because the number of senses that the classifier assigns to the test instances is rather low. [sent-344, score-0.501]

93 5 Conclusion and future work In this paper, we presented a model based on latent semantics that is able to perform word sense induction as well as disambiguation. [sent-347, score-0.651]

94 Using latent topical dimensions, the model is able to discriminate between different senses of a word, and subsequently disambiguate particular instances of a word. [sent-348, score-0.774]

95 The evaluation results indicate that our model reaches state-of-the-art performance compared to other systems that participated in the SEMEVAL-2010 word sense induction and disambiguation task. [sent-349, score-0.725]

96 The evaluation set contains an enormous number of contexts for only a small number of target words, favouring methods that induce senses on a per-word basis. [sent-351, score-0.509]

97 A global approach like ours is likely to induce a more balanced sense inventory using an unbiased corpus, and is likely to outperform local methods when such an unbiased corpus is used as input. [sent-352, score-0.545]

98 We therefore think that the global, unified approach to word sense induction and disambiguation presented here provides a genuine and powerful solution to the problem at hand. [sent-353, score-0.675]

99 For now, the disambiguation step only uses a word’s context words; enriching the feature set with dependency information is likely to improve the performance of the disambiguation. [sent-357, score-0.314]

100 SemEval-2007 Task 02: Evaluating word sense induction and discrimination systems. [sent-361, score-0.569]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('senses', 0.368), ('sense', 0.279), ('factorization', 0.245), ('induction', 0.21), ('matrix', 0.194), ('nmf', 0.185), ('dimensions', 0.169), ('nmfcon', 0.162), ('nmflib', 0.162), ('disambiguation', 0.145), ('topical', 0.143), ('mfs', 0.139), ('dimension', 0.133), ('matrices', 0.127), ('clusters', 0.11), ('centroid', 0.108), ('cluster', 0.106), ('semeval', 0.104), ('noun', 0.097), ('clustering', 0.096), ('pedersen', 0.093), ('folded', 0.093), ('instances', 0.09), ('latent', 0.089), ('dependency', 0.085), ('interleaved', 0.084), ('induced', 0.078), ('gold', 0.077), ('semantic', 0.077), ('dominant', 0.076), ('wsi', 0.075), ('wsd', 0.074), ('divergence', 0.071), ('svd', 0.07), ('apidianaki', 0.069), ('cruys', 0.069), ('subtracted', 0.067), ('agirre', 0.066), ('inventory', 0.065), ('algorithms', 0.064), ('vector', 0.063), ('artiles', 0.061), ('nouns', 0.057), ('induce', 0.057), ('paired', 0.057), ('homogeneity', 0.056), ('manandhar', 0.056), ('landauer', 0.055), ('copied', 0.053), ('particular', 0.052), ('equation', 0.051), ('relations', 0.051), ('candidate', 0.051), ('reaches', 0.05), ('context', 0.05), ('global', 0.048), ('unbiased', 0.048), ('factorizations', 0.046), ('hyperlex', 0.046), ('marianna', 0.046), ('meval', 0.046), ('uoy', 0.046), ('widdows', 0.046), ('target', 0.045), ('eneko', 0.045), ('broad', 0.044), ('instance', 0.043), ('supervised', 0.043), ('toutanova', 0.042), ('word', 0.041), ('yields', 0.041), ('tight', 0.041), ('completeness', 0.041), ('paris', 0.041), ('chip', 0.041), ('triplets', 0.041), ('testset', 0.041), ('paragraph', 0.04), ('contexts', 0.039), ('discrimination', 0.039), ('similarity', 0.038), ('disambiguated', 0.038), ('van', 0.038), ('rosenberg', 0.038), ('centroids', 0.038), ('created', 0.037), ('unsupervised', 0.037), ('frequencies', 0.037), ('purandare', 0.035), ('aitor', 0.035), ('distributional', 0.034), ('step', 0.034), ('metrics', 0.034), ('hk', 0.034), ('liberal', 0.034), ('update', 0.033), ('vein', 0.032), ('able', 0.032), ('random', 0.032), ('window', 0.031)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999893 198 acl-2011-Latent Semantic Word Sense Induction and Disambiguation

Author: Tim Van de Cruys ; Marianna Apidianaki

Abstract: In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. The intuition is that a particular sense is associated with a particular topic, so that different senses can be discriminated through their association with particular topical dimensions; in a similar vein, a particular instance of a word can be disambiguated by determining its most important topical dimensions. The model is evaluated on the SEMEVAL-2010 word sense induction and disambiguation task, on which it reaches state-of-the-art results.

2 0.31637478 158 acl-2011-Identification of Domain-Specific Senses in a Machine-Readable Dictionary

Author: Fumiyo Fukumoto ; Yoshimi Suzuki

Abstract: This paper focuses on domain-specific senses and presents a method for assigning category/domain label to each sense of words in a dictionary. The method first identifies each sense of a word in the dictionary to its corresponding category. We used a text classification technique to select appropriate senses for each domain. Then, senses were scored by computing the rank scores. We used Markov Random Walk (MRW) model. The method was tested on English and Japanese resources, WordNet 3.0 and EDR Japanese dictionary. For evaluation of the method, we compared English results with the Subject Field Codes (SFC) resources. We also compared each English and Japanese results to the first sense heuristics in the WSD task. These results suggest that identification of domain-specific senses (IDSS) may actually be of benefit.

3 0.31237796 307 acl-2011-Towards Tracking Semantic Change by Visual Analytics

Author: Christian Rohrdantz ; Annette Hautli ; Thomas Mayer ; Miriam Butt ; Daniel A. Keim ; Frans Plank

Abstract: This paper presents a new approach to detecting and tracking changes in word meaning by visually modeling and representing diachronic development in word contexts. Previous studies have shown that computational models are capable of clustering and disambiguating senses, a more recent trend investigates whether changes in word meaning can be tracked by automatic methods. The aim of our study is to offer a new instrument for investigating the diachronic development of word senses in a way that allows for a better understanding of the nature of semantic change in general. For this purpose we combine techniques from the field of Visual Analytics with unsupervised methods from Natural Language Processing, allowing for an interactive visual exploration of semantic change.

4 0.22120477 240 acl-2011-ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation

Author: Els Lefever ; Veronique Hoste ; Martine De Cock

Abstract: This paper describes a set of exploratory experiments for a multilingual classificationbased approach to Word Sense Disambiguation. Instead of using a predefined monolingual sense-inventory such as WordNet, we use a language-independent framework where the word senses are derived automatically from word alignments on a parallel corpus. We built five classifiers with English as an input language and translations in the five supported languages (viz. French, Dutch, Italian, Spanish and German) as classification output. The feature vectors incorporate both the more traditional local context features, as well as binary bag-of-words features that are extracted from the aligned translations. Our results show that the ParaSense multilingual WSD system shows very competitive results compared to the best systems that were evaluated on the SemEval-2010 Cross-Lingual Word Sense Disambiguation task for all five target languages.

5 0.21363546 224 acl-2011-Models and Training for Unsupervised Preposition Sense Disambiguation

Author: Dirk Hovy ; Ashish Vaswani ; Stephen Tratz ; David Chiang ; Eduard Hovy

Abstract: We present a preliminary study on unsupervised preposition sense disambiguation (PSD), comparing different models and training techniques (EM, MAP-EM with L0 norm, Bayesian inference using Gibbs sampling). To our knowledge, this is the first attempt at unsupervised preposition sense disambiguation. Our best accuracy reaches 56%, a significant improvement (at p <.001) of 16% over the most-frequent-sense baseline.

6 0.21030551 167 acl-2011-Improving Dependency Parsing with Semantic Classes

7 0.20423689 334 acl-2011-Which Noun Phrases Denote Which Concepts?

8 0.17399131 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

9 0.14595191 96 acl-2011-Disambiguating temporal-contrastive connectives for machine translation

10 0.10879067 145 acl-2011-Good Seed Makes a Good Crop: Accelerating Active Learning Using Language Modeling

11 0.10432235 39 acl-2011-An Ensemble Model that Combines Syntactic and Semantic Clustering for Discriminative Dependency Parsing

12 0.10296988 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

13 0.10149345 148 acl-2011-HITS-based Seed Selection and Stop List Construction for Bootstrapping

14 0.098638356 2 acl-2011-AM-FM: A Semantic Framework for Translation Quality Assessment

15 0.096216403 29 acl-2011-A Word-Class Approach to Labeling PSCFG Rules for Machine Translation

16 0.090387642 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

17 0.089165531 204 acl-2011-Learning Word Vectors for Sentiment Analysis

18 0.088304609 304 acl-2011-Together We Can: Bilingual Bootstrapping for WSD

19 0.086817652 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning

20 0.086786143 137 acl-2011-Fine-Grained Class Label Markup of Search Queries


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.238), (1, 0.061), (2, -0.113), (3, -0.039), (4, 0.014), (5, -0.057), (6, 0.155), (7, 0.109), (8, -0.053), (9, -0.015), (10, 0.061), (11, -0.276), (12, 0.26), (13, 0.037), (14, -0.035), (15, -0.212), (16, 0.139), (17, 0.207), (18, -0.062), (19, 0.17), (20, 0.007), (21, -0.098), (22, -0.002), (23, 0.001), (24, -0.037), (25, 0.013), (26, 0.045), (27, -0.073), (28, -0.047), (29, 0.011), (30, -0.089), (31, 0.05), (32, -0.044), (33, 0.097), (34, -0.045), (35, -0.042), (36, -0.011), (37, 0.002), (38, -0.024), (39, -0.054), (40, -0.012), (41, -0.015), (42, 0.029), (43, 0.068), (44, 0.053), (45, 0.003), (46, -0.005), (47, 0.022), (48, 0.041), (49, 0.006)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95744747 198 acl-2011-Latent Semantic Word Sense Induction and Disambiguation

Author: Tim Van de Cruys ; Marianna Apidianaki

Abstract: In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. The intuition is that a particular sense is associated with a particular topic, so that different senses can be discriminated through their association with particular topical dimensions; in a similar vein, a particular instance of a word can be disambiguated by determining its most important topical dimensions. The model is evaluated on the SEMEVAL-2010 word sense induction and disambiguation task, on which it reaches state-of-the-art results.

2 0.93593252 307 acl-2011-Towards Tracking Semantic Change by Visual Analytics

Author: Christian Rohrdantz ; Annette Hautli ; Thomas Mayer ; Miriam Butt ; Daniel A. Keim ; Frans Plank

Abstract: This paper presents a new approach to detecting and tracking changes in word meaning by visually modeling and representing diachronic development in word contexts. Previous studies have shown that computational models are capable of clustering and disambiguating senses, a more recent trend investigates whether changes in word meaning can be tracked by automatic methods. The aim of our study is to offer a new instrument for investigating the diachronic development of word senses in a way that allows for a better understanding of the nature of semantic change in general. For this purpose we combine techniques from the field of Visual Analytics with unsupervised methods from Natural Language Processing, allowing for an interactive visual exploration of semantic change.

3 0.9210251 158 acl-2011-Identification of Domain-Specific Senses in a Machine-Readable Dictionary

Author: Fumiyo Fukumoto ; Yoshimi Suzuki

Abstract: This paper focuses on domain-specific senses and presents a method for assigning category/domain label to each sense of words in a dictionary. The method first identifies each sense of a word in the dictionary to its corresponding category. We used a text classification technique to select appropriate senses for each domain. Then, senses were scored by computing the rank scores. We used Markov Random Walk (MRW) model. The method was tested on English and Japanese resources, WordNet 3.0 and EDR Japanese dictionary. For evaluation of the method, we compared English results with the Subject Field Codes (SFC) resources. We also compared each English and Japanese results to the first sense heuristics in the WSD task. These results suggest that identification of domain-specific senses (IDSS) may actually be of benefit.

4 0.83067328 334 acl-2011-Which Noun Phrases Denote Which Concepts?

Author: Jayant Krishnamurthy ; Tom Mitchell

Abstract: Resolving polysemy and synonymy is required for high-quality information extraction. We present ConceptResolver, a component for the Never-Ending Language Learner (NELL) (Carlson et al., 2010) that handles both phenomena by identifying the latent concepts that noun phrases refer to. ConceptResolver performs both word sense induction and synonym resolution on relations extracted from text using an ontology and a small amount of labeled data. Domain knowledge (the ontology) guides concept creation by defining a set of possible semantic types for concepts. Word sense induction is performed by inferring a set of semantic types for each noun phrase. Synonym detection exploits redundant information to train several domain-specific synonym classifiers in a semi-supervised fashion. When ConceptResolver is run on NELL’s knowledge base, 87% of the word senses it creates correspond to real-world concepts, and 85% of noun phrases that it suggests refer to the same concept are indeed synonyms.

5 0.72156835 224 acl-2011-Models and Training for Unsupervised Preposition Sense Disambiguation

Author: Dirk Hovy ; Ashish Vaswani ; Stephen Tratz ; David Chiang ; Eduard Hovy

Abstract: We present a preliminary study on unsupervised preposition sense disambiguation (PSD), comparing different models and training techniques (EM, MAP-EM with L0 norm, Bayesian inference using Gibbs sampling). To our knowledge, this is the first attempt at unsupervised preposition sense disambiguation. Our best accuracy reaches 56%, a significant improvement (at p <.001) of 16% over the most-frequent-sense baseline.

6 0.66855806 96 acl-2011-Disambiguating temporal-contrastive connectives for machine translation

7 0.6307736 240 acl-2011-ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation

8 0.61969042 167 acl-2011-Improving Dependency Parsing with Semantic Classes

9 0.5258674 319 acl-2011-Unsupervised Decomposition of a Document into Authorial Components

10 0.48816568 341 acl-2011-Word Maturity: Computational Modeling of Word Knowledge

11 0.48110735 304 acl-2011-Together We Can: Bilingual Bootstrapping for WSD

12 0.4698742 148 acl-2011-HITS-based Seed Selection and Stop List Construction for Bootstrapping

13 0.46114555 222 acl-2011-Model-Portability Experiments for Textual Temporal Analysis

14 0.45550886 120 acl-2011-Even the Abstract have Color: Consensus in Word-Colour Associations

15 0.43065459 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

16 0.38381031 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

17 0.38341162 145 acl-2011-Good Seed Makes a Good Crop: Accelerating Active Learning Using Language Modeling

18 0.37056547 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing

19 0.36469078 229 acl-2011-NULEX: An Open-License Broad Coverage Lexicon

20 0.36255518 288 acl-2011-Subjective Natural Language Problems: Motivations, Applications, Characterizations, and Implications


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.026), (17, 0.045), (26, 0.024), (37, 0.133), (39, 0.044), (41, 0.059), (55, 0.033), (59, 0.095), (72, 0.029), (79, 0.168), (91, 0.045), (96, 0.173), (97, 0.03)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.86798954 198 acl-2011-Latent Semantic Word Sense Induction and Disambiguation

Author: Tim Van de Cruys ; Marianna Apidianaki

Abstract: In this paper, we present a unified model for the automatic induction of word senses from text, and the subsequent disambiguation of particular word instances using the automatically extracted sense inventory. The induction step and the disambiguation step are based on the same principle: words and contexts are mapped to a limited number of topical dimensions in a latent semantic word space. The intuition is that a particular sense is associated with a particular topic, so that different senses can be discriminated through their association with particular topical dimensions; in a similar vein, a particular instance of a word can be disambiguated by determining its most important topical dimensions. The model is evaluated on the SEMEVAL-2010 word sense induction and disambiguation task, on which it reaches state-of-the-art results.

2 0.82856941 116 acl-2011-Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers

Author: Deyi Xiong ; Min Zhang ; Haizhou Li

Abstract: In this paper, with a belief that a language model that embraces a larger context provides better prediction ability, we present two extensions to standard n-gram language models in statistical machine translation: a backward language model that augments the conventional forward language model, and a mutual information trigger model which captures long-distance dependencies that go beyond the scope of standard n-gram language models. We integrate the two proposed models into phrase-based statistical machine translation and conduct experiments on large-scale training data to investigate their effectiveness. Our experimental results show that both models are able to significantly improve translation quality and collectively achieve up to 1 BLEU point over a competitive baseline.

3 0.82173312 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

Author: Joel Lang ; Mirella Lapata

Abstract: In this paper we describe an unsupervised method for semantic role induction which holds promise for relieving the data acquisition bottleneck associated with supervised role labelers. We present an algorithm that iteratively splits and merges clusters representing semantic roles, thereby leading from an initial clustering to a final clustering of better quality. The method is simple, surprisingly effective, and allows to integrate linguistic knowledge transparently. By combining role induction with a rule-based component for argument identification we obtain an unsupervised end-to-end semantic role labeling system. Evaluation on the CoNLL 2008 benchmark dataset demonstrates that our method outperforms competitive unsupervised approaches by a wide margin.

4 0.82023674 170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization

Author: Harr Chen ; Edward Benson ; Tahira Naseem ; Regina Barzilay

Abstract: We present a novel approach to discovering relations and their instantiations from a collection of documents in a single domain. Our approach learns relation types by exploiting meta-constraints that characterize the general qualities of a good relation in any domain. These constraints state that instances of a single relation should exhibit regularities at multiple levels of linguistic structure, including lexicography, syntax, and document-level context. We capture these regularities via the structure of our probabilistic model as well as a set of declaratively-specified constraints enforced during posterior inference. Across two domains our approach successfully recovers hidden relation structure, comparable to or outperforming previous state-of-the-art approaches. Furthermore, we find that a small set of constraints is applicable across the domains, and that using domain-specific constraints can further improve performance.

5 0.81962812 164 acl-2011-Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features

Author: Yuval Marton ; Nizar Habash ; Owen Rambow

Abstract: We explore the contribution of morphological features, both lexical and inflectional, to dependency parsing of Arabic, a morphologically rich language. Using controlled experiments, we find that definiteness, person, number, gender, and the undiacritized lemma are most helpful for parsing on automatically tagged input. We further contrast the contribution of form-based and functional features, and show that functional gender and number (e.g., “broken plurals”) and the related rationality feature improve over form-based features. It is the first time functional morphological features are used for Arabic NLP.

6 0.81265593 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

7 0.81088102 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

8 0.80957371 190 acl-2011-Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations

9 0.80562353 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

10 0.80418903 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

11 0.80372995 222 acl-2011-Model-Portability Experiments for Textual Temporal Analysis

12 0.80143189 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

13 0.80016237 167 acl-2011-Improving Dependency Parsing with Semantic Classes

14 0.79961276 85 acl-2011-Coreference Resolution with World Knowledge

15 0.79915887 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates

16 0.79890454 311 acl-2011-Translationese and Its Dialects

17 0.79830915 44 acl-2011-An exponential translation model for target language morphology

18 0.79804885 269 acl-2011-Scaling up Automatic Cross-Lingual Semantic Role Annotation

19 0.79754317 158 acl-2011-Identification of Domain-Specific Senses in a Machine-Readable Dictionary

20 0.79735506 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters