acl acl2013 acl2013-111 knowledge-graph by maker-knowledge-mining

111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD


Source: pdf

Author: Koichi Tanigaki ; Mitsuteru Shiba ; Tatsuji Munaka ; Yoshinori Sagisaka

Abstract: This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with the data sparsity resulting from a large variety of linguistic contexts and senses, as well as to exploit sense interdependency among the words in the same text string. Through the smoothing, all the optimal senses are obtained at one time under a maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words and to suppress conflicting sense hypotheses within the same word. Experimental results confirmed the superiority of the proposed method over conventional ones, showing performance beyond the most-frequent-sense baseline, which none of the SemEval-2 unsupervised systems reached.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with the data sparsity resulting from a large variety of linguistic contexts and senses, as well as to exploit sense interdependency among the words in the same text string. [sent-2, score-0.412]

2 Through the smoothing, all the optimal senses are obtained at one time under a maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words and to suppress conflicting sense hypotheses within the same word. [sent-3, score-0.96]

3 1 Introduction Word Sense Disambiguation (WSD) is a task to identify the intended sense of a word based on its context. [sent-5, score-0.351]

4 They compute dictionary-based sense similarity to find the most related senses among the words within a certain range of text. [sent-12, score-0.501]

5 However, those methods mainly focus on modeling sense distribution and pay less attention to contextual smoothing/generalization beyond the immediate context. [sent-17, score-0.372]

6 (2004) proposed a method to combine sense similarity with distributional similarity and configured a predominant sense score. [sent-20, score-0.87]

7 (2009) used the k-nearest words by distributional similarity as context words. [sent-24, score-0.246]

8 They applied an LKB graph-based WSD to a target word together with the distributional context words, and showed that it yields better results on a domain dataset than just using immediate context words. [sent-25, score-0.423]

9 Though those studies are word-by-word WSD for target words, they demonstrated the effectiveness of enriching the immediate context with corpus statistics. [sent-28, score-0.195]

10 This paper proposes a smoothing model that integrates dictionary-based semantic similarity and corpus-based context statistics, where a combinatorial optimization scheme is employed to deal with sense interdependency of the all-words WSD task. [sent-29, score-0.778]

11 A specific implementation of these metrics is described later in this paper; for now, the context metric is generalized with a distance function dx(·, ·) and the sense metric with ds(·, ·). [sent-40, score-0.604]

12 For each xi, the intended sense of the word is to be found in a set of sense candidates Si = {si1, . . . , siMi} ⊆ S, where Mi is the number of sense candidates for the i-th word, and S is the whole set of sense inventories in a dictionary. [sent-45, score-1.278]

13 Let the two-tuple hij = (xi, sij) be the hypothesis that the intended sense in xi is sij. [sent-46, score-0.88]

14 As (X, dx) and (S, ds) each composes a metric space, H = X × S is also a metric space, provided a proper distance definition with dx and ds. [sent-48, score-0.197]

15 Here, we treat the space H as a continuous one, which means that we assume the relationship between context and sense can be generalized in continuous fashion. [sent-49, score-0.435]

16 According to the nature of continuity, once given a hypothesis hij for a certain word, we can extrapolate the hypothesis for another word of another sense hi′j′ = (xi′ , si′j′) sufficiently close to hij. [sent-53, score-0.931]

17 They control the smoothing intensity in context and in sense, respectively. [sent-55, score-0.282]

18 Our objective is to determine the optimal sense for all the target words simultaneously. [sent-56, score-0.402]

19 We relax the integer constraints by introducing a sense probability parameter πij corresponding to each hij. [sent-58, score-0.354]

20 πij denotes the probability with which hij is true. [sent-59, score-0.486]

21 The probability density extrapolated at hi′j′ by a probabilistic hypothesis hij is given as follows: Qij(hi′j′) ∝ πij K(hij, hi′j′). [sent-61, score-0.694]
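The summary leaves the kernel K unspecified; given the bandwidths σx and σs introduced in sentence 17 and the 'gaussian' and 'parzen' entries in this page's keyword list, a separable Gaussian (Parzen-window) kernel over the context-sense product space is a natural reading. A minimal sketch under that assumption — the function names and the exact Gaussian form are illustrative, not taken from the paper:

```python
import numpy as np

def kernel(d_x, d_s, sigma_x, sigma_s):
    """Separable Gaussian kernel over the context-sense product space.

    d_x = dx(xi, xi') and d_s = ds(sij, si'j') are the context and sense
    distances; sigma_x and sigma_s are the bandwidths of sentence 17.
    The Gaussian form is an assumption, not stated in this summary.
    """
    return (np.exp(-d_x ** 2 / (2.0 * sigma_x ** 2))
            * np.exp(-d_s ** 2 / (2.0 * sigma_s ** 2)))

def extrapolated_density(pi_ij, d_x, d_s, sigma_x, sigma_s):
    """Q_ij(h_i'j') proportional to pi_ij * K(h_ij, h_i'j') (sentence 21)."""
    return pi_ij * kernel(d_x, d_s, sigma_x, sigma_s)
```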

22 Due to the limitation of drawing, both the context metric space and the sense metric space are drawn schematically as 1-dimensional spaces (axes); in fact, arbitrary metric spaces, similarity-based or feature-based, are applicable. [sent-63, score-0.713]

23 The product metric space of the context metric space and the sense metric space composes a hypothesis space. [sent-64, score-0.817]

24 In the hypothesis space, the n sense hypotheses for a certain word are represented as n points on the hyperplane that spreads across the sense metric space. [sent-65, score-0.9]

25 The two small circles in the middle of the figure represent the two sense hypotheses for a single word. [sent-66, score-0.445]

26 The position of a hypothesis represents which sense is assigned to the current word in "Invasive, exotic plants cause particular problems for wildlife.

27 In accordance with geometric intuition, the intensity of extrapolation is affected by the distance from a hypothesis, and by the probability of the hypothesis itself. [sent-73, score-0.277]

28 Extrapolated probability density is represented by shadow thickness and surface height. [sent-74, score-0.259]

29 If there is another word in nearby context, the kernels can validate the sense of that word. [sent-75, score-0.505]

30 In the figure, there are two kernels in the context “Invasive, exotic . [sent-76, score-0.229]

31 They are two competing hypotheses for the senses decoy and flora of the word plants. [sent-80, score-0.201]

32 These kernels affect the senses of another ambiguous word tree in nearby context “Exotic . [sent-81, score-0.419]

33 ”, and extrapolate the most at the sense tree nearby flora. [sent-84, score-0.486]

34 It has little effect on words far away in context or in sense, as is the case for the background word in the figure. [sent-86, score-0.397]

35 Wider bandwidths bring a stronger generalization effect to farther hypotheses, but too-wide bandwidths smooth out detailed structure. [sent-88, score-0.43]

36 The bandwidths are the key to disambiguation; therefore they are to be optimized on a dataset together with the sense probabilities. [sent-89, score-0.563]

37 3 Simultaneous Optimization of All-words WSD Given the smoothing model to extrapolate the senses of other words, we now make its instances interact to obtain the optimal combination of senses for all the words. [sent-90, score-0.527]

38 The parameters consist of a context bandwidth σx, a sense bandwidth σs, and sense probabilities πij for all i and j. [sent-93, score-0.874]

39 For convenience of description, the sense probabilities are collectively denoted as a vector π = (. [sent-94, score-0.347]

40 We consider all the mappings from context to sense to be latent, and find the optimal parameters by maximizing a marginal pseudo-likelihood based on probability density. [sent-102, score-0.592]

41 The likelihood is defined as follows: L(π, σx, σs; X) ≡ ln ∏i ∑j πij Q(hij), (3) where ∏i denotes the product over xi ∈ X, and ∑j denotes the summation over all possible senses sij ∈ Si for the current i-th context. [sent-103, score-0.266]

42 We take as the unit of LOOCV not a word instance but a word type, because the instances of the same word type invariably have the same sense candidates, which still causes over-fitting when optimizing the sense bandwidth. [sent-107, score-0.618]

43 When we optimize the parameters, the first term on the right-hand side of Equation (6) acts to reinforce nearby hypotheses among different words, whereas the second term acts to suppress conflicting hypotheses of the same word. [sent-113, score-0.48]

44 Qi′j′(hij) denotes the probability density at hij extrapolated by hi′j′ alone, defined as follows in Equation (10): Qi′j′(hij) ≡ (N − Ni)−1 πi′j′ K(hij, hi′j′). [sent-116, score-0.727]
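Sentences 41, 42, and 44 together determine the objective up to the kernel: L = ∑i ln ∑j πij Q(hij), where Q(hij) sums the leave-one-word-type-out contributions Qi′j′(hij). A sketch of that computation, reusing the kernel sketch above; the (N − Ni)−1 normalization follows the reconstruction of sentence 44 and should be read as an assumption:

```python
import math

def pseudo_log_likelihood(pi, d_x, d_s, sigma_x, sigma_s, word_type):
    """L(pi, sigma_x, sigma_s; X) = sum_i ln sum_j pi[i][j] * Q(h_ij).

    pi:        pi[i][j] = current probability of hypothesis h_ij
    d_x:       d_x[i][i2] = context distance between instances i and i2
    d_s:       d_s[(i, j)][(i2, j2)] = sense distance between candidates
    word_type: word_type[i] = word type of instance i (the LOOCV unit)
    """
    n = len(pi)
    total = 0.0
    for i in range(n):
        # leave-one-type-out: instances of the same word type are excluded
        others = [i2 for i2 in range(n) if word_type[i2] != word_type[i]]
        norm = 1.0 / max(len(others), 1)  # assumed (N - N_i)^-1 factor
        inner = 0.0
        for j in range(len(pi[i])):
            q = 0.0
            for i2 in others:
                for j2 in range(len(pi[i2])):
                    q += norm * pi[i2][j2] * kernel(
                        d_x[i][i2], d_s[(i, j)][(i2, j2)],
                        sigma_x, sigma_s)
            inner += pi[i][j] * q
        total += math.log(max(inner, 1e-300))  # guard against log(0)
    return total
```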

45 The right-hand term requires πij to agree with the ratio of the responsibility of hij to the whole. [sent-120, score-0.45]

46 As for the sense probabilities, we set the uniform probability in accordance with the number of sense candidates; that is, πij ← |Si|−1, where |Si| denotes the size of Si. [sent-128, score-0.717]

47 As for bandwidths, we set the mean squared distance in each metric; thereby σx2 ← N−1 ∑i,i′ dx2(xi, xi′) for the context bandwidth, and σs2 ← (∑i |Si|)−1 ∑i,i′ ∑j,j′ ds2(sij, si′j′) for the sense bandwidth. [sent-129, score-0.183]
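Sentences 45-47 specify the initialization exactly, but give the π update only as "agreeing with the responsibility ratio". The sketch below implements the stated initialization and a standard EM-style normalization for the update; the latter is an assumption about the precise rule, which is not recoverable from this summary:

```python
import math
import numpy as np

def initialize(dx2, ds2_pairs, n_senses):
    """Initialization per sentences 46-47: uniform sense probabilities
    and mean-squared-distance bandwidths.

    dx2:       (N, N) NumPy array of squared context distances dx^2(xi, xi')
    ds2_pairs: iterable of squared sense distances ds^2(sij, si'j')
               over all pairs of sense candidates
    n_senses:  n_senses[i] = |S_i|, the number of candidates of instance i
    """
    n = len(n_senses)
    pi = [np.full(m, 1.0 / m) for m in n_senses]     # pi_ij <- |S_i|^-1
    # Normalizations follow the extracted formulas (N^-1 and (sum_i |S_i|)^-1);
    # a strict per-pair mean would divide by the number of pairs instead.
    sigma_x = math.sqrt(dx2.sum() / n)
    sigma_s = math.sqrt(sum(ds2_pairs) / sum(n_senses))
    return pi, sigma_x, sigma_s

def update_pi(responsibilities):
    """pi_ij <- r_ij / sum_j' r_ij': an EM-style normalization consistent
    with sentence 45; assumed, not quoted from the paper."""
    return [r / r.sum() for r in responsibilities]
```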

48 The sense hypotheses are depicted by twelve upward arrows. [sent-141, score-0.555]

49 Through the iterative parameter update, sense probabilities and kernel bandwidths were optimized to the dataset. [sent-144, score-0.562]

50 Figure 2(a) illustrates the initial status, where all the sense hypotheses are equally probable; thus they are in the most ambiguous status. [sent-145, score-0.375]

51 Initial bandwidths are set to the mean squared distance of all the hypothesis pairs. Figure 2: Pseudo 2D data simulation to visualize the dynamics of the proposed simultaneous WSD with five ambiguous words and twelve sense hypotheses. [sent-146, score-0.902]

52 This is because our method aims at modeling not the disambiguation of cluster memberships but the disambiguation of senses for each word. [sent-159, score-0.393]

53 The contexts of word instances are tied to the distributional context of the word type in a large corpus. [sent-164, score-0.277]

54 To calculate sense similarities, we used the WordNet similarity package by Pedersen et al. [sent-165, score-0.366]

55 1, to obtain grammatical relations for the distributional similarity, as well as to obtain lemmata and part-of-speech (POS) tags which are required to look up the sense inventory of WordNet. [sent-174, score-0.41]

56 Based on the distributional similarity, we simply used the k-nearest neighbor words as the context of each target word. [sent-175, score-0.272]

57 To treat the above similarity functions of context and of sense as distance functions, we use the conversion d(·, ·) ≡ −α ln(f(·, ·)/fmax), where d denotes the objective distance function, i.e. [sent-180, score-0.501]

58 , dx for context and ds for sense, while f and fmax denote the original similarity function and its maximum, respectively. [sent-182, score-0.215]
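The conversion in sentences 57-58 is fully specified up to the scale α; a one-function sketch (α left as a free parameter):

```python
import math

def sim_to_dist(f, f_max, alpha=1.0):
    """d(.,.) = -alpha * ln(f(.,.) / f_max), per sentences 57-58.

    Maps a similarity f in (0, f_max] to a distance in [0, inf):
    f == f_max gives d == 0; smaller similarities give larger distances.
    """
    return -alpha * math.log(f / f_max)
```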

59 5 Evaluation To confirm the effect of the proposed smoothing model and its combinatorial optimization scheme, we conducted WSD evaluations. [sent-185, score-0.326]

60 For this reason, we evaluated all the sense probabilities as they were. [sent-200, score-0.347]

61 The context metric space was composed of the k-nearest neighbor words by distributional similarity (Lin, 1998), as described in Section 4. [sent-203, score-0.403]

62 As for k, we evaluated the values 3, 5, 10, 30, 100, 200, 300; as for the sense metric space, we evaluated two measures, i.e. [sent-205, score-0.309]

63 In every condition, the stopping criterion is a fixed number of iterations (500), irrespective of the convergence in likelihood. [sent-208, score-0.198]

64 (2004), which determines the word sense based on sense similarity and distributional similarity to the k-nearest distributional neighbors of a target word. [sent-212, score-1.017]

65 Our major advantage is the combinatorial optimization framework, while the conventional one employs a word-by-word scheme. [sent-213, score-0.214]

66 (2007), which determines the word sense by maximizing the sum of sense similarity to the k immediate neighbor words of a target word. [sent-215, score-0.821]

67 Our major advantages are the combinatorial optimization scheme and the smoothing model to integrate distributional similarity. [sent-219, score-0.425]

68 Thus we can conclude that, though significance depends on the metrics, our smoothing model and the optimization scheme are effective in improving accuracy. [sent-293, score-0.248]

69 Let us start by looking at the upper half of Figure 5, which shows the change of sense probabilities through iteration. [sent-314, score-0.387]

70 As the iteration proceeded, the probabilities gradually spread out toward either 1 or 0, and finally, at iteration 500, we can observe that almost all the words were clearly disambiguated. [sent-316, score-0.236]

71 The vertical axis on the left is for the sense bandwidth, and the one on the right is for the context bandwidth. [sent-318, score-0.397]

72 We can observe that those bandwidths became narrower as the iteration proceeded. [sent-319, score-0.314]

73 Intensity of smoothing was dynamically adjusted by the whole disambiguation status. [sent-320, score-0.28]

74 6 Discussion This section discusses the validity of the proposed method as to i) sense-interdependent disambiguation and ii) reliability of data smoothing. [sent-322, score-0.197]

75 This means that the ranks of sense candidates for each word were frequently altered through iteration, which further means that some new information not obtained earlier was delivered, one piece after another, to the sense disambiguation of each word. [sent-331, score-0.747]

76 From these results, we could confirm the expected sense-interdependency effect, namely that the sense disambiguation of a certain word affected other words. [sent-332, score-0.483]

77 2 Reliability of Smoothing as Supervision Let us now discuss the reliability of our smoothing model. [sent-334, score-0.219]

78 In our method, sense disambiguation of a word is guided by its nearby words’ extrapolation (smoothing). [sent-335, score-0.634]

79 If we take a sufficient number of random words as nearby words, the sense distribution comes close to the true distribution, and then we expect the statistically true sense distribution to find out the true sense of the target word, according to the distributional hypothesis (Harris, 1954). [sent-338, score-1.328]

80 On the contrary, if we take nearby words that are biased to particular words, the sense distribution also becomes biased, and the extrapolation becomes less reliable. [sent-339, score-0.505]

81 We can compute the randomness of the words that affect sense disambiguation by word perplexity. [sent-340, score-0.309]

82 The conditional probability p(w′|w) denotes the probability with which a certain word w′ ∈ V \ {w} determines the sense of w, which can be defined as the density ratio: p(w′|w) ∝ ∑i:wi=w ∑i′:wi′=w′ ∑j,j′ Qi′j′(hij). [sent-344, score-0.488]
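Sentence 82 defines p(w′|w) as a density ratio over the extrapolated kernels; word perplexity is then the usual exponentiated entropy of that conditional. A sketch under that reading — the exp-entropy form of perplexity is an assumption, since the summary does not spell it out:

```python
import math
from collections import defaultdict

def word_perplexity(q, word_of):
    """Perplexity of p(w'|w), following the density ratio of sentence 82.

    q:       dict ((i, j), (i2, j2)) -> Q_{i'j'}(h_ij), the extrapolated
             density at h_ij contributed by hypothesis h_{i'j'}
    word_of: word_of[i] = surface word of instance i
    Returns a dict {w: perplexity of the words that drive w's senses}.
    """
    mass = defaultdict(lambda: defaultdict(float))
    for ((i, j), (i2, j2)), val in q.items():
        w, w2 = word_of[i], word_of[i2]
        if w2 != w:                       # w' ranges over V \ {w}
            mass[w][w2] += val
    ppl = {}
    for w, row in mass.items():
        z = sum(row.values())
        probs = [v / z for v in row.values() if v > 0]
        h = -sum(p * math.log(p) for p in probs)
        ppl[w] = math.exp(h)              # exp-entropy; assumed definition
    return ppl
```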

83 The relation between word perplexity and probability change for ground-truth senses of nouns (JCN/k = 30) is shown in Figure 7. [sent-345, score-0.274]

84 The upper histogram shows the change in iteration 1-100, and the lower shows that of iteration 101-500. [sent-346, score-0.238]

85 We divide the analysis at iteration 100 because, roughly until the 100th iteration, the change in bandwidths converged and the number of interacting words settled, as can be seen in Figure 5. [sent-347, score-0.354]

86 In contrast to the large share (79%) at high perplexity, at the lower left of the figure, where perplexity is small (< 30) and the bandwidths had been narrowed at iterations 101-500, correct change occupied only 32% of the whole. [sent-353, score-0.408]

87 Therefore, we can conclude that if sufficiently random samples of nearby words are provided, our smoothing model is reliable, though it is trained in an unsupervised fashion. [sent-354, score-0.353]

88 knowledge-based approaches typically regard senses as vertices (see Section 1), and corpus-based approaches such as (Véronis, 2004) regard words as vertices or (Niu et al. [sent-358, score-0.275]

89 Mihalcea (2005) proposed graph-based methods whose vertices are sense label hypotheses on a word sequence. [sent-361, score-0.515]

90 They disambiguated each target word using its distributionally similar words instead of its immediate context words. [sent-374, score-0.195]

91 Second, we extend the definition of density from Euclidean distance to a general metric, which makes the proposed method applicable to a wide variety of corpus-based context similarities and dictionary-based sense similarities. [sent-379, score-0.577]

92 8 Conclusions We proposed a novel smoothing model with a combinatorial optimization scheme for all-words WSD from untagged corpora. [sent-380, score-0.384]

93 Moreover, our smoothing model, though unsupervised, provides reliable supervision when sufficiently random samples of words are available as nearby words. [sent-383, score-0.317]

94 Thus it was confirmed that this method is valid for finding the optimal combination of word senses with large untagged corpora. [sent-384, score-0.244]

95 Knowledge-based WSD on specific domains: performing better than generic supervised WSD. [sent-396, score-0.188]

96 Semeval-2010 task 17: All-words word sense disambiguation on a specific domain. [sent-400, score-0.438]

97 Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. [sent-428, score-0.438]

98 Unsupervised large-vocabulary word sense disambiguation with graph-based algorithms for sequence data labeling. [sent-444, score-0.438]

99 Word sense disambiguation using label propagation based semi-supervised learning. [sent-456, score-0.438]

100 UMND1 : Unsupervised word sense disambiguation using contextual semantic relatedness. [sent-464, score-0.438]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('hij', 0.387), ('sense', 0.309), ('bandwidths', 0.215), ('wsd', 0.188), ('ij', 0.188), ('smoothing', 0.151), ('jcn', 0.151), ('rii', 0.151), ('agirre', 0.144), ('hypotheses', 0.136), ('senses', 0.135), ('density', 0.133), ('disambiguation', 0.129), ('nearby', 0.12), ('recalls', 0.116), ('extrapolated', 0.108), ('lesk', 0.101), ('distributional', 0.101), ('iteration', 0.099), ('context', 0.088), ('hi', 0.087), ('dynamics', 0.084), ('conventional', 0.084), ('metric', 0.08), ('combinatorial', 0.076), ('xi', 0.076), ('extrapolation', 0.076), ('kernels', 0.076), ('soroa', 0.075), ('dx', 0.07), ('tran', 0.07), ('vertices', 0.07), ('reliability', 0.068), ('mccarthy', 0.068), ('hypothesis', 0.066), ('bandwidth', 0.065), ('exotic', 0.065), ('mfs', 0.065), ('pknn', 0.065), ('immediate', 0.063), ('twelve', 0.063), ('responsibility', 0.063), ('wi', 0.062), ('untagged', 0.06), ('eneko', 0.058), ('similarity', 0.057), ('lacalle', 0.057), ('oier', 0.057), ('continuity', 0.057), ('extrapolate', 0.057), ('knowledgebased', 0.057), ('arrow', 0.056), ('denotes', 0.054), ('optimization', 0.054), ('pseudo', 0.054), ('si', 0.054), ('perplexity', 0.054), ('status', 0.053), ('gaussian', 0.05), ('optimal', 0.049), ('squared', 0.048), ('equation', 0.048), ('likelihood', 0.047), ('summation', 0.047), ('upward', 0.047), ('distance', 0.047), ('sufficiently', 0.046), ('conflicting', 0.045), ('probability', 0.045), ('confirm', 0.045), ('target', 0.044), ('reinforce', 0.043), ('intensity', 0.043), ('scheme', 0.043), ('invasive', 0.043), ('loocv', 0.043), ('parzen', 0.043), ('pknnrii', 0.043), ('thickness', 0.043), ('sij', 0.042), ('kulkarni', 0.042), ('intended', 0.042), ('change', 0.04), ('dataset', 0.039), ('neighbor', 0.039), ('space', 0.038), ('shadow', 0.038), ('responsibilities', 0.038), ('cwor', 0.038), ('lkb', 0.038), ('probabilities', 0.038), ('predominant', 0.037), ('unsupervised', 0.036), ('aitor', 0.036), ('lopez', 0.036), ('patwardhan', 0.035), ('poss', 0.035), ('sthe', 0.035), ('official', 0.035), ('navigli', 0.035)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999923 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

Author: Koichi Tanigaki ; Mitsuteru Shiba ; Tatsuji Munaka ; Yoshinori Sagisaka

Abstract: This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with the data sparsity resulting from a large variety of linguistic contexts and senses, as well as to exploit sense interdependency among the words in the same text string. Through the smoothing, all the optimal senses are obtained at one time under a maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words and to suppress conflicting sense hypotheses within the same word. Experimental results confirmed the superiority of the proposed method over conventional ones, showing performance beyond the most-frequent-sense baseline, which none of the SemEval-2 unsupervised systems reached.

2 0.27855733 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

Author: Mohammad Taher Pilehvar ; David Jurgens ; Roberto Navigli

Abstract: Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening.

3 0.26304674 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation

Author: Tristan Miller ; Nicolai Erbs ; Hans-Peter Zorn ; Torsten Zesch ; Iryna Gurevych

Abstract: Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and extensible. DKPro WSD abstracts the WSD process in such a way that test corpora, sense inventories, and algorithms can be freely swapped. Its UIMA-based architecture makes it easy to add support for new resources and algorithms. Related tasks such as word sense induction and entity linking are also supported.

4 0.21892568 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context

Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya

Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult, because of high degree of polysemy, too fine grained senses, absence of deep verb hierarchy and low inter annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, specially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs with accuracy level at 25-38%. We suggest a modification to this mentioned formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context aware expectation maximization.

5 0.16620719 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain

Author: Marine Carpuat ; Hal Daume III ; Katharine Henry ; Ann Irvine ; Jagadeesh Jagarlamudi ; Rachel Rudinger

Abstract: Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SENSESPOTTING, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a gold standard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80%, when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains.

6 0.12301037 116 acl-2013-Detecting Metaphor by Contextual Analogy

7 0.11508482 113 acl-2013-Derivational Smoothing for Syntactic Distributional Semantics

8 0.11012921 62 acl-2013-Automatic Term Ambiguity Detection

9 0.11009402 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection

10 0.10458881 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

11 0.10385488 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

12 0.099779323 53 acl-2013-Annotation of regular polysemy and underspecification

13 0.094884656 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses

14 0.08849749 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

15 0.084745616 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

16 0.083998412 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

17 0.083977029 238 acl-2013-Measuring semantic content in distributional vectors

18 0.082207121 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction

19 0.082084298 325 acl-2013-Smoothed marginal distribution constraints for language modeling

20 0.078188457 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.213), (1, 0.051), (2, 0.053), (3, -0.156), (4, -0.04), (5, -0.19), (6, -0.136), (7, 0.096), (8, -0.016), (9, -0.05), (10, 0.016), (11, 0.088), (12, -0.111), (13, -0.133), (14, 0.137), (15, 0.045), (16, 0.003), (17, 0.09), (18, -0.042), (19, -0.008), (20, 0.074), (21, -0.065), (22, 0.071), (23, -0.023), (24, -0.044), (25, -0.138), (26, 0.074), (27, -0.009), (28, -0.019), (29, -0.069), (30, 0.103), (31, 0.051), (32, -0.023), (33, 0.092), (34, -0.044), (35, -0.055), (36, 0.046), (37, -0.0), (38, -0.069), (39, -0.015), (40, -0.026), (41, -0.044), (42, -0.027), (43, -0.088), (44, 0.067), (45, -0.01), (46, 0.023), (47, 0.08), (48, -0.075), (49, -0.044)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96453673 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

Author: Koichi Tanigaki ; Mitsuteru Shiba ; Tatsuji Munaka ; Yoshinori Sagisaka

Abstract: This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with the data sparsity resulting from a large variety of linguistic contexts and senses, as well as to exploit sense interdependency among the words in the same text string. Through the smoothing, all the optimal senses are obtained at one time under a maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words and to suppress conflicting sense hypotheses within the same word. Experimental results confirmed the superiority of the proposed method over conventional ones, showing performance beyond the most-frequent-sense baseline, which none of the SemEval-2 unsupervised systems reached.

2 0.89593685 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context

Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya

Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult, because of high degree of polysemy, too fine grained senses, absence of deep verb hierarchy and low inter annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, specially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs with accuracy level at 25-38%. We suggest a modification to this mentioned formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context aware expectation maximization.

3 0.86256671 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation

Author: Tristan Miller ; Nicolai Erbs ; Hans-Peter Zorn ; Torsten Zesch ; Iryna Gurevych

Abstract: Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and extensible. DKPro WSD abstracts the WSD process in such a way that test corpora, sense inventories, and algorithms can be freely swapped. Its UIMA-based architecture makes it easy to add support for new resources and algorithms. Related tasks such as word sense induction and entity linking are also supported.

4 0.81436294 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

Author: Mohammad Taher Pilehvar ; David Jurgens ; Roberto Navigli

Abstract: Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-of-the-art performance on three tasks: semantic textual similarity, word similarity, and word sense coarsening.

5 0.76988095 53 acl-2013-Annotation of regular polysemy and underspecification

Author: Hector Martinez Alonso ; Bolette Sandford Pedersen ; Nuria Bel

Abstract: We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations.

6 0.75028539 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain

7 0.65866947 62 acl-2013-Automatic Term Ambiguity Detection

8 0.62066817 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection

9 0.61226761 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses

10 0.58607203 234 acl-2013-Linking and Extending an Open Multilingual Wordnet

11 0.55319262 116 acl-2013-Detecting Metaphor by Contextual Analogy

12 0.52975786 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

13 0.52921349 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri

14 0.51802599 12 acl-2013-A New Set of Norms for Semantic Relatedness Measures

15 0.50948292 262 acl-2013-Offspring from Reproduction Problems: What Replication Failure Teaches Us

16 0.50350338 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora

17 0.49469891 371 acl-2013-Unsupervised joke generation from big data

18 0.47212148 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords

19 0.46803024 281 acl-2013-Post-Retrieval Clustering Using Third-Order Similarity Measures

20 0.46493846 104 acl-2013-DKPro Similarity: An Open Source Framework for Text Similarity


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(0, 0.085), (6, 0.031), (11, 0.07), (17, 0.06), (24, 0.029), (26, 0.027), (28, 0.014), (35, 0.074), (42, 0.03), (48, 0.058), (56, 0.013), (64, 0.018), (70, 0.046), (88, 0.218), (90, 0.038), (95, 0.089)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.94686753 327 acl-2013-Sorani Kurdish versus Kurmanji Kurdish: An Empirical Comparison

Author: Kyumars Sheykh Esmaili ; Shahin Salavati

Abstract: Resource scarcity along with diversity– both in dialect and script–are the two primary challenges in Kurdish language processing. In this paper we aim at addressing these two problems by (i) building a text corpus for Sorani and Kurmanji, the two main dialects of Kurdish, and (ii) highlighting some of the orthographic, phonological, and morphological differences between these two dialects from statistical and rule-based perspectives.

2 0.93780816 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation

Author: Srinivasan Janarthanam ; Oliver Lemon ; Phil Bartie ; Tiphaine Dalmas ; Anna Dickinson ; Xingkun Liu ; William Mackaness ; Bonnie Webber

Abstract: We present a city navigation and tourist information mobile dialogue app with integrated question-answering (QA) and geographic information system (GIS) modules that helps pedestrian users to navigate in and learn about urban environments. In contrast to existing mobile apps which treat these problems independently, our Android app addresses the problem of navigation and touristic questionanswering in an integrated fashion using a shared dialogue context. We evaluated our system in comparison with Samsung S-Voice (which interfaces to Google navigation and Google search) with 17 users and found that users judged our system to be significantly more interesting to interact with and learn from. They also rated our system above Google search (with the Samsung S-Voice interface) for tourist information tasks.

3 0.93371451 106 acl-2013-Decentralized Entity-Level Modeling for Coreference Resolution

Author: Greg Durrett ; David Hall ; Dan Klein

Abstract: Efficiently incorporating entity-level information is a challenge for coreference resolution systems due to the difficulty of exact inference over partitions. We describe an end-to-end discriminative probabilistic model for coreference that, along with standard pairwise features, enforces structural agreement constraints between specified properties of coreferent mentions. This model can be represented as a factor graph for each document that admits efficient inference via belief propagation. We show that our method can use entity-level information to outperform a basic pairwise system.

4 0.91820848 41 acl-2013-Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation

Author: Or Biran ; Kathleen McKeown

Abstract: We present a reformulation of the word pair features typically used for the task of disambiguating implicit relations in the Penn Discourse Treebank. Our word pair features achieve significantly higher performance than the previous formulation when evaluated without additional features. In addition, we present results for a full system using additional features which achieves close to state of the art performance without resorting to gold syntactic parses or to context outside the relation.

5 0.9171865 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts

Author: Ryo Nagata ; Edward Whittaker

Abstract: Mother tongue interference is the phenomenon where linguistic systems of a mother tongue are transferred to another language. Although there has been plenty of work on mother tongue interference, very little is known about how strongly it is transferred to another language and about what relation there is across mother tongues. To address these questions, this paper explores and visualizes mother tongue interference preserved in English texts written by Indo-European language speakers. This paper further explores linguistic features that explain why certain relations are preserved in English writing, and which contribute to related tasks such as native language identification.

6 0.90831047 136 acl-2013-Enhanced and Portable Dependency Projection Algorithms Using Interlinear Glossed Text

same-paper 7 0.89320922 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD

8 0.88563341 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis

9 0.8174237 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution

10 0.79722261 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context

11 0.76658732 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation

12 0.74517196 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification

13 0.74248713 130 acl-2013-Domain-Specific Coreference Resolution with Lexicalized Features

14 0.7407918 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning

15 0.73621154 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction

16 0.73063326 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity

17 0.7251209 292 acl-2013-Question Classification Transfer

18 0.71799272 97 acl-2013-Cross-lingual Projections between Languages from Different Families

19 0.71727318 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering

20 0.71503526 382 acl-2013-Variational Inference for Structured NLP Models