acl acl2013 acl2013-258 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult because of the high degree of polysemy, too fine-grained senses, the absence of a deep verb hierarchy and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, especially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs, with accuracy levels at 25-38%. We suggest a modification to this formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context-aware expectation maximization.
Reference: text
sentIndex sentText sentNum sentScore
1 Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). [sent-7, score-0.405]
2 But the performance of this approach is poor on verbs, with accuracy levels at 25-38%. [sent-8, score-0.143]
3 We suggest a modification to this formulation, using context and semantic relatedness of neighboring words. [sent-9, score-0.292]
4 An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. [sent-10, score-0.119]
5 On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context aware expectation maximization. [sent-11, score-0.24]
6 1 Introduction The importance of unsupervised approaches in WSD is well known, because they do not need a sense-tagged corpus. [sent-12, score-0.451]
7 In the multilingual unsupervised scenario, either comparable or parallel corpora have been used by past researchers for disambiguation (Dagan et al. [sent-13, score-0.251]
Khapra et al. (2011) has shown that, in comparable corpora, the sense distribution of a word in one language can be estimated using the raw counts of translations of the target words in the other language; such sense distributions contribute to the ranking of senses. [sent-18, score-0.92]
9 Since translations can themselves be ambiguous, an Expectation Maximization based formulation is used to determine the sense frequencies. [sent-19, score-0.621]
10 Using this approach, every instance of a word is tagged with the most probable sense according to the algorithm. [sent-20, score-0.433]
11 That would do, had the accuracy of disambiguation on verbs not been poor (25-35%). [sent-22, score-0.242]
12 This motivated us to propose and investigate the use of context in the formulation by Khapra et al. [sent-23, score-0.307]
13 For example, consider this sentence from the chemistry domain: “Keep the beaker on the flat table.” [sent-25, score-0.056]
14 In this sentence, the target word ‘table’ will be tagged with the ‘tabular array’ sense by their algorithm, since that sense is dominant in the chemistry domain. [sent-26, score-0.537]
15 But its actual sense is ‘a piece of furniture’, which can be captured only if context is taken into consideration. [sent-27, score-0.455]
16 In our approach we tackle this problem by taking into account the words from the context of the target word. [sent-28, score-0.122]
17 We use semantic relatedness between translations of the target word and those of its context words to determine its sense. [sent-29, score-0.477]
18 Verb disambiguation has proved to be extremely difficult (Jean, 2004), because of the high degree of polysemy (Khapra et al. [sent-30, score-0.137]
19 , 2010), too fine-grained senses, the absence of a deep verb hierarchy and low inter-annotator agreement in verb sense annotation. [sent-31, score-0.584]
20 On the other hand, verb disambiguation is very important for NLP applications like MT and IR. [sent-32, score-0.184]
21 Our approach has shown significant improvement in verb accuracy as compared to Khapra’s (2011) approach. [sent-33, score-0.119]
22 Section 4 explains the modified EM formulation using context and semantic relatedness. [sent-37, score-0.361]
23 They are directly dependent on the availability of a good amount of sense-tagged data. [sent-47, score-0.389]
24 Hence, unsupervised WSD approaches (Diab and Resnik, 2002; Kaji and Morimoto, 2002; Mihalcea et al. [sent-49, score-0.062]
25 It uses the EM algorithm for estimating sense distributions in comparable corpora. [sent-54, score-0.352]
26 Every polysemous word is disambiguated using the raw counts of its translations in different senses. [sent-55, score-0.273]
27 In this dictionary, synsets are linked, and after that the words inside the synsets are also linked. [sent-58, score-0.106]
28 For example, for the concept of ‘boy’, the Hindi synset {ladakaa, balak, bachhaa} is linked with the Marathi synset {mulagaa, poragaa, por}. [sent-59, score-0.235]
29 The Hindi word ‘ladakaa’ is linked with the Marathi word ‘mulagaa’, which is its exact lexical substitution. [sent-61, score-0.044]
30 Suppose words u in language L1 and v in language L2 are translations of each other and their senses are required. [sent-62, score-0.303]
31 crosslinksL1 (a, SL2) is the set of possible translations of the word ‘a’ from language L1 to L2 in the sense SL2. [sent-64, score-0.339]
32 (SL1) means the linked synset of the sense SL1 in L2. [sent-65, score-0.43]
33 In both the steps, we estimate the sense distribution in one language using the raw counts of translations in the other language. [sent-67, score-0.524]
34 But this approach has the following limitations: Poor performance on verbs: This approach gives poor performance on verbs (25%-38%). [sent-68, score-0.109]
35 Same sense throughout the corpus: Every occurrence of a word is tagged with the single sense found by the algorithm, throughout the corpus. [sent-70, score-0.796]
36 Closed loop of translations: This formulation does not work for some common words which have the same translations in all senses. [sent-71, score-0.326]
37 For example, the verb ‘karna’ in Hindi has two different senses in the corpus, viz. [sent-72, score-0.247]
38 The word ‘karne’ also back translates to ‘karna’ in Hindi through both its senses. [sent-75, score-0.077]
39 In this case, the formulation works out as follows: The probabilities are initialized uniformly. [sent-76, score-0.185]
40 Now, in the first iteration, the sense probabilities work out to p(S1|karna) = p(S2|karna) = 0.5 again, since both senses of ‘karna’ gather identical counts through ‘karne’. [sent-79, score-0.295]
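To see the closed loop numerically, the toy snippet below (hypothetical counts, not the paper's data) runs a raw-count style update: because both senses of ‘karna’ share the single translation ‘karne’, each iteration gathers identical evidence and the distribution never moves off 0.5.

```python
# Toy illustration with hypothetical counts: when both senses of Hindi
# 'karna' share the single Marathi translation 'karne', raw-count updates
# give each sense identical evidence, so EM never breaks the symmetry.
def normalize(scores):
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

p = {"S1": 0.5, "S2": 0.5}  # uniform initialization
for iteration in range(3):
    evidence = {"S1": 7.0, "S2": 7.0}  # same translation counts for both senses
    p = normalize(evidence)
    print(iteration, p)  # prints {'S1': 0.5, 'S2': 0.5} every time
```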
41 To address these problems we have introduced contextual clues in their formulation by using semantic relatedness. [sent-86, score-0.272]
42 4 Modified Bilingual EM approach We introduce context in the EM formulation stated above and treat the context as a bag of words. [sent-87, score-0.464]
43 We assume that each word in the context influences the sense of the target word independently. [sent-88, score-0.505]
44 Hence, p(S|w,C) = ∏_{ci ∈ C} p(S|w,ci), where w is the target word, S is one of the candidate synsets of w, C is the set of words in the context (the sentence in our case) and ci is one of the context words. [sent-89, score-0.297]
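As a minimal sketch of this independence assumption (the probability tables below are hypothetical, not the paper's trained model), the snippet combines the per-context-word distributions p(S|w,ci) by multiplication, in log space to avoid underflow, and returns the winning sense:

```python
import math

def disambiguate(word, context, p_sense):
    """Combine per-context-word sense distributions under the independence
    assumption p(S|w,C) ∝ prod over ci of p(S|w,ci); return the argmax sense."""
    senses = set()
    for (w, c), dist in p_sense.items():
        if w == word:
            senses.update(dist)
    scores = {
        s: sum(math.log(p_sense[(word, c)].get(s, 1e-12))
               for c in context if (word, c) in p_sense)
        for s in senses
    }
    return max(scores, key=scores.get)

# Hypothetical tables for the 'table' example from the introduction:
p = {
    ("table", "beaker"): {"furniture": 0.7, "tabular_array": 0.3},
    ("table", "flat"):   {"furniture": 0.8, "tabular_array": 0.2},
}
print(disambiguate("table", ["beaker", "flat"], p))  # -> furniture
```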
45 Had sense-tagged data been available, p(S|w,c) could have been computed as p(S|w,c) = #(S,w,c) / #(w,c). But since a sense-tagged corpus is not available, we cannot find #(S,w,c) from the corpus directly. [sent-90, score-0.778]
46 However, we can estimate it using the comparable corpus in the other language. [sent-91, score-0.057]
47 Here, we assume that, given a word and its context word in language L1, the sense distribution in L1 will be the same as that in L2 given the translation of the word and the translation of its context word in L2. [sent-92, score-0.715]
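Written out (in hypothetical notation, since the extracts state the assumption only in prose), the cross-lingual identity being assumed is:

```latex
% u, a = target and context word in L1; v, b = their translations in L2;
% S = a sense of u, S' = its cross-linked sense on the L2 side.
p_{L1}(S \mid u, a) \;\approx\; p_{L2}(S' \mid v, b)
```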
48 But these translations can be ambiguous; hence we can use an Expectation Maximization approach similar to (Khapra et al. [sent-93, score-0.187]
49 σ(v, b) is the semantic relatedness between the senses of v and the senses of b. [sent-95, score-0.494]
50 Since v and b go over all possible translations of u and a respectively, [sent-96, score-0.141]
51 σ(v, b) has the effect of indirectly capturing the semantic similarity between the senses of u and a. [sent-97, score-0.216]
52 A symmetric formulation in the M-step below takes the computation back from language L2 to language L1. [sent-98, score-0.218]
53 The semantic relatedness comes in as an additional weighting factor, capturing context, in the probabilistic score. [sent-99, score-0.202]
54 Note how the computation moves back and forth between L1 and L2 considering translations of both target words and their context words. [sent-101, score-0.296]
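The E-step itself is only described qualitatively in these extracts, so the sketch below is a paraphrase rather than the paper's exact equation; sense_translations, word_translations, p_l2 and sigma are all hypothetical inputs. It scores each candidate sense S of the target word u by summing over u's translations v and the context word a's translations b, weighting the L2 probability mass by σ(v, b):

```python
def e_step(u, a, sense_translations, word_translations, p_l2, sigma):
    """Paraphrased sketch of the context-aware E-step; the exact
    normalization in the paper may differ. sense_translations[u] maps
    each candidate sense S of u to its translations in L2;
    word_translations[a] lists all translations of the context word a;
    p_l2[(v, b)] is the current L2-side probability mass for the pair
    (in the paper this is itself a sense distribution); sigma(v, b) is
    the semantic relatedness weight."""
    scores = {}
    for S, vs in sense_translations[u].items():
        scores[S] = sum(p_l2.get((v, b), 0.0) * sigma(v, b)
                        for v in vs
                        for b in word_translations[a])
    z = sum(scores.values()) or 1.0
    return {S: val / z for S, val in scores.items()}
```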
55 In the above formulation, we could have considered the term #(word, context word) (i.e., [sent-102, score-0.122]
56 the co-occurrence count of the translations of the word and the context word) instead of σ(word, context word). [sent-104, score-0.429]
57 But it is very unlikely that every translation of a word will co-occur with every translation of its context word a considerable number of times. [sent-105, score-0.044]
58 Table 1: Comparison (F-Score) of EM-C and EM for the Health domain. [sent-115, score-0.048]
59 Table 2: Comparison (F-Score) of EM-C and EM for the Tourism domain. [sent-125, score-0.214]
60 This term may make sense only if we have an arbitrarily large comparable corpus in the other language. [sent-126, score-0.352]
61 4.1 Computation of semantic relatedness The semantic relatedness is computed by taking the inverse of the length of the shortest path between two senses in the wordnet graph (Pedersen et al. [sent-128, score-0.502]
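For English WordNet, NLTK's path_similarity implements essentially this measure (the reciprocal of one plus the shortest path length between two senses in the hypernym graph); the paper works on the Hindi/Marathi wordnets, so the snippet below is only an illustrative stand-in:

```python
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

# Illustrative stand-in using English WordNet: path_similarity returns
# 1 / (1 + length of the shortest path between the two senses).
furniture_table = wn.synset('table.n.02')  # 'a piece of furniture'
data_table = wn.synset('table.n.01')       # 'a set of data in rows and columns'
chair = wn.synset('chair.n.01')

print(furniture_table.path_similarity(chair))  # closer in the graph, higher score
print(data_table.path_similarity(chair))       # farther away, lower score
```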
62 Sense scores thus obtained are used to disambiguate all words in the corpus. [sent-133, score-0.036]
63 We consider all the content words from the context for disambiguation of a word. [sent-134, score-0.221]
64 The winner sense is the one with the highest probability. [sent-135, score-0.295]
65 5 Experimental setup We have used freely available in-domain comparable corpora1 in Hindi and Marathi languages. [sent-136, score-0.057]
66 These corpora are available for health and tourism domains. [sent-137, score-0.204]
67 Results clearly show that EM-C outperforms EM, especially in the case of verbs, in all language-domain pairs. [sent-153, score-0.061]
68 In the health domain, verb accuracy is increased by 35% for Hindi and 17% for Marathi, while in the tourism domain, it is increased by 23% for Hindi and 17% for Marathi. [sent-154, score-0.391]
69 Since there are fewer verbs, the improved accuracy is not directly reflected in the overall performance. [sent-160, score-0.034]
70 7 Error analysis and phenomena study Our approach tags all the instances of a word depending on its context, as opposed to the basic EM approach. [sent-161, score-0.166]
71 For example, consider the following sentence from the tourism domain: [sent-162, score-0.215]
72 “vaha patte khel rahe the” (They were playing cards/leaves). Here, the word ‘patte’ is ambiguous between the ‘leaf’ and ‘playing card’ senses. [sent-166, score-0.535]
73 In the tourism domain, the ‘leaf’ sense is more dominant. [sent-169, score-0.433]
74 The true sense is captured only if context is considered. [sent-173, score-0.455]
75 The context word ‘khel’ (play) endorses the ‘playing card’ sense of the word ‘patte’. [sent-176, score-0.339]
76 This phenomenon is captured by our approach through semantic relatedness. [sent-177, score-0.092]
77 “vaha ped ke niche patte khel rahe the” (They were playing cards/leaves below the tree). Here, two strong context words [sent-186, score-0.568]
78 ‘ped’ (tree) and ‘khel’ (play) are influencing the sense of the word ‘patte’. [sent-188, score-0.416]
79 This problem occurred because we considered the context as a bag of words. [sent-194, score-0.157]
80 This problem can be solved by considering the semantic structure of the sentence. [sent-195, score-0.054]
81 In this example, the word ‘patte’ (leaf/playing card) is the subject of the verb ‘khel’ (play). [sent-196, score-0.129]
82 8 Conclusion and Future Work We have presented a context-aware EM formulation building on the framework of Khapra et al. (2011). [sent-201, score-0.307]
83 Our formulation solves the problems of “inhibited progress due to lack of translation diversity” and “uniform sense assignment, irrespective of context” that the previous EM based formulation of Khapra et al. [sent-202, score-0.665]
84 More importantly, our accuracy on verbs is much higher and above the state of the art, to the best of our knowledge. [sent-204, score-0.095]
85 Future directions also point to the usage of semantic role clues, investigation of familially apart pairs of languages and the effect of variation of measures of semantic relatedness. [sent-206, score-0.108]
86 An unsupervised method for word sense tagging using parallel corpora. [sent-215, score-0.401]
87 Unsupervised word sense disambiguation using bilingual comparable corpora. [sent-224, score-0.552]
88 All words domain adapted wsd: Finding a middle ground between supervision and unsupervision. [sent-230, score-0.048]
89 It takes two to tango: A bilingual unsupervised approach for estimating sense distributions using expectation maximization. [sent-235, score-0.47]
90 Supervised word sense disambiguation with support vector machines and multiple knowledge sources. [sent-243, score-0.438]
91 Pagerank on semantic networks, with application to word sense disambiguation. [sent-252, score-0.393]
92 Integrating multiple knowledge sources to disambiguate word sense: an exemplar-based approach. [sent-260, score-0.08]
93 Exploiting parallel texts to produce a multilingual sense tagged corpus for word sense disambiguation. [sent-272, score-0.761]
wordName wordTfidf (topN-words)
[('khapra', 0.368), ('sense', 0.295), ('karna', 0.292), ('wsd', 0.188), ('formulation', 0.185), ('em', 0.183), ('senses', 0.162), ('pta', 0.161), ('karne', 0.146), ('xxp', 0.146), ('playing', 0.142), ('translations', 0.141), ('tourism', 0.138), ('card', 0.133), ('hindi', 0.132), ('context', 0.122), ('relatedness', 0.116), ('lna', 0.109), ('synset', 0.1), ('disambiguation', 0.099), ('tagged', 0.094), ('marathi', 0.089), ('pushpak', 0.087), ('verb', 0.085), ('leaf', 0.085), ('px', 0.083), ('pt', 0.077), ('kaji', 0.077), ('xp', 0.074), ('mitesh', 0.074), ('erwebm', 0.073), ('kaa', 0.073), ('khel', 0.073), ('ladakaa', 0.073), ('mulagaa', 0.073), ('nounadvadjverboverallnounadvadjverboverall', 0.073), ('patte', 0.073), ('rahe', 0.073), ('vaha', 0.073), ('health', 0.066), ('lefever', 0.064), ('sudha', 0.064), ('mohanty', 0.064), ('morimoto', 0.064), ('play', 0.063), ('unsupervised', 0.062), ('verbs', 0.061), ('comparable', 0.057), ('bilingual', 0.057), ('chemistry', 0.056), ('expectation', 0.056), ('raw', 0.054), ('semantic', 0.054), ('rh', 0.053), ('vh', 0.053), ('synsets', 0.053), ('diab', 0.053), ('poor', 0.048), ('domain', 0.048), ('hence', 0.046), ('specia', 0.046), ('bhattacharyya', 0.045), ('word', 0.044), ('grained', 0.043), ('ka', 0.042), ('inter', 0.041), ('morristown', 0.04), ('polysemy', 0.038), ('captured', 0.038), ('pedersen', 0.038), ('jean', 0.037), ('disambiguate', 0.036), ('bag', 0.035), ('maximization', 0.035), ('linked', 0.035), ('fine', 0.035), ('counts', 0.034), ('dagan', 0.034), ('throughout', 0.034), ('accuracy', 0.034), ('increased', 0.034), ('back', 0.033), ('clues', 0.033), ('multilingual', 0.033), ('hiroyuki', 0.032), ('furniture', 0.032), ('hoste', 0.032), ('salil', 0.032), ('anup', 0.032), ('cartography', 0.032), ('eronis', 0.032), ('hyperlex', 0.032), ('saurabh', 0.032), ('sohoney', 0.032), ('volpe', 0.032), ('niche', 0.032), ('probablistic', 0.032), ('kalele', 0.032), ('pande', 0.032), ('roadmap', 0.032)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult because of the high degree of polysemy, too fine-grained senses, the absence of a deep verb hierarchy and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, especially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs, with accuracy levels at 25-38%. We suggest a modification to this formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context-aware expectation maximization.
2 0.23423588 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation
Author: Tristan Miller ; Nicolai Erbs ; Hans-Peter Zorn ; Torsten Zesch ; Iryna Gurevych
Abstract: Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and extensible. DKPro WSD abstracts the WSD process in such a way that test corpora, sense inventories, and algorithms can be freely swapped. Its UIMA-based architecture makes it easy to add support for new resources and algorithms. Related tasks such as word sense induction and entity linking are also supported.
3 0.22197095 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
Author: Mohammad Taher Pilehvar ; David Jurgens ; Roberto Navigli
Abstract: Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-ofthe-art performance on three tasks: seman- tic textual similarity, word similarity, and word sense coarsening.
4 0.21892568 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
Author: Koichi Tanigaki ; Mitsuteru Shiba ; Tatsuji Munaka ; Yoshinori Sagisaka
Abstract: This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with data-sparsity resulting from a large variety of linguistic context and sense, as well as to exploit senseinterdependency among the words in the same text string. Through the smoothing, all the optimal senses are obtained at one time under maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words, and to suppress conflicting sense hypotheses within the same word. Experimental results confirmed the superiority of the proposed method over conventional ones by showing the better performances beyond most-frequent-sense baseline performance where none of SemEval2 unsupervised systems reached.
5 0.16770864 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
Author: Kashyap Popat ; Balamurali A.R ; Pushpak Bhattacharyya ; Gholamreza Haffari
Abstract: Expensive feature engineering based on WordNet senses has been shown to be useful for document level sentiment classification. A plausible reason for such a performance improvement is the reduction in data sparsity. However, such a reduction could be achieved with a lesser effort through the means of syntagma based word clustering. In this paper, the problem of data sparsity in sentiment analysis, both monolingual and cross-lingual, is addressed through the means of clustering. Experiments show that cluster based data sparsity reduction leads to performance better than sense based classification for sentiment analysis at document level. Similar idea is applied to Cross Lingual Sentiment Analysis (CLSA), and it is shown that reduction in data sparsity (after translation or bilingual-mapping) produces accuracy higher than Machine Translation based CLSA and sense based CLSA.
6 0.15500522 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
7 0.14411363 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora
8 0.13164936 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
9 0.12174355 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
10 0.10737292 62 acl-2013-Automatic Term Ambiguity Detection
11 0.10506759 116 acl-2013-Detecting Metaphor by Contextual Analogy
12 0.10332312 53 acl-2013-Annotation of regular polysemy and underspecification
13 0.10097215 92 acl-2013-Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages
14 0.099153794 12 acl-2013-A New Set of Norms for Semantic Relatedness Measures
15 0.085885249 234 acl-2013-Linking and Extending an Open Multilingual Wordnet
16 0.076392256 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
17 0.072776757 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords
18 0.069894567 374 acl-2013-Using Context Vectors in Improving a Machine Translation System with Bridge Language
19 0.069120519 154 acl-2013-Extracting bilingual terminologies from comparable corpora
20 0.068917118 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
topicId topicWeight
[(0, 0.174), (1, 0.031), (2, 0.08), (3, -0.112), (4, -0.047), (5, -0.188), (6, -0.186), (7, 0.098), (8, 0.067), (9, -0.07), (10, 0.027), (11, 0.066), (12, -0.118), (13, -0.049), (14, 0.129), (15, -0.019), (16, 0.018), (17, 0.066), (18, -0.068), (19, -0.012), (20, 0.04), (21, -0.098), (22, 0.039), (23, -0.055), (24, -0.042), (25, -0.171), (26, 0.025), (27, -0.05), (28, 0.035), (29, -0.052), (30, 0.136), (31, 0.021), (32, 0.061), (33, 0.103), (34, -0.065), (35, -0.05), (36, 0.008), (37, 0.02), (38, -0.022), (39, -0.027), (40, -0.031), (41, -0.103), (42, 0.037), (43, -0.053), (44, 0.079), (45, 0.056), (46, 0.095), (47, 0.104), (48, -0.107), (49, 0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.96385479 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult because of the high degree of polysemy, too fine-grained senses, the absence of a deep verb hierarchy and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, especially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs, with accuracy levels at 25-38%. We suggest a modification to this formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context-aware expectation maximization.
2 0.85619903 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
Author: Koichi Tanigaki ; Mitsuteru Shiba ; Tatsuji Munaka ; Yoshinori Sagisaka
Abstract: This paper proposes a novel smoothing model with a combinatorial optimization scheme for all-words word sense disambiguation from untagged corpora. By generalizing discrete senses to a continuum, we introduce a smoothing in context-sense space to cope with data-sparsity resulting from a large variety of linguistic context and sense, as well as to exploit senseinterdependency among the words in the same text string. Through the smoothing, all the optimal senses are obtained at one time under maximum marginal likelihood criterion, by competitive probabilistic kernels made to reinforce one another among nearby words, and to suppress conflicting sense hypotheses within the same word. Experimental results confirmed the superiority of the proposed method over conventional ones by showing the better performances beyond most-frequent-sense baseline performance where none of SemEval2 unsupervised systems reached.
3 0.84846777 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation
Author: Tristan Miller ; Nicolai Erbs ; Hans-Peter Zorn ; Torsten Zesch ; Iryna Gurevych
Abstract: Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and extensible. DKPro WSD abstracts the WSD process in such a way that test corpora, sense inventories, and algorithms can be freely swapped. Its UIMA-based architecture makes it easy to add support for new resources and algorithms. Related tasks such as word sense induction and entity linking are also supported.
4 0.8342129 53 acl-2013-Annotation of regular polysemy and underspecification
Author: Hector Martinez Alonso ; Bolette Sandford Pedersen ; Nuria Bel
Abstract: We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations.
5 0.76025182 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
Author: Marine Carpuat ; Hal Daume III ; Katharine Henry ; Ann Irvine ; Jagadeesh Jagarlamudi ; Rachel Rudinger
Abstract: Words often gain new senses in new domains. Being able to automatically identify, from a corpus of monolingual text, which word tokens are being used in a previously unseen sense has applications to machine translation and other tasks sensitive to lexical semantics. We define a task, SENSESPOTTING, in which we build systems to spot tokens that have new senses in new domain text. Instead of difficult and expensive annotation, we build a goldstandard by leveraging cheaply available parallel corpora, targeting our approach to the problem of domain adaptation for machine translation. Our system is able to achieve F-measures of as much as 80%, when applied to word types it has never seen before. Our approach is based on a large set of novel features that capture varied aspects of how words change when used in new domains.
6 0.70743018 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
7 0.69491506 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
8 0.68901509 62 acl-2013-Automatic Term Ambiguity Detection
9 0.67491627 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
10 0.61954719 234 acl-2013-Linking and Extending an Open Multilingual Wordnet
11 0.52738911 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
12 0.51502669 116 acl-2013-Detecting Metaphor by Contextual Analogy
13 0.51499027 92 acl-2013-Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages
14 0.49114653 93 acl-2013-Context Vector Disambiguation for Bilingual Lexicon Extraction from Comparable Corpora
15 0.45655099 198 acl-2013-IndoNet: A Multilingual Lexical Knowledge Network for Indian Languages
16 0.45320043 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
17 0.4463701 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords
18 0.40778822 371 acl-2013-Unsupervised joke generation from big data
19 0.384646 293 acl-2013-Random Walk Factoid Annotation for Collective Discourse
20 0.36026564 262 acl-2013-Offspring from Reproduction Problems: What Replication Failure Teaches Us
topicId topicWeight
[(0, 0.064), (6, 0.022), (11, 0.057), (15, 0.028), (24, 0.037), (26, 0.059), (35, 0.079), (42, 0.043), (48, 0.055), (56, 0.269), (70, 0.023), (88, 0.108), (90, 0.025), (95, 0.066)]
simIndex simValue paperId paperTitle
1 0.87001723 108 acl-2013-Decipherment
Author: Kevin Knight
Abstract: The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. This tutorial explores connections between early decipherment research and today’s NLP work. We cover classic military and diplomatic ciphers, automatic decipherment algorithms, unsolved ciphers, language translation as decipherment, and analyzing ancient writing as decipherment. 1 Tutorial Overview The first natural language processing systems had a straightforward goal: decipher coded messages sent by the enemy. Sixty years later, we have many more applications, including web search, question answering, summarization, speech recognition, and language translation. This tutorial explores connections between early decipherment research and today’s NLP work. We find that many ideas from the earlier era have become core to the field, while others still remain to be picked up and developed. We first cover classic military and diplomatic cipher types, including complex substitution ciphers implemented in the first electro-mechanical encryption machines. We look at mathematical tools (language recognition, frequency counting, smoothing) developed to decrypt such ciphers on proto-computers. We show algorithms and extensive empirical results for solving different types of ciphers, and we show the role of algorithms in recent decipherments of historical documents. We then look at how foreign language can be viewed as a code for English, a concept developed by Alan Turing and Warren Weaver. We describe recently published work on building automatic translation systems from non-parallel data. We also demonstrate how some of the same algorithmic tools can be applied to natural language tasks like part-of-speech tagging and word alignment. Turning back to historical ciphers, we explore a number of unsolved ciphers, giving results of initial computer experiments on several of them. Finally, we look briefly at writing as a way to encipher phoneme sequences, covering ancient scripts and modern applications. 2 Outline 1. Classical military/diplomatic ciphers (15 minutes) • 60 cipher types (ACA) • Ciphers vs. codes • Enigma cipher: the mother of natural language processing – computer analysis of text – language recognition – Good-Turing smoothing 2. Foreign language as a code (10 minutes) • Alan Turing’s ”Thinking Machines” • Warren Weaver’s Memorandum 3. Automatic decipherment (55 minutes) • Cipher type detection • Substitution ciphers (simple, homophonic, polyalphabetic, etc) – plaintext language recognition ∗ how much plaintext knowledge is needed ∗ index of coincidence, unicity distance, and other measures – navigating a difficult search space ∗ frequencies of letters and words ∗ pattern words and cribs ∗ EM, ILP, Bayesian models, sampling – recent decipherments ∗ Jefferson cipher, Copiale cipher, civil war ciphers, naval Enigma • Application to part-of-speech tagging, word alignment • Application to machine translation without parallel text • Parallel development of cryptography and translation • Recently released NSA internal newsletter (1974-1997) 4. *** Break *** (30 minutes) 5.
Unsolved ciphers (40 minutes) • Zodiac 340 (1969), including computational work • Voynich Manuscript (early 1400s), including computational work • Beale (1885) • Dorabella (1897) • Taman Shud (1948) • Kryptos (1990), including computational work • McCormick (1999) • Shoeboxes in attics: DuPonceau journal, Finnerana, SYP, Mopse, diptych 6. Writing as a code (20 minutes) • Does writing encode ideas, or does it encode phonemes? • Ancient script decipherment – Egyptian hieroglyphs – Linear B – Mayan glyphs – Ugaritic, including computational work – Chinese Nüshu, including computational work • Automatic phonetic decipherment • Application to transliteration 7. Undeciphered writing systems (15 minutes) • Indus Valley Script (3300BC) • Linear A (1900BC) • Phaistos disc (1700BC?) • Rongorongo (1800s?) 8. Conclusion and further questions (15 minutes) 3 About the Presenter Kevin Knight is a Senior Research Scientist and Fellow at the Information Sciences Institute of the University of Southern California (USC), and a Research Professor in USC’s Computer Science Department. He received a PhD in computer science from Carnegie Mellon University and a bachelor’s degree from Harvard University. Professor Knight’s research interests include natural language processing, machine translation, automata theory, and decipherment. In 2001, he co-founded Language Weaver, Inc., and in 2011, he served as President of the Association for Computational Linguistics. Dr. Knight has taught computer science courses at USC for more than fifteen years and co-authored the widely adopted textbook Artificial Intelligence.
2 0.86134487 190 acl-2013-Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs
Author: Adam Vogel ; Christopher Potts ; Dan Jurafsky
Abstract: Conversational implicatures involve reasoning about multiply nested belief structures. This complexity poses significant challenges for computational models of conversation and cognition. We show that agents in the multi-agent DecentralizedPOMDP reach implicature-rich interpretations simply as a by-product of the way they reason about each other to maximize joint utility. Our simulations involve a reference game of the sort studied in psychology and linguistics as well as a dynamic, interactional scenario involving implemented artificial agents.
same-paper 3 0.79146516 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult because of the high degree of polysemy, too fine-grained senses, the absence of a deep verb hierarchy and low inter-annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, especially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs, with accuracy levels at 25-38%. We suggest a modification to this formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context-aware expectation maximization.
4 0.7805019 65 acl-2013-BRAINSUP: Brainstorming Support for Creative Sentence Generation
Author: Gozde Ozbal ; Daniele Pighin ; Carlo Strapparava
Abstract: We present BRAINSUP, an extensible framework for the generation of creative sentences in which users are able to force several words to appear in the sentences and to control the generation process across several semantic dimensions, namely emotions, colors, domain relatedness and phonetic properties. We evaluate its performance on a creative sentence generation task, showing its capability of generating well-formed, catchy and effective sentences that have all the good qualities of slogans produced by human copywriters.
5 0.73832643 178 acl-2013-HEADY: News headline abstraction through event pattern clustering
Author: Enrique Alfonseca ; Daniele Pighin ; Guillermo Garrido
Abstract: This paper presents HEADY: a novel, abstractive approach for headline generation from news collections. From a web-scale corpus of English news, we mine syntactic patterns that a Noisy-OR model generalizes into event descriptions. At inference time, we query the model with the patterns observed in an unseen news collection, identify the event that better captures the gist of the collection and retrieve the most appropriate pattern to generate a headline. HEADY improves over a state-of-theart open-domain title abstraction method, bridging half of the gap that separates it from extractive methods using humangenerated titles in manual evaluations, and performs comparably to human-generated headlines as evaluated with ROUGE.
6 0.58098584 299 acl-2013-Reconstructing an Indo-European Family Tree from Non-native English Texts
7 0.57752073 41 acl-2013-Aggregated Word Pair Features for Implicit Discourse Relation Disambiguation
8 0.57494152 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
9 0.57079673 136 acl-2013-Enhanced and Portable Dependency Projection Algorithms Using Interlinear Glossed Text
10 0.57064217 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
11 0.56361699 327 acl-2013-Sorani Kurdish versus Kurmanji Kurdish: An Empirical Comparison
12 0.56098318 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
13 0.54561132 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
14 0.54295701 131 acl-2013-Dual Training and Dual Prediction for Polarity Classification
15 0.54148877 196 acl-2013-Improving pairwise coreference models through feature space hierarchy learning
16 0.54107815 318 acl-2013-Sentiment Relevance
17 0.53795266 141 acl-2013-Evaluating a City Exploration Dialogue System with Integrated Question-Answering and Pedestrian Navigation
18 0.53751004 2 acl-2013-A Bayesian Model for Joint Unsupervised Induction of Sentiment, Aspect and Discourse Representations
19 0.53485179 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords
20 0.5337671 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation