acl acl2013 acl2013-366 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Kavitha Rajan
Abstract: Natural language can be easily understood by everyone irrespective of their differences in age or region or qualification. The existence of a conceptual base that underlies all natural languages is an accepted claim as pointed out by Schank in his Conceptual Dependency (CD) theory. Inspired by the CD theory and theories in Indian grammatical tradition, we propose a new set of meaning primitives in this paper. We claim that this new set of primitives captures the meaning inherent in verbs and help in forming an inter-lingual and computable ontological classification of verbs. We have identified seven primitive overlapping verb senses which substantiate our claim. The percentage of coverage of these primitives is 100% for all verbs in Sanskrit and Hindi and 3750 verbs in English. 1
Reference: text
sentIndex sentText sentNum sentScore
1 The existence of a conceptual base that underlies all natural languages is an accepted claim as pointed out by Schank in his Conceptual Dependency (CD) theory. [sent-9, score-0.385]
2 Inspired by the CD theory and theories in Indian grammatical tradition, we propose a new set of meaning primitives in this paper. [sent-10, score-0.523]
3 We claim that this new set of primitives captures the meaning inherent in verbs and help in forming an inter-lingual and computable ontological classification of verbs. [sent-11, score-0.903]
4 We have identified seven primitive overlapping verb senses which substantiate our claim. [sent-12, score-0.884]
5 The percentage of coverage of these primitives is 100% for all verbs in Sanskrit and Hindi and 3750 verbs in English. [sent-13, score-0.927]
6 Looking at the ease of learning and communicating in and across natural languages, the claim of the existence of an interlingual conceptual base (Schank, 1972) seems plausible. [sent-15, score-0.346]
7 Conceptual Dependency (CD) theory tried to represent a conceptual base using a small set of meaning primitives. [sent-16, score-0.423]
8 To achieve this goal, they put forward a proposal consisting of a small set of 12 primitive actions and a set of dependencies which connect the primitive actions with each other and with their actors, objects, instruments, etc. [sent-17, score-0.509]
9 Their claim was that this small set of representational elements could be used to produce a canonical form for sentences in English as well as other natural languages. [sent-18, score-0.088]
10 Representational theories like Scripts, Plans, Goals and Understanding(SPGU) representations (Schank and Abelson, 1977) were developed from the CD theory. [sent-19, score-0.069]
11 None of the descendant theories of CD focused on the notion of 'primitives', and the idea faded in subsequent works. [sent-20, score-0.069]
12 Through our work, we put forward a set of seven meaning primitives and claim that the permutation/combination of these seven meaning primitives along with ontological attributes is sufficient to develop a computational model for meaning representation across languages. [sent-22, score-1.448]
13 This paper looks at the Conceptual Dependency Theory created by Roger Schank (Schank, 1973; Schank, 1975) and compares it with theories in Indian grammatical tradition. [sent-23, score-0.069]
14 We conclude by introducing the small set of meaning primitives which we have found to cover all verbs in Indian languages like Sanskrit, Hindi and almost all verbs in English. [sent-26, score-1.06]
15 2 Conceptual Dependency According to Schank, the linguistic and situational contexts in which a sentence is uttered are important for understanding the meaning of that sentence. [sent-27, score-0.133]
16 The CD theory was developed to create a theory of human natural language understanding. [sent-28, score-0.124]
17 The initial premise of the theory ... [sent-29, score-0.097]
18 According to the theory, during communication, to-and-fro mapping happens between linguistic structures and the conceptual base through concepts. [sent-32, score-0.228]
19 It is due to the existence of this conceptual base and concept-based mapping that a multilingual person is able to switch between languages easily. [sent-33, score-0.336]
20 The conceptual base consists of concepts and the relations between concepts. [sent-34, score-0.268]
21 There are three types of concepts: a) nominal; b) action and c) modifier. [sent-36, score-0.064]
22 CD representations use 12 primitive ACTs out of which the meanings of verbs and of abstract and complex nouns are constructed. [sent-41, score-0.328]
23 Primitives are elements that can be used in many varied combinations to express the meaning of what underlies a given word. [sent-42, score-0.172]
24 In CD, primitives were arrived at by noticing structural similarities that existed when sentences were put into an actor-action-object framework. [sent-43, score-0.307]
25 Using these ACTs, a set of states and a set of conceptual roles, it is possible to express a large amount of the meanings expressible in a natural language. [sent-44, score-0.3]
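To make the role of the primitive ACTs concrete, here is a minimal sketch of one CD-style conceptualization encoded as a plain Python dictionary. ATRANS (transfer of possession) and the actor/object/recipient/donor roles follow standard presentations of CD theory; the dictionary encoding itself is our illustration, not something taken from the paper.

```python
# A CD-style conceptualization for "John gave Mary a book".
# ATRANS (abstract transfer of possession) is one of the 12 primitive ACTs;
# the slot names follow common presentations of CD, the dict encoding is ours.
give_event = {
    "act": "ATRANS",
    "actor": "John",
    "object": "book",
    "recipient": "Mary",
    "donor": "John",
}
print(give_event["act"])  # ATRANS
```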
26 As all words can be derived from verbal roots, we can say that words in a natural language are either activities (verbs) or derived from some activity (nouns). [sent-48, score-0.064]
27 century BC) put forward the bhāva-based definition to define all types of verbs. [sent-55, score-0.12]
28 1 (Sarup, 1920), the characteristic that defines a verb form is having bhāva as its principal meaning. [sent-57, score-0.404]
29 In Sanskrit, bhāva is a morphological form of bhavati, and bhavati means 'happening'. [sent-58, score-0.242]
30 So the structure of bhāva can be defined as the structure of happening, which is explained in section 4. [sent-59, score-0.093]
31 2 (Sarup, 1920), there are 6 variants of bhāva or verb which, we believe, can be compared to 6 fundamental processes. [sent-62, score-0.202]
32 1 Form of verb Happening is formally conceived as punctuation between two discrete states in a context. [sent-67, score-0.255]
33 Since every happening consists of minimally two different states, there is an atomic sense of movement in it. [sent-68, score-0.262]
34 Movement means that whenever an action takes place, two states come into existence. [sent-69, score-0.117]
35 These are the initial state, at the beginning of the action, and the final state, after the completion of the action. [sent-70, score-0.064]
36 Time is an inseparable part of this structure because between the initial and final states there can be any number of intermediate states, which are sequential. [sent-72, score-0.106]
37 According to Bhartṛihari (5th century CE) every verb has ... 2 The Vaiśeṣika ontology, due to Kaṇāda (Rensink, 2004), Praśastapāda (Hutton, 2010) and Udayana (Kaṇāda, 1986), has been formalized by Navjyoti (Tavva and Singh, 2010). [sent-74, score-0.235]
38 Hence, every verb projects a 'sense of happening', making this sense omnipresent in all verbs. [sent-76, score-0.304]
39 Our original contribution is that we have defined an ontological structure (see Figure 1) to represent the 'universal verb' and have used it to represent the seven primary verb senses (primitives) which we have identified. [sent-82, score-0.744]
40 All verbs in a language can be represented formally using this structure. [sent-83, score-0.334]
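As a reading aid, the 'structure of happening' described above can be sketched as a small data structure; since Figure 1 is not reproduced here, the field names and the example instance below are our assumptions, following only the prose description.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Happening:
    """Structure of 'happening' (bhāva): an initial state, a final state,
    and a time-ordered sequence of intermediate states between them.
    Field names are ours; the shape follows the prose description above."""
    initial_state: str
    final_state: str
    intermediate_states: List[str] = field(default_factory=list)  # sequential

# A hypothetical instance for the verb 'fall':
fall = Happening(
    initial_state="object at a higher position",
    final_state="object at a lower position",
    intermediate_states=["object descending under gravity"],
)
print(fall.final_state)
```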
41 2 Identifying Overlapping Verbal Senses Can we have a small number of primitive meaning senses whose permutation/combination will enable us to explain all meanings in a language? [sent-85, score-0.819]
42 Primitive verb senses in language were identified using an approach similar to Lesk's method (Lesk, 1986) of finding meaning overlaps for solving the Word Sense Disambiguation problem. [sent-86, score-0.582]
43 All verbs and definitions of all senses of each verb in Sanskrit (2500) and 3750 verbs in English were collected. [sent-87, score-1.117]
44 The verb senses were collected from various on-line dictionaries in both the languages. [sent-88, score-0.464]
45 From these definitions, verbs which are used to explicate defined verbs were identified. [sent-89, score-0.789]
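The identification step can be sketched as a simple counting procedure over dictionary glosses: count how often each known verb appears in the definitions of other verbs, and take the most frequent defining verbs as candidate primitives. The tokenization, the function name, and the toy glosses below are our assumptions, not the paper's code or data.

```python
from collections import Counter

def candidate_primitives(definitions, verb_lemmas, top_n=10):
    """Count how often each known verb is used to explicate other verbs.

    definitions: dict mapping a verb to a list of its sense definitions.
    verb_lemmas: set of verb lemmas known to the lexicon.
    Returns the top_n most frequent defining verbs -- candidates for
    overlapping primitive senses ('puncts').
    """
    counts = Counter()
    for verb, senses in definitions.items():
        for gloss in senses:
            for token in gloss.lower().split():
                token = token.strip(".,;()'\"")
                # A token counts only if it is itself a verb in the lexicon
                # and is not the verb being defined.
                if token in verb_lemmas and token != verb:
                    counts[token] += 1
    return counts.most_common(top_n)

# Toy illustration with invented glosses (not the paper's data):
defs = {
    "fall":    ["move downward suddenly", "drop under gravity"],
    "plummet": ["fall or move straight down very fast"],
    "flow":    ["move steadily in a stream"],
}
verbs = {"fall", "plummet", "flow", "move", "drop"}
print(candidate_primitives(defs, verbs))   # 'move' dominates -> primitive sense
```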
46 4 The formalization in NVFO is based on the idea of an ontological form which is recursive. [sent-91, score-0.119]
47 Identifying frequent verbs is explained using a sample verb 'fall': definitions of different verb senses of 'fall' from two different sources are given below: Source 1 (Dictionary. [sent-94, score-0.952]
48 Since movement is the most common concept, ‘move’ is taken as an overlapping primitive verb sense. [sent-98, score-0.597]
49 Other primitives like know, do, is, have, cut, and cover were obtained by similar procedure. [sent-99, score-0.259]
50 In dictionaries, the overlapping verb senses used to explicate the meaning of defined verbs show the relatedness of two verbs. [sent-100, score-0.803]
51 In WordNet, the existence of the most frequently used verbs is represented through 8 'common verbs' (Miller et al. [sent-102, score-0.394]
52 We have modified the 'common verbs' concept of WordNet to include the concept of verbiality, the ability to denote a process developing in time (Lyudmila, 2010). [sent-105, score-0.096]
53 To analyze the phenomenon of overlapping meanings of verbs, we studied verbs from a database of 3750 verbs and two other lexical resources: WordNet and the Webster English Dictionary. [sent-106, score-0.864]
54 From the word frequencies of the verbs in these three resources, we calculated the percentages5 of overlapping verb senses used to explicate the meaning of defined verbs. [sent-107, score-1.137]
55 Total verbs (unique word forms) in the three resources are listed below. 5 Percentage is calculated taking the frequency of a verb w.r.t. ... [sent-109, score-0.536]
56 Our database: 3750; Webster Dictionary (Morehead, 2001): 1928; WordNet (Princeton University): 3400. Percentages of overlapping atomic meanings used to explicate the meaning of defined verbs in the three resources are shown in Table 1. [sent-112, score-0.784]
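A hedged sketch of the percentage computation referred to in footnotes 5 and 6 follows; since the footnote text is truncated in this copy, the normalization by the total verb count of each resource is our assumption, and the counts in the example are invented.

```python
def overlap_percentages(defining_counts, total_verbs):
    """Share of defined verbs whose glosses use each overlapping verb sense.

    defining_counts: how many defined verbs each candidate primitive helps
    to explicate; total_verbs: size of the resource (e.g. 3750 for our
    database).  Because one verb can be explicated by more than one
    overlapping component, the values need not sum to 100 (footnote 6).
    """
    return {verb: 100.0 * count / total_verbs
            for verb, count in defining_counts.items()}

# Illustrative (invented) counts against the 3750-verb database:
print(overlap_percentages({"move": 1200, "do": 900, "is": 750}, 3750))
```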
57 When verbs and their definitions in English were analyzed, it was found that basic verb senses like 'know', 'do', 'have', 'move', 'is', 'cut', and 'cover' have higher frequencies. [sent-114, score-0.783]
58 The occurrence of higher frequencies of some verbs indicated that those were the verbs with maximum meaning sense overlap with other verbs. [sent-115, score-0.903]
59 The percentage of coverage of these seven primitives in Sanskrit and English is given in Table 2. [sent-117, score-0.366]
60 (Table 2 caption: senses (puncts) in English & Sanskrit.) Using this set of 7 'puncts' it is possible to express the meaning inherent in verbs in a language and also to link the related verbs across languages. [sent-118, score-1.015]
61 We will explain this by a deeper analysis of the seven 'puncts' (see Table 3). [sent-119, score-0.107]
62 The 'punct' can be used for identifying similarities between verbs like 'fall', 'plummet', 'flow', all of which have 'move' as their primary sense, and for finding out different senses of the same verb, like 'break'. [sent-120, score-0.954]
63 Thus 'break' can have a primary sense of 'cut' and a secondary sense of 'do' when the meaning is 'to destroy or stop or interrupt or cause something to separate something'. [sent-121, score-0.549]
64 3 The Seven Puncts In order to handle similarities and overlaps in meaning we have developed the concept of overlapping verbal sense or 'punct'. [sent-122, score-0.48]
65 These primitive verbal senses are intended to be building blocks out of which meaning of verbs can be constructed. [sent-123, score-0.94]
66 Two works WordNet (8 common verbs) and Nirukta (6 fundamental processes) were influential in restricting the number of overlapping verb senses to 7. [sent-125, score-0.549]
67 We have modified the 8 common verbs in WordNet (have, be, get, set, make, do, run, take) in a way that each primitive meaning sense can be represented as a combination of ‘state’ and ‘change’. [sent-126, score-0.764]
68 Concepts like exist and un-exist, join and un-join, know and un-know, do and un-do, ascribing some actions to some objects and un-ascribe, movement / change and possess and un-possess are the basic meaning senses we have identified. [sent-127, score-0.446]
69 Each primitive meaning sense consists of a sense and its negation. [sent-129, score-0.532]
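Read literally, the sentences above suggest a simple encoding of the seven puncts as pairs of a sense and its negation. The pairing of punct names with the concept pairs below is our interpretation of the prose, not a table taken from the paper, and should be treated as an assumption.

```python
# Each punct as (sense, negation), following the exist/un-exist, join/un-join,
# ... list above.  Which concept pair goes with which punct name is our
# reading of the text and is an assumption.
PUNCTS = {
    "is":    ("exist",   "un-exist"),
    "cut":   ("join",    "un-join"),      # separating vs. joining
    "know":  ("know",    "un-know"),
    "do":    ("do",      "un-do"),
    "cover": ("ascribe", "un-ascribe"),   # ascribing actions/attributes to objects
    "move":  ("state",   "change"),       # movement as change between states
    "have":  ("possess", "un-possess"),
}
assert len(PUNCTS) == 7
```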
70 We have seen that verbs across languages can be related through these overlapping senses. [sent-130, score-0.391]
71 Similarly, 'break' can also have 'move' as its primary sense and 'is' as its secondary sense when the meaning is 'voice change of a person or day or dawn break or breaking news'. [sent-131, score-0.549]
72 Though a verb can have two to all seven verbal senses, we are grouping verbs looking at just the primary and secondary verb senses. [sent-132, score-1.121]
73 Once they are classified according to their primary and secondary meanings, we put verbs in groups, say all verbs having 'move' as primary sense and 'do' as secondary sense in one group (see Table 3). [sent-134, score-1.195]
74 6 A verb can be explicated by more than one verb (overlapping meaning component); hence the total of the percentages of the verbs which have been identified as the overlapping components is not 100. [sent-135, score-0.703]
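The grouping by primary and secondary senses can be sketched as keying a verb lexicon on the (primary, secondary) pair; the sample annotations below are illustrative guesses based on the 'fall'/'plummet'/'break' discussion above, not the paper's annotated data.

```python
from collections import defaultdict

# Hypothetical (primary, secondary) punct annotations for a few verbs,
# loosely following the 'fall'/'plummet'/'break' discussion above.
ANNOTATED = {
    "fall":    ("move", "do"),
    "plummet": ("move", "do"),
    "flow":    ("move", "is"),
    "break":   ("cut",  "do"),   # the 'destroy / cause to separate' reading
}

def group_by_senses(annotated):
    """Group verbs that share both their primary and secondary punct."""
    groups = defaultdict(list)
    for verb, key in annotated.items():
        groups[key].append(verb)
    return dict(groups)

print(group_by_senses(ANNOTATED))
# e.g. {('move', 'do'): ['fall', 'plummet'], ('move', 'is'): ['flow'], ('cut', 'do'): ['break']}
```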
75 We have observed that there is at least one ontological attribute which makes each word different from the other. [sent-139, score-0.119]
76 They are called ontological attributes as they are concepts like space, time, manner, reason and sub-features like direction-linear, source, destination, effect etc. [sent-140, score-0.212]
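Putting the two preceding sentences together, a single lexicon entry would combine the verb's punct senses with its ontological attributes and sub-features. The record below is a hypothetical sketch; the attribute values chosen for 'fall' are illustrative only and not taken from the paper's data.

```python
# A hypothetical feature record for one verb, combining its punct senses
# with ontological attributes (space, time, manner, reason) and
# sub-features (direction, source, destination, effect).
fall_entry = {
    "verb": "fall",
    "primary_sense": "move",
    "secondary_sense": "do",
    "attributes": {
        "space": {"direction": "linear-downward",
                  "source": "higher position",
                  "destination": "lower position"},
        "manner": "sudden",
        "reason": "gravity",
        "effect": "change of position",
    },
}
print(fall_entry["primary_sense"])
```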
77 5 Comparison of primitives A comparison of the primitives of CD theory and of our approach is given in Table 4. [sent-146, score-0.58]
78 Corresponding to each ACT of CD theory, the explanation and the puncts in order of priority of meaning senses are given in Table 4. [sent-147, score-0.409]
79 6 Issue and Solution The uniform identification of verb sense means identifying the most general sense attached to a verb, as done by an ordinary person. [sent-149, score-0.406]
80 One can see that more than one verb can be used to explicate the meaning of a verb and there is an order in which the verbs are used. [sent-150, score-0.992]
81 This order helps in finding the primary, secondary and tertiary meaning senses. [sent-151, score-0.291]
82 The order is found by nominalizing verbs in a simple sentence. [sent-152, score-0.334]
83 This method helps in resolving inconsistencies, if any, while identifying meaning senses. [sent-153, score-0.133]
84 For example: 'you confuse me' → 'you create {confusion in me}' → 'You create {{confused (state of knowledge) about something (object of knowledge)} in me}' → '{You do creation of} {{confused (state of knowledge) about something (object of knowledge)} in me}'. [sent-154, score-0.148]
85 In the last sentence, 'do' is the tertiary sense, 'know' is the secondary sense and 'is {state of knowledge}' is the primary sense of the verb 'confuse'. [sent-155, score-0.666]
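The ordering step in the 'confuse' example can be written down as a tiny helper. The convention that the innermost verb of the nominalized paraphrase is the primary sense is taken from the example above; the function itself and its name are only our sketch.

```python
def sense_order(outermost_to_innermost):
    """Rank the verbs of a nominalized paraphrase: innermost = primary,
    next = secondary, outermost = tertiary (as in the 'confuse' example)."""
    ranks = ["primary", "secondary", "tertiary"]
    innermost_first = list(reversed(outermost_to_innermost))
    return list(zip(ranks, innermost_first))

# "you confuse me" -> "{You do creation of} {{confused (state of knowledge)} in me}"
# paraphrase verbs from outermost to innermost: do, know, is
print(sense_order(["do", "know", "is"]))
# [('primary', 'is'), ('secondary', 'know'), ('tertiary', 'do')]
```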
86 The seven verb senses thus identified are the building blocks out of which meanings of verbs are constructed. [sent-156, score-0.953]
87 The primary and secondary senses of all verbs in English and Sanskrit were identified. [sent-157, score-0.76]
88 For English verbs, the entire verb list (3750) compiled by Levin (Levin, 1993), including extensions (Dang et al. [sent-158, score-0.202]
89 For Sanskrit verbs, data (more than 3000 verbs (Sanskrit dhātu7) including variations in accentuation) was collected from various resources (Palsule, 1955; Palsule, 1961; Liebich, 1922; Varma, 1953; Kale, 1961; Apte, 1998; Williams, 2008; Capeller, 1891). [sent-161, score-0.334]
90 The annotation process was to identify the primary and secondary meaning senses of all verbs and ontological attributes of verbs in 7 groups (all verbs with the same primary verb senses formed one group). [sent-163, score-2.251]
91 The annotation of verbs was done for four languages: Sanskrit, English, Hindi and Telugu. [sent-164, score-0.334]
92 Based on this classification the verb groups formed have exhibited similarity in syntactic and semantic behavior. [sent-171, score-0.202]
93 The pattern of Stanford dependency relations formed among verbs of the same group showed a similarity of 60%. [sent-172, score-0.334]
94 If two sentences have the same meaning they must have similar representations regardless of the words used. [sent-176, score-0.133]
95 There is a conceptual base underlying all natural languages. [sent-180, score-0.228]
96 'Punct' is a mathematical representation of conceptual base in terms of state and change which can be used for computational purpose. [sent-183, score-0.279]
97 Identification of overlapping verbal sense enables a classification based on meaning. [sent-184, score-0.299]
98 Verbal sense identification along with feature space which includes ontological attributes can give a better classification and understanding of verbs and their behavior. [sent-185, score-0.608]
99 Investigating regular sense extensions based on intersective Levin classes. [sent-229, score-0.102]
100 Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. [sent-295, score-0.102]
wordName wordTfidf (topN-words)
[('sanskrit', 0.354), ('verbs', 0.334), ('primitives', 0.259), ('schank', 0.217), ('senses', 0.214), ('verb', 0.202), ('primitive', 0.195), ('conceptual', 0.184), ('bh', 0.171), ('cd', 0.134), ('meaning', 0.133), ('overlapping', 0.133), ('bhavati', 0.121), ('explicate', 0.121), ('ontological', 0.119), ('secondary', 0.11), ('seven', 0.107), ('sense', 0.102), ('primary', 0.102), ('nirukta', 0.097), ('sika', 0.097), ('happening', 0.093), ('va', 0.089), ('kriy', 0.072), ('puncts', 0.072), ('vais', 0.072), ('roger', 0.07), ('theories', 0.069), ('movement', 0.067), ('indian', 0.066), ('verbal', 0.064), ('action', 0.064), ('meanings', 0.063), ('hindi', 0.062), ('theory', 0.062), ('existence', 0.06), ('suddenly', 0.059), ('webster', 0.059), ('claim', 0.058), ('something', 0.057), ('kipper', 0.056), ('ka', 0.055), ('attributes', 0.053), ('states', 0.053), ('state', 0.051), ('dh', 0.051), ('tradition', 0.049), ('put', 0.048), ('balkrishna', 0.048), ('breathe', 0.048), ('erect', 0.048), ('gajanan', 0.048), ('gravity', 0.048), ('jali', 0.048), ('kale', 0.048), ('kunjunni', 0.048), ('lyudmila', 0.048), ('motilal', 0.048), ('nvfo', 0.048), ('palsule', 0.048), ('sarup', 0.048), ('tavva', 0.048), ('tertiary', 0.048), ('verse', 0.048), ('wierzbicka', 0.048), ('yate', 0.048), ('concept', 0.048), ('dictionaries', 0.048), ('wordnet', 0.046), ('ya', 0.044), ('base', 0.044), ('kavitha', 0.043), ('apte', 0.043), ('levin', 0.042), ('da', 0.04), ('concepts', 0.04), ('forward', 0.039), ('delhi', 0.039), ('descend', 0.039), ('primes', 0.039), ('underlies', 0.039), ('formal', 0.038), ('move', 0.037), ('premise', 0.035), ('sage', 0.034), ('confuse', 0.034), ('karin', 0.034), ('actors', 0.034), ('acts', 0.033), ('ontology', 0.033), ('definitions', 0.033), ('identified', 0.033), ('mel', 0.033), ('century', 0.033), ('lesk', 0.033), ('actions', 0.032), ('wilks', 0.031), ('korhonen', 0.031), ('miller', 0.031), ('representational', 0.03)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999976 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
Author: Kavitha Rajan
Abstract: Natural language can be easily understood by everyone irrespective of their differences in age or region or qualification. The existence of a conceptual base that underlies all natural languages is an accepted claim as pointed out by Schank in his Conceptual Dependency (CD) theory. Inspired by the CD theory and theories in Indian grammatical tradition, we propose a new set of meaning primitives in this paper. We claim that this new set of primitives captures the meaning inherent in verbs and help in forming an inter-lingual and computable ontological classification of verbs. We have identified seven primitive overlapping verb senses which substantiate our claim. The percentage of coverage of these primitives is 100% for all verbs in Sanskrit and Hindi and 3750 verbs in English. 1
2 0.15464245 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
Author: Mohammad Taher Pilehvar ; David Jurgens ; Roberto Navigli
Abstract: Semantic similarity is an essential component of many Natural Language Processing applications. However, prior methods for computing semantic similarity often operate at different levels, e.g., single words or entire documents, which requires adapting the method for each data type. We present a unified approach to semantic similarity that operates at multiple levels, all the way from comparing word senses to comparing text documents. Our method leverages a common probabilistic representation over word senses in order to compare different types of linguistic data. This unified representation shows state-ofthe-art performance on three tasks: seman- tic textual similarity, word similarity, and word sense coarsening.
3 0.13711676 116 acl-2013-Detecting Metaphor by Contextual Analogy
Author: Eirini Florou
Abstract: As one of the most challenging issues in NLP, metaphor identification and its interpretation have seen many models and methods proposed. This paper presents a study on metaphor identification based on the semantic similarity between literal and non literal meanings of words that can appear at the same context.
4 0.13164936 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult, because of high degree of polysemy, too fine grained senses, absence of deep verb hierarchy and low inter annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, specially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs with accuracy level at 25-38%. We suggest a modifica- tion to this mentioned formulation, using context and semantic relatedness of neighboring words. An improvement of 17% 35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context aware expectation maximization.
5 0.10654249 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
Author: Silvana Hartmann ; Iryna Gurevych
Abstract: We present a new bilingual FrameNet lexicon for English and German. It is created through a simple, but powerful approach to construct a FrameNet in any language using Wiktionary as an interlingual representation. Our approach is based on a sense alignment of FrameNet and Wiktionary, and subsequent translation disambiguation into the target language. We perform a detailed evaluation of the created resource and a discussion of Wiktionary as an interlingual connection for the cross-language transfer of lexicalsemantic resources. The created resource is publicly available at http : / /www . ukp .tu-darmst adt .de / fnwkde / .
6 0.094884656 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
7 0.092440829 186 acl-2013-Identifying English and Hungarian Light Verb Constructions: A Contrastive Approach
8 0.091212161 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
9 0.089678138 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
10 0.085681126 8 acl-2013-A Learner Corpus-based Approach to Verb Suggestion for ESL
11 0.085612372 58 acl-2013-Automated Collocation Suggestion for Japanese Second Language Learners
12 0.0841107 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
13 0.083150387 286 acl-2013-Psycholinguistically Motivated Computational Models on the Organization and Processing of Morphologically Complex Words
14 0.081243351 119 acl-2013-Diathesis alternation approximation for verb clustering
15 0.078339279 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
16 0.077100649 198 acl-2013-IndoNet: A Multilingual Lexical Knowledge Network for Indian Languages
17 0.074094959 234 acl-2013-Linking and Extending an Open Multilingual Wordnet
18 0.06895873 53 acl-2013-Annotation of regular polysemy and underspecification
19 0.06268087 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation
20 0.05930233 345 acl-2013-The Haves and the Have-Nots: Leveraging Unlabelled Corpora for Sentiment Analysis
topicId topicWeight
[(0, 0.141), (1, 0.04), (2, -0.006), (3, -0.121), (4, -0.087), (5, -0.134), (6, -0.088), (7, 0.07), (8, 0.067), (9, -0.029), (10, -0.063), (11, 0.052), (12, -0.076), (13, -0.027), (14, 0.035), (15, -0.019), (16, -0.012), (17, 0.031), (18, 0.007), (19, -0.024), (20, 0.041), (21, -0.023), (22, 0.103), (23, -0.076), (24, 0.127), (25, -0.071), (26, -0.072), (27, -0.066), (28, 0.018), (29, 0.034), (30, 0.127), (31, -0.051), (32, -0.009), (33, -0.023), (34, -0.083), (35, -0.055), (36, -0.029), (37, -0.055), (38, 0.053), (39, -0.007), (40, -0.028), (41, -0.084), (42, 0.015), (43, 0.058), (44, 0.002), (45, 0.008), (46, 0.04), (47, 0.013), (48, -0.094), (49, 0.008)]
simIndex simValue paperId paperTitle
same-paper 1 0.97782534 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
Author: Kavitha Rajan
Abstract: Natural language can be easily understood by everyone irrespective of their differences in age or region or qualification. The existence of a conceptual base that underlies all natural languages is an accepted claim as pointed out by Schank in his Conceptual Dependency (CD) theory. Inspired by the CD theory and theories in Indian grammatical tradition, we propose a new set of meaning primitives in this paper. We claim that this new set of primitives captures the meaning inherent in verbs and help in forming an inter-lingual and computable ontological classification of verbs. We have identified seven primitive overlapping verb senses which substantiate our claim. The percentage of coverage of these primitives is 100% for all verbs in Sanskrit and Hindi and 3750 verbs in English. 1
2 0.72174346 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
Author: Jesse Dunietz ; Lori Levin ; Jaime Carbonell
Abstract: Lexical resources such as WordNet and VerbNet are widely used in a multitude of NLP tasks, as are annotated corpora such as treebanks. Often, the resources are used as-is, without question or examination. This practice risks missing significant performance gains and even entire techniques. This paper addresses the importance of resource quality through the lens of a challenging NLP task: detecting selectional preference violations. We present DAVID, a simple, lexical resource-based preference violation detector. With asis lexical resources, DAVID achieves an F1-measure of just 28.27%. When the resource entries and parser outputs for a small sample are corrected, however, the F1-measure on that sample jumps from 40% to 61.54%, and performance on other examples rises, suggesting that the algorithm becomes practical given refined resources. More broadly, this paper shows that resource quality matters tremendously, sometimes even more than algorithmic improvements.
3 0.69138467 186 acl-2013-Identifying English and Hungarian Light Verb Constructions: A Contrastive Approach
Author: Veronika Vincze ; Istvan Nagy T. ; Richard Farkas
Abstract: Here, we introduce a machine learningbased approach that allows us to identify light verb constructions (LVCs) in Hungarian and English free texts. We also present the results of our experiments on the SzegedParalellFX English–Hungarian parallel corpus where LVCs were manually annotated in both languages. With our approach, we were able to contrast the performance of our method and define language-specific features for these typologically different languages. Our presented method proved to be sufficiently robust as it achieved approximately the same scores on the two typologically different languages.
4 0.68444932 258 acl-2013-Neighbors Help: Bilingual Unsupervised WSD Using Context
Author: Sudha Bhingardive ; Samiulla Shaikh ; Pushpak Bhattacharyya
Abstract: Word Sense Disambiguation (WSD) is one of the toughest problems in NLP, and in WSD, verb disambiguation has proved to be extremely difficult, because of high degree of polysemy, too fine grained senses, absence of deep verb hierarchy and low inter annotator agreement in verb sense annotation. Unsupervised WSD has received widespread attention, but has performed poorly, specially on verbs. Recently an unsupervised bilingual EM based algorithm has been proposed, which makes use only of the raw counts of the translations in comparable corpora (Marathi and Hindi). But the performance of this approach is poor on verbs with accuracy level at 25-38%. We suggest a modifica- tion to this mentioned formulation, using context and semantic relatedness of neighboring words. An improvement of 17% 35% in the accuracy of verb WSD is obtained compared to the existing EM based approach. On a general note, the work can be looked upon as contributing to the framework of unsupervised WSD through context aware expectation maximization.
5 0.65026534 53 acl-2013-Annotation of regular polysemy and underspecification
Author: Hector Martinez Alonso ; Bolette Sandford Pedersen ; Nuria Bel
Abstract: We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations.
6 0.64746928 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
7 0.64044851 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
8 0.63305575 119 acl-2013-Diathesis alternation approximation for verb clustering
9 0.6303044 116 acl-2013-Detecting Metaphor by Contextual Analogy
10 0.6001339 234 acl-2013-Linking and Extending an Open Multilingual Wordnet
11 0.58634996 111 acl-2013-Density Maximization in Context-Sense Metric Space for All-words WSD
12 0.56298894 302 acl-2013-Robust Automated Natural Language Processing with Multiword Expressions and Collocations
13 0.55943102 198 acl-2013-IndoNet: A Multilingual Lexical Knowledge Network for Indian Languages
14 0.55116296 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
16 0.51815957 105 acl-2013-DKPro WSD: A Generalized UIMA-based Framework for Word Sense Disambiguation
17 0.51273835 316 acl-2013-SenseSpotting: Never let your parallel data tie you to an old domain
18 0.51183152 43 acl-2013-Align, Disambiguate and Walk: A Unified Approach for Measuring Semantic Similarity
19 0.50160509 62 acl-2013-Automatic Term Ambiguity Detection
20 0.4806464 58 acl-2013-Automated Collocation Suggestion for Japanese Second Language Learners
topicId topicWeight
[(0, 0.053), (5, 0.277), (6, 0.037), (11, 0.061), (14, 0.013), (15, 0.017), (24, 0.043), (26, 0.047), (35, 0.073), (42, 0.053), (48, 0.062), (70, 0.066), (88, 0.051), (90, 0.01), (95, 0.054)]
simIndex simValue paperId paperTitle
1 0.86314869 88 acl-2013-Computational considerations of comparisons and similes
Author: Vlad Niculae ; Victoria Yaneva
Abstract: This paper presents work in progress towards automatic recognition and classification of comparisons and similes. Among possible applications, we discuss the place of this task in text simplification for readers with Autism Spectrum Disorders (ASD), who are known to have deficits in comprehending figurative language. We propose an approach to comparison recognition through the use of syntactic patterns. Keeping in mind the requirements of autistic readers, we discuss the properties relevant for distinguishing semantic criteria like figurativeness and abstractness.
same-paper 2 0.80386001 366 acl-2013-Understanding Verbs based on Overlapping Verbs Senses
Author: Kavitha Rajan
Abstract: Natural language can be easily understood by everyone irrespective of their differences in age or region or qualification. The existence of a conceptual base that underlies all natural languages is an accepted claim as pointed out by Schank in his Conceptual Dependency (CD) theory. Inspired by the CD theory and theories in Indian grammatical tradition, we propose a new set of meaning primitives in this paper. We claim that this new set of primitives captures the meaning inherent in verbs and help in forming an inter-lingual and computable ontological classification of verbs. We have identified seven primitive overlapping verb senses which substantiate our claim. The percentage of coverage of these primitives is 100% for all verbs in Sanskrit and Hindi and 3750 verbs in English. 1
3 0.75959367 3 acl-2013-A Comparison of Techniques to Automatically Identify Complex Words.
Author: Matthew Shardlow
Abstract: Identifying complex words (CWs) is an important, yet often overlooked, task within lexical simplification (The process of automatically replacing CWs with simpler alternatives). If too many words are identified then substitutions may be made erroneously, leading to a loss of meaning. If too few words are identified then those which impede a user’s understanding may be missed, resulting in a complex final text. This paper addresses the task of evaluating different methods for CW identification. A corpus of sentences with annotated CWs is mined from Simple Wikipedia edit histories, which is then used as the basis for several experiments. Firstly, the corpus design is explained and the results of the validation experiments using human judges are reported. Experiments are carried out into the CW identification techniques of: simplifying everything, frequency thresholding and training a support vector machine. These are based upon previous approaches to the task and show that thresholding does not perform significantly differently to the more na¨ ıve technique of simplifying everything. The support vector machine achieves a slight increase in precision over the other two methods, but at the cost of a dramatic trade off in recall.
4 0.5283255 275 acl-2013-Parsing with Compositional Vector Grammars
Author: Richard Socher ; John Bauer ; Christopher D. Manning ; Ng Andrew Y.
Abstract: Natural language parsing has typically been done with small sets of discrete categories such as NP and VP, but this representation does not capture the full syntactic nor semantic richness of linguistic phrases, and attempts to improve on this by lexicalizing phrases or splitting categories only partly address the problem at the cost of huge feature spaces and sparseness. Instead, we introduce a Compositional Vector Grammar (CVG), which combines PCFGs with a syntactically untied recursive neural network that learns syntactico-semantic, compositional vector representations. The CVG improves the PCFG of the Stanford Parser by 3.8% to obtain an F1 score of 90.4%. It is fast to train and implemented approximately as an efficient reranker it is about 20% faster than the current Stanford factored parser. The CVG learns a soft notion of head words and improves performance on the types of ambiguities that require semantic information such as PP attachments.
5 0.52760679 318 acl-2013-Sentiment Relevance
Author: Christian Scheible ; Hinrich Schutze
Abstract: A number of different notions, including subjectivity, have been proposed for distinguishing parts of documents that convey sentiment from those that do not. We propose a new concept, sentiment relevance, to make this distinction and argue that it better reflects the requirements of sentiment analysis systems. We demonstrate experimentally that sentiment relevance and subjectivity are related, but different. Since no large amount of labeled training data for our new notion of sentiment relevance is available, we investigate two semi-supervised methods for creating sentiment relevance classifiers: a distant supervision approach that leverages structured information about the domain of the reviews; and transfer learning on feature representations based on lexical taxonomies that enables knowledge transfer. We show that both methods learn sentiment relevance classifiers that perform well.
6 0.5219568 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
7 0.52135432 225 acl-2013-Learning to Order Natural Language Texts
8 0.51975113 82 acl-2013-Co-regularizing character-based and word-based models for semi-supervised Chinese word segmentation
9 0.51913166 252 acl-2013-Multigraph Clustering for Unsupervised Coreference Resolution
10 0.51796353 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
11 0.51761472 70 acl-2013-Bilingually-Guided Monolingual Dependency Grammar Induction
12 0.51738733 164 acl-2013-FudanNLP: A Toolkit for Chinese Natural Language Processing
13 0.51730746 155 acl-2013-Fast and Accurate Shift-Reduce Constituent Parsing
14 0.5171774 224 acl-2013-Learning to Extract International Relations from Political Context
15 0.51682293 123 acl-2013-Discriminative Learning with Natural Annotations: Word Segmentation as a Case Study
16 0.51679319 62 acl-2013-Automatic Term Ambiguity Detection
17 0.51614761 249 acl-2013-Models of Semantic Representation with Visual Attributes
18 0.51593447 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation
19 0.5156396 80 acl-2013-Chinese Parsing Exploiting Characters
20 0.51559913 254 acl-2013-Multimodal DBN for Predicting High-Quality Answers in cQA portals