acl acl2010 acl2010-126 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Verena Henrich ; Erhard Hinrichs
Abstract: GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. GernEdiT replaces the earlier development by a more user-friendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. This paper presents all these core functionalities of GernEdiT along with details about its usage and usability.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. [sent-4, score-0.283]
2 GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. [sent-5, score-0.033]
3 The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. [sent-6, score-0.208]
4 GernEdiT replaces the earlier development by a more user-friendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. [sent-7, score-0.172]
5 This paper presents all these core functionalities of GernEdiT along with details about its usage and usability. [sent-8, score-0.089]
6 GernEdiT replaces the traditional GermaNet development based on lexicographer files (Fellbaum, 1998) by a more user-friendly visual tool that supports versioning and collaborative annotation by several lexicographers working in parallel. [sent-10, score-0.492]
7 Furthermore, GernEdiT facilitates internal consistency of the GermaNet data such as appropriate linking of lexical units with synsets, connectedness of the synset graph, and automatic closure among relations and their inverse counterparts. (Affiliation: Erhard Hinrichs, University of Tübingen, Tübingen, Germany.) [sent-11, score-0.683]
8 All these functionalities along with the main aspects of GernEdiT’s usage and usability are presented in this paper. [sent-15, score-0.111]
9 2 The Structure of GermaNet GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English (Fellbaum, 1998). [sent-16, score-0.033]
10 It covers the three word categories of adjectives, nouns, and verbs and partitions the lexical space into a set of concepts that are interlinked by semantic relations. [sent-17, score-0.103]
11 A synset is a set of words (called lexical units) where all the words are taken to have (almost) the same meaning. [sent-19, score-0.375]
12 Thus a synset is a set-representation of the semantic relation of synonymy, which means that it consists of a list of lexical units. [sent-20, score-0.375]
13 There are two types of semantic relations in GermaNet: conceptual and lexical relations. [sent-21, score-0.231]
14 They include relations such as hyperonymy, part-whole relations, entailment, or causation. [sent-25, score-0.058]
15 GermaNet is hierarchically structured in terms of the hyperonymy relation. [sent-26, score-0.034]
16 Antonymy, a pair of opposites, is an example of a lexical relation. [sent-28, score-0.084]
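The structure described above (synsets as sets of lexical units, conceptual relations between synsets, lexical relations between lexical units) can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, IDs, and the sample words (Tier, Hund, kalt, warm) are hypothetical and are not taken from the GermaNet database or from GernEdiT.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LexUnit:
    lex_id: int          # hypothetical identifier
    orth_form: str       # main orthographic form


@dataclass
class Synset:
    synset_id: int       # hypothetical identifier
    lex_units: List[LexUnit] = field(default_factory=list)


@dataclass
class WordnetSketch:
    synsets: List[Synset] = field(default_factory=list)
    # Conceptual relations hold between synsets:
    # (relation name, from synset id, to synset id)
    conceptual: List[Tuple[str, int, int]] = field(default_factory=list)
    # Lexical relations hold between lexical units:
    # (relation name, from lexical unit id, to lexical unit id)
    lexical: List[Tuple[str, int, int]] = field(default_factory=list)


def build_example() -> WordnetSketch:
    """Build a toy network with one conceptual and one lexical relation."""
    animal = Synset(1, [LexUnit(11, "Tier")])
    dog = Synset(2, [LexUnit(21, "Hund")])
    cold = Synset(3, [LexUnit(31, "kalt")])
    warm = Synset(4, [LexUnit(41, "warm")])
    net = WordnetSketch(synsets=[animal, dog, cold, warm])
    # Read as: synset 2 (Hund) has the hyperonym synset 1 (Tier).
    net.conceptual.append(("hyperonymy", dog.synset_id, animal.synset_id))
    # Antonymy links two lexical units, not two synsets.
    net.lexical.append(("antonymy", 31, 41))
    return net
```

Keeping the two relation lists separate mirrors the distinction the text draws: a conceptual relation refers to synset IDs, while a lexical relation refers to lexical unit IDs.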
17 The editor represents an interface to a relational database, where all GermaNet data is stored from now on. [sent-30, score-0.114]
18 1 Motivation The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. [sent-36, score-0.208]
19 This is exactly why GernEdiT was developed: It supports lexicographers who need to access, modify, and extend GermaNet data by providing these functions through simple button clicks, searches, and form editing. [sent-37, score-0.277]
20 These functionalities allow lexicographers, among other things, to find the appropriate place in the hierarchy for the insertion of new synsets and lexical units. [sent-39, score-0.442]
21 Last but not least, GernEdiT facilitates internal consistency and correctness of the linguistic resource and supports versioning and collaborative annotation of GermaNet by several lexicographers working in parallel. [sent-40, score-0.506]
22 2 The Main User Interface Figure 1 illustrates the main user panel of GernEdiT. [sent-42, score-0.137]
23 It shows a Search panel above, two panels for Synsets and Lexical Units in the middle, and four tabs below: a Conceptual Relations Editor, a Graph with Hyperonyms and Hyponyms, a Lexical Relations Editor, and an Examples and Frames tab. (Figure 2: Filtered list of lexical units.) [sent-43, score-0.274]
24 In Figure 1, a search for synsets consisting of lexical units with the word Nuss (German noun for: nut) has been executed. [sent-45, score-0.361]
25 Accordingly, the Synsets panel displays the three resulting synsets that match the search item. [sent-46, score-0.299]
26 Word Category specifies whether a synset is an adjective (adj), a noun (nomen), or a verb (verben), whereas Word Class classifies the synsets into semantic fields. [sent-48, score-0.475]
27 The word class of the selected synset in Figure 1 is Nahrung (German noun for: food). [sent-49, score-0.291]
28 For the selected synset, the paraphrase is: der essbare Kern einer Nuss (German phrase for: the edible kernel of a nut). [sent-52, score-0.291]
29 The column All Orth Forms simply lists all orthographical variants of all its lexical units. [sent-53, score-0.125]
30 Which lexical units are listed in the Lexical Units panel depends on the selected synset in the Synsets panel. [sent-54, score-0.583]
31 Orth Form (short for: orthographic form) represents the correct spelling of a word according to the rules of the Neue Deutsche Rechtschreibung (Rat für deutsche Rechtschreibung, 2006), a recently adopted spelling reform. [sent-56, score-0.329]
32 In our example, the main orthographic form is Nuss. [sent-57, score-0.073]
33 Orth Var may contain an alternative spelling that is admissible according to the Neue Deutsche Rechtschreibung. [sent-58, score-0.063]
34 Old Orth Form represents the main orthographic form prior to the Neue Deutsche Rechtschreibung. [sent-59, score-0.073]
35 This means that Nuß was the correct spelling instead of Nuss before the German spelling reform. [sent-60, score-0.126]
36 The Boolean values Named Entity, Artificial, and Style Marking express further properties of a lexical unit: whether it is a named entity, an artificial concept node, or a stylistic variant. [sent-63, score-0.26]
37 For both the lexical units and the synsets, there are two buttons Use as From and Use as To, which help to add new relations (see the explanation of Figure 3 in section 3. [sent-64, score-0.314]
38 3 Search Functionalities It is possible to search for words or synset database IDs via the search panel (see Figure 1 at the top). [sent-67, score-0.438]
39 The check box Ignore Case offers the possibility of searching without distinguishing between upper and lower case. [sent-68, score-0.057]
40 Apart from the main form Delfin, there is an orthographic variant Delphin. [sent-70, score-0.073]
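A search with an Ignore Case option can be sketched as follows. The sketch assumes that a query is compared against every recorded spelling of a lexical unit (main form, variant, and old form); whether GernEdiT searches all three is an assumption here, and the dictionary keys and function names are hypothetical.

```python
from typing import Dict, List, Optional

Unit = Dict[str, Optional[str]]


def matches(unit: Unit, query: str, ignore_case: bool = True) -> bool:
    """Return True if the query equals any recorded spelling of the unit."""
    # Assumption: all three spelling fields are consulted by the search.
    forms = [unit.get("orth_form"), unit.get("orth_var"), unit.get("old_orth_form")]
    for form in forms:
        if form is None:
            continue
        if ignore_case:
            if form.lower() == query.lower():
                return True
        elif form == query:
            return True
    return False


def search(units: List[Unit], query: str, ignore_case: bool = True) -> List[Unit]:
    """Return all units with at least one spelling matching the query."""
    return [u for u in units if matches(u, query, ignore_case)]


# Sample data using the spellings mentioned in the text.
UNITS: List[Unit] = [
    {"orth_form": "Delfin", "orth_var": "Delphin", "old_orth_form": None},
    {"orth_form": "Nuss", "orth_var": None, "old_orth_form": "Nuß"},
]
```

With this sketch, a lower-case query such as `delphin` finds the unit whose main form is Delfin only when `ignore_case` is left at its default.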
41 Via the file menu, lists of all synsets or lexical units with their properties can be accessed. [sent-73, score-0.361]
42 For example, the lexical units or synsets can be filtered by parts of their orthographical forms. [sent-76, score-0.402]
43 Only verbs that have a frame that contains NN are chosen (see Frame contains check box and corresponding text field). [sent-78, score-0.117]
44 Furthermore, the resulting filtered list is sorted in descending order by their examples (see the little triangle in the Examples header of the result table). [sent-79, score-0.04]
45 The number in the brackets behind the word category in the tab title indicates the count of the filtered lexical units (in this example 193 verbs pass the filter). [sent-80, score-0.308]
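The filtering just described (keep only verbs with a frame containing a given string, sort the result in descending order by example, show the count in the tab title) can be sketched as below. The field names, the frame strings, and the example sentences are invented placeholders for illustration; they are not GermaNet data.

```python
from typing import Dict, List, Optional


def filter_units(units: List[Dict], frame_contains: Optional[str] = None,
                 orth_contains: Optional[str] = None) -> List[Dict]:
    """Keep units passing all active filters, sorted descending by example."""
    kept = []
    for unit in units:
        if orth_contains and orth_contains not in unit["orth_form"]:
            continue
        if frame_contains and not any(frame_contains in f for f in unit["frames"]):
            continue
        kept.append(unit)
    # Descending sort on the example text, as in the result table described above.
    return sorted(kept, key=lambda u: u["example"], reverse=True)


def tab_title(category: str, filtered: List[Dict]) -> str:
    """Word category followed by the number of units that pass the filter."""
    return f"{category} ({len(filtered)})"


# Invented sample verbs; frame codes are placeholders.
VERBS = [
    {"orth_form": "essen", "frames": ["NN.AN"], "example": "Sie isst eine Nuss."},
    {"orth_form": "regnen", "frames": ["NE"], "example": "Es regnet."},
    {"orth_form": "schlafen", "frames": ["NN"], "example": "Das Kind schläft."},
]

result = filter_units(VERBS, frame_contains="NN")
title = tab_title("verben", result)
```

Each filter is optional, so several criteria can be combined in a single pass over the list.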
46 4 Visualization of the Graph Hierarchy There is the possibility to display a graph with all hyperonyms and hyponyms of a selected synset. [sent-82, score-0.181]
47 This is shown in the bottom half of Figure 1 in the tab Graph with Hyperonyms and Hyponyms. [sent-83, score-0.092]
48 The graph in Figure 1 visualizes a part of the hierarchical structure of GermaNet centered around the synset containing Nuss and displays the hyperonyms and hyponyms of this synset up to a certain parameterized depth (in this case depth 2 has been chosen). [sent-84, score-0.819]
49 The Hyperonym Depth chooser allows unfolding the graph to the top up to the preselected depth. [sent-85, score-0.053]
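The depth-limited view of the hierarchy can be sketched as a bounded breadth-first search. This is a minimal sketch, assuming the hierarchy is available as a mapping from each synset to its direct hyperonyms; the synset labels `s1` to `s6`, the function names, and the separate hyponym depth parameter are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def reachable_within(start: str, edges: Dict[str, List[str]], depth: int) -> Set[str]:
    """Breadth-first search that stops expanding nodes at distance `depth`."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    seen.discard(start)
    return seen


def invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn a 'hyperonyms of' mapping into a 'hyponyms of' mapping."""
    inverted: Dict[str, List[str]] = {}
    for child, parents in edges.items():
        for parent in parents:
            inverted.setdefault(parent, []).append(child)
    return inverted


def graph_window(synset: str, hyperonyms_of: Dict[str, List[str]],
                 hyperonym_depth: int, hyponym_depth: int) -> Dict[str, Set[str]]:
    """Collect the hyperonyms and hyponyms shown around the selected synset."""
    return {
        "hyperonyms": reachable_within(synset, hyperonyms_of, hyperonym_depth),
        "hyponyms": reachable_within(synset, invert(hyperonyms_of), hyponym_depth),
    }


# Toy hierarchy: s5 and s6 are hyponyms of s4; s4 -> s3 -> s2 -> s1 going up.
HYPERONYMS_OF = {"s5": ["s4"], "s6": ["s4"], "s4": ["s3"], "s3": ["s2"], "s2": ["s1"]}
window = graph_window("s4", HYPERONYMS_OF, hyperonym_depth=2, hyponym_depth=2)
```

Because only nodes within the chosen depths are collected, the result behaves like the "window" onto the full graph that the text describes.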
50 As it is not possible to visualize the whole GermaNet contents at once, the graph can be seen as a window to GermaNet. [sent-86, score-0.073]
51 A click on any synset node within the graph navigates to that synset. [sent-87, score-0.324]
52 This functionality supports lexicographers especially in finding the appropriate place in the hierarchy for the insertion of new synsets. [sent-88, score-0.383]
53 5 Modifications of Existing Items If the lexicographers’ task is to modify existing synsets or lexical units, this is done by selecting a synset or lexical unit displayed in the Synsets and the Lexical Units panels shown in Figure 1. [sent-90, score-0.796]
54 For example, by clicking in the cell Orth Form, the spelling of a lexical unit can be corrected in case an earlier typo was made. [sent-92, score-0.297]
55 If lexicographers want to edit examples, frames, conceptual, or lexical relations, this is done by choosing the appropriate tab indicated at the bottom of Figure 1. [sent-93, score-0.494]
56 By clicking one of these tabs, the corresponding panel appears below these tabs. [sent-94, score-0.173]
57 In Figure 1 the panel for Graph with Hyperonyms and Hyponyms is displayed. [sent-95, score-0.115]
58 It is possible to edit the examples and frames associated with a lexical unit via the Examples and Frames tab. [sent-96, score-0.259]
59 Frames specify the syntactic valence of a lexical unit. [sent-97, score-0.084]
60 Each frame can have an associated example that indicates a possible usage of the lexical unit for that particular frame. [sent-98, score-0.197]
61 The tab Examples and Frames is thus particularly geared towards the editing of verb entries. [sent-99, score-0.236]
62 By clicking on the tab, all examples and frames of a lexical unit are listed and can then be modified by choosing the appropriate editing buttons. [sent-100, score-0.581]
63 For more information about these editing functions see Henrich and Hinrichs (2010). [sent-101, score-0.144]
64 6 Editing of Relations If lexicographers want to add new conceptual or lexical relations to a synset or a lexical unit, this is done by clicking on the Conceptual Relations Editor or the Lexical Relations Editor shown in Figure 1. [sent-103, score-1.007]
65 Figure 3 shows the panel that appears if the Conceptual Relations Editor has been chosen for the synset containing Nuss. [sent-104, score-0.426]
66 To create a new relation, the lexicographer needs to use the buttons Use as From and Use as To shown in Figure 1. [sent-105, score-0.128]
67 This will insert the ID of the selected synsets from the Synsets panel in the corresponding From or To field in Figure 3. [sent-106, score-0.362]
68 The button Delete ConRel allows deletion of a conceptual relation, if all consistency checks are passed. [sent-107, score-0.267]
69 The Lexical Relations Editor tab supports editing all lexical relations. [sent-108, score-0.365]
70 It is not displayed separately for reasons of space, but it is analogous to the Conceptual Relations Editor tab for editing conceptual relations. [sent-109, score-0.325]
71 When clicking on the button Create Synset, the Lexical Unit Editor (shown in Figure 4, right) pops up. [sent-113, score-0.104]
72 This workflow forces the parallel creation of a lexical unit while creating a synset. [sent-114, score-0.176]
73 8 Consistency Checks GernEdiT facilitates internal consistency through the workflow-oriented design of the editor. [sent-116, score-0.149]
74 It is not possible to create a synset without creating a lexical unit in parallel (as described in section 3. [sent-117, score-0.467]
75 Furthermore, it is not possible to insert a new synset without specifying the place in the GermaNet hierarchy where the new synset should be added. [sent-119, score-0.68]
76 This is achieved by the button Add New Hyponym (see Figure 1) which forces the user to identify the appropriate hyperonym for the new synset to be added. [sent-120, score-0.399]
77 Furthermore, it is not possible to insert a lexical unit without specifying the corresponding synset. [sent-121, score-0.217]
78 On deletion of a synset, all corresponding data such as conceptual relations, lexical units with their lexical relations, frames, and examples, are deleted automatically. [sent-122, score-0.377]
79 Consistency checks also take effect for the table cell editing in the Synsets and Lexical Units panels of the main user interface (see Figure 1). [sent-123, score-0.264]
80 For example, the main orthographic form of a lexical unit may never be empty. [sent-125, score-0.249]
81 All buttons in GernEdiT are enabled only if the corresponding functionalities meet the consistency requirements. [sent-126, score-0.228]
82 For example, if a synset consists of only one lexical unit, it is not possible to delete that lexical unit, and thus the button Delete LexUnit is disabled. [sent-128, score-0.597]
83 Also, if the deletion of a synset or a relation would violate the complete connectedness of the GermaNet graph, it is not possible to delete that synset. [sent-129, score-0.402]
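Two of the checks above can be sketched as guard functions that decide whether a delete button should be enabled. This is a sketch under assumptions: connectedness is interpreted here as connectivity of the graph when relations are treated as undirected edges, which the source does not spell out, and all function names and node labels are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def can_delete_lex_unit(lex_units_in_synset: List[str]) -> bool:
    """A synset must keep at least one lexical unit."""
    return len(lex_units_in_synset) > 1


def is_connected(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> bool:
    """Check connectivity, treating every relation as an undirected edge."""
    node_set: Set[str] = set(nodes)
    if not node_set:
        return True
    adjacent = {n: set() for n in node_set}
    for a, b in edges:
        # Edges touching a node outside node_set are ignored.
        if a in node_set and b in node_set:
            adjacent[a].add(b)
            adjacent[b].add(a)
    start = next(iter(node_set))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacent[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == node_set


def can_delete_synset(nodes: Iterable[str], edges: Iterable[Tuple[str, str]],
                      victim: str) -> bool:
    """Allow deletion only if the remaining graph stays connected."""
    remaining = set(nodes) - {victim}
    return is_connected(remaining, list(edges))


# Toy chain a - b - c: removing the middle node would split the graph.
NODES = ["a", "b", "c"]
EDGES = [("b", "a"), ("c", "b")]
```

In a user interface, the return values of such guards could drive the enabled state of the corresponding buttons, which matches the behaviour described for Delete LexUnit.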
84 9 Further Functionalities There are further functionalities available through the file menu. [sent-131, score-0.089]
85 Besides retrieving the up-to-date statistics of GermaNet, an editing history makes it possible to list all modifications on the GermaNet data, with the information about who made the change and how the modified item looked before. [sent-132, score-0.144]
86 For example, it is possible to export all GermaNet data. [sent-134, score-0.045]
87 It is possible to export the GermaNet contents into XML files, which are then used as an exchange format of GermaNet, or to export a list of all verbs with their corresponding frames and examples. [sent-135, score-0.172]
88 4 Tool Evaluation In order to assess the usefulness of GernEdiT, we conducted in-depth interviews with the GermaNet lexicographers and with the senior researcher who oversees all lexicographic development. [sent-136, score-0.349]
89 At the time of the interview all of these researchers had worked with the tool for about eight months. [sent-137, score-0.067]
90 The initial learning curve for getting familiar with GernEdiT is considerably lower compared to the learning curve required for the traditional development based on lexicographer files. [sent-139, score-0.091]
91 The menu-driven and graphics-based navigation through the GermaNet graph is much easier compared to finding the correct entry point in the purely text-based format of lexicographer files. [sent-141, score-0.169]
92 Lexicographers no longer need to learn the complex specification syntax of the lexicographer files. [sent-143, score-0.068]
93 GernEdiT facilitates automatic checking of internal consistency and correctness of the GermaNet data such as appropriate linking of lexical units with synsets, connectedness of the synset graph, and automatic closure among relations and their inverse counterparts. [sent-146, score-0.785]
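The "automatic closure among relations and their inverse counterparts" can be sketched as follows: whenever a relation with a known inverse is stored, the inverse relation in the opposite direction is added as well. The table of inverse pairs and the tuple layout are assumptions made for this sketch, not GernEdiT's actual implementation.

```python
from typing import Iterable, Set, Tuple

Relation = Tuple[str, str, str]  # (relation name, from id, to id)

# Assumed inverse pairs; only relations listed here get a counterpart.
INVERSE = {"hyperonymy": "hyponymy", "hyponymy": "hyperonymy"}


def close_under_inverse(relations: Iterable[Relation]) -> Set[Relation]:
    """Return the relations plus every missing inverse counterpart."""
    original = list(relations)
    closed: Set[Relation] = set(original)
    for name, source, target in original:
        inverse_name = INVERSE.get(name)
        if inverse_name is not None:
            closed.add((inverse_name, target, source))
    return closed


closed = close_under_inverse([("hyperonymy", "s2", "s1"), ("causation", "s3", "s4")])
```

Relations without an entry in the inverse table, such as the `causation` tuple in the sample call, are passed through unchanged.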
94 Especially for the senior researcher who is responsible for coordinating the GermaNet lexicographers, it is now much easier to trace back changes and to verify who was responsible for them. [sent-152, score-0.056]
95 The collaborative annotation by several lexicographers working in parallel is now easily possible and does not cause any management overhead as before. [sent-154, score-0.255]
96 In sum, the lexicographers of GermaNet gave very positive feedback about the use of GernEdiT and also made smaller suggestions for improving its user-friendliness further. [sent-155, score-0.253]
97 The extremely positive feedback of the GermaNet lexicographers underscores the practical benefits gained by using the GernEdiT tool in practice. [sent-158, score-0.35]
98 In future work, we plan to adapt the tool so that it can be used with wordnets for other languages as well. [sent-160, score-0.067]
99 This would mean that the wordnet data for a given language would have to be stored in a relational database and that the tool itself would have to handle the language-specific data structures of the wordnet in question. [sent-161, score-0.165]
100 Acknowledgements We would like to thank all GermaNet lexicographers for their willingness to experiment with GernEdiT and to be interviewed about their experiences with the tool. [sent-162, score-0.232]
wordName wordTfidf (topN-words)
[('germanet', 0.587), ('gernedit', 0.459), ('synset', 0.291), ('lexicographers', 0.232), ('synsets', 0.184), ('editing', 0.144), ('panel', 0.115), ('orth', 0.104), ('units', 0.093), ('tab', 0.092), ('unit', 0.092), ('deutsche', 0.089), ('functionalities', 0.089), ('conceptual', 0.089), ('hyperonyms', 0.085), ('lexical', 0.084), ('editor', 0.083), ('consistency', 0.079), ('lexicographer', 0.068), ('henrich', 0.068), ('neue', 0.068), ('nuss', 0.068), ('rechtschreibung', 0.068), ('tool', 0.067), ('spelling', 0.063), ('frames', 0.063), ('buttons', 0.06), ('relations', 0.058), ('clicking', 0.058), ('graph', 0.053), ('orthographic', 0.051), ('lexunit', 0.051), ('bingen', 0.048), ('button', 0.046), ('delete', 0.046), ('supports', 0.045), ('export', 0.045), ('verena', 0.045), ('facilitates', 0.044), ('hyponyms', 0.043), ('insert', 0.041), ('orthographical', 0.041), ('erhard', 0.041), ('panels', 0.041), ('var', 0.038), ('connectedness', 0.038), ('hinrichs', 0.038), ('box', 0.034), ('delfin', 0.034), ('hyperonym', 0.034), ('hyperonymy', 0.034), ('tabs', 0.034), ('tuebingen', 0.034), ('versioning', 0.034), ('lexicographic', 0.033), ('click', 0.033), ('wordnet', 0.033), ('id', 0.033), ('database', 0.032), ('hierarchy', 0.032), ('interface', 0.031), ('kunze', 0.03), ('underscores', 0.03), ('researcher', 0.03), ('german', 0.029), ('appropriate', 0.028), ('depth', 0.028), ('prone', 0.027), ('deletion', 0.027), ('internal', 0.026), ('checks', 0.026), ('senior', 0.026), ('place', 0.025), ('format', 0.025), ('nut', 0.024), ('collaborative', 0.023), ('correctness', 0.023), ('rat', 0.023), ('navigation', 0.023), ('check', 0.023), ('traditional', 0.023), ('main', 0.022), ('visualization', 0.022), ('field', 0.022), ('old', 0.022), ('opportunity', 0.021), ('princeton', 0.021), ('functionality', 0.021), ('closure', 0.021), ('uni', 0.021), ('frame', 0.021), ('feedback', 0.021), ('filtered', 0.02), ('contents', 0.02), ('fellbaum', 0.02), ('chosen', 0.02), ('modify', 0.02), ('examples', 
0.02), ('verbs', 0.019), ('add', 0.019)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 126 acl-2010-GernEdiT - The GermaNet Editing Tool
Author: Verena Henrich ; Erhard Hinrichs
Abstract: GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. GernEdiT replaces the earlier development by a more user-friendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. This paper presents all these core functionalities of GernEdiT along with details about its usage and usability.
2 0.16196766 44 acl-2010-BabelNet: Building a Very Large Multilingual Semantic Network
Author: Roberto Navigli ; Simone Paolo Ponzetto
Abstract: In this paper we present BabelNet, a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition Machine Translation is also applied to enrich the resource with lexical information for all languages. We conduct experiments on new and existing gold-standard datasets to show the high quality and coverage of the resource.
3 0.063113786 156 acl-2010-Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems
Author: Simone Paolo Ponzetto ; Roberto Navigli
Abstract: One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguation algorithms compete with state-of-the-art supervised WSD systems in a coarse-grained all-words setting and outperform them on gold-standard domain-specific datasets.
4 0.049999364 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
Author: Barbara McGillivray
Abstract: We present a system that automatically induces Selectional Preferences (SPs) for Latin verbs from two treebanks by using Latin WordNet. Our method overcomes some of the problems connected with data sparseness and the small size of the input corpora. We also suggest a way to evaluate the acquired SPs on unseen events extracted from other Latin corpora.
5 0.048471153 237 acl-2010-Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection
Author: Linlin Li ; Benjamin Roth ; Caroline Sporleder
Abstract: This paper presents a probabilistic model for sense disambiguation which chooses the best sense based on the conditional probability of sense paraphrases given a context. We use a topic model to decompose this conditional probability into two conditional probabilities with latent variables. We propose three different instantiations of the model for solving sense disambiguation problems with different degrees of resource availability. The proposed models are tested on three different tasks: coarse-grained word sense disambiguation, fine-grained word sense disambiguation, and detection of literal vs. nonliteral usages of potentially idiomatic expressions. In all three cases, we outperform state-of-the-art systems either quantitatively or statistically significantly.
6 0.047577903 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet
7 0.045515355 164 acl-2010-Learning Phrase-Based Spelling Error Models from Clickthrough Data
8 0.043386247 259 acl-2010-WebLicht: Web-Based LRT Services for German
9 0.035079442 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
10 0.033465691 27 acl-2010-An Active Learning Approach to Finding Related Terms
11 0.033116519 127 acl-2010-Global Learning of Focused Entailment Graphs
12 0.032760911 121 acl-2010-Generating Entailment Rules from FrameNet
13 0.030212557 94 acl-2010-Edit Tree Distance Alignments for Semantic Role Labelling
14 0.029920736 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs
15 0.02989888 62 acl-2010-Combining Orthogonal Monolingual and Multilingual Sources of Evidence for All Words WSD
16 0.028891437 30 acl-2010-An Open-Source Package for Recognizing Textual Entailment
17 0.026504621 26 acl-2010-All Words Domain Adapted WSD: Finding a Middle Ground between Supervision and Unsupervision
18 0.025877912 85 acl-2010-Detecting Experiences from Weblogs
19 0.025789643 76 acl-2010-Creating Robust Supervised Classifiers via Web-Scale N-Gram Data
20 0.025347203 35 acl-2010-Automated Planning for Situated Natural Language Generation
topicId topicWeight
[(0, -0.077), (1, 0.041), (2, -0.017), (3, -0.022), (4, 0.084), (5, -0.001), (6, 0.041), (7, 0.052), (8, -0.044), (9, -0.001), (10, 0.007), (11, -0.006), (12, -0.027), (13, 0.014), (14, -0.013), (15, 0.001), (16, 0.025), (17, 0.067), (18, 0.029), (19, 0.048), (20, -0.034), (21, -0.007), (22, 0.029), (23, -0.024), (24, 0.058), (25, -0.012), (26, -0.015), (27, 0.039), (28, -0.015), (29, -0.058), (30, 0.011), (31, 0.011), (32, 0.072), (33, -0.06), (34, 0.046), (35, -0.101), (36, -0.026), (37, 0.012), (38, -0.014), (39, -0.0), (40, 0.033), (41, -0.022), (42, -0.03), (43, 0.008), (44, -0.048), (45, 0.012), (46, -0.083), (47, 0.031), (48, -0.099), (49, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.92906487 126 acl-2010-GernEdiT - The GermaNet Editing Tool
Author: Verena Henrich ; Erhard Hinrichs
Abstract: GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. GernEdiT replaces the earlier development by a more user-friendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. This paper presents all these core functionalities of GernEdiT along with details about its usage and usability.
2 0.64075828 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet
Author: Clifton McFate
Abstract: A robust dictionary of semantic frames is an essential element of natural language understanding systems that use ontologies. However, creating lexical resources that accurately capture semantic representations en masse is a persistent problem. Where the sheer amount of content makes hand creation inefficient, computerized approaches often suffer from over generality and difficulty with sense disambiguation. This paper describes a semi-automatic method to create verb semantic frames in the Cyc ontology by converting the information contained in VerbNet into a Cyc usable format. This method captures the differences in meaning between types of verbs, and uses existing connections between WordNet, VerbNet, and Cyc to specify distinctions between individual verbs when available. This method provides 27,909 frames to OpenCyc which currently has none and can be used to extend ResearchCyc as well. We show that these frames lead to a 20% increase in sample sentences parsed over the Research Cyc verb lexicon.
3 0.61496687 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
Author: Barbara McGillivray
Abstract: We present a system that automatically induces Selectional Preferences (SPs) for Latin verbs from two treebanks by using Latin WordNet. Our method overcomes some of the problems connected with data sparseness and the small size of the input corpora. We also suggest a way to evaluate the acquired SPs on unseen events extracted from other Latin corpora.
4 0.60320598 44 acl-2010-BabelNet: Building a Very Large Multilingual Semantic Network
Author: Roberto Navigli ; Simone Paolo Ponzetto
Abstract: In this paper we present BabelNet, a very large, wide-coverage multilingual semantic network. The resource is automatically constructed by means of a methodology that integrates lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition Machine Translation is also applied to enrich the resource with lexical information for all languages. We conduct experiments on new and existing gold-standard datasets to show the high quality and coverage of the resource.
5 0.53766143 156 acl-2010-Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems
Author: Simone Paolo Ponzetto ; Roberto Navigli
Abstract: One of the main obstacles to high-performance Word Sense Disambiguation (WSD) is the knowledge acquisition bottleneck. In this paper, we present a methodology to automatically extend WordNet with large amounts of semantic relations from an encyclopedic resource, namely Wikipedia. We show that, when provided with a vast amount of high-quality semantic relations, simple knowledge-lean disambiguation algorithms compete with state-of-the-art supervised WSD systems in a coarse-grained all-words setting and outperform them on gold-standard domain-specific datasets.
6 0.50231087 85 acl-2010-Detecting Experiences from Weblogs
7 0.45302129 230 acl-2010-The Manually Annotated Sub-Corpus: A Community Resource for and by the People
8 0.44417313 258 acl-2010-Weakly Supervised Learning of Presupposition Relations between Verbs
9 0.43365988 121 acl-2010-Generating Entailment Rules from FrameNet
10 0.4183563 166 acl-2010-Learning Word-Class Lattices for Definition and Hypernym Extraction
11 0.40831324 261 acl-2010-Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
12 0.39965698 226 acl-2010-The Human Language Project: Building a Universal Corpus of the World's Languages
13 0.37868336 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
14 0.32946166 259 acl-2010-WebLicht: Web-Based LRT Services for German
15 0.31114545 139 acl-2010-Identifying Generic Noun Phrases
16 0.30722666 148 acl-2010-Improving the Use of Pseudo-Words for Evaluating Selectional Preferences
17 0.30392957 196 acl-2010-Plot Induction and Evolutionary Search for Story Generation
18 0.30225512 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
19 0.30215389 200 acl-2010-Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
20 0.29988536 165 acl-2010-Learning Script Knowledge with Web Experiments
topicId topicWeight
[(2, 0.013), (14, 0.011), (25, 0.036), (42, 0.021), (59, 0.059), (73, 0.037), (78, 0.024), (83, 0.038), (84, 0.557), (98, 0.075)]
simIndex simValue paperId paperTitle
same-paper 1 0.91749948 126 acl-2010-GernEdiT - The GermaNet Editing Tool
Author: Verena Henrich ; Erhard Hinrichs
Abstract: GernEdiT (short for: GermaNet Editing Tool) offers a graphical interface for the lexicographers and developers of GermaNet to access and modify the underlying GermaNet resource. GermaNet is a lexical-semantic wordnet that is modeled after the Princeton WordNet for English. The traditional lexicographic development of GermaNet was error prone and time-consuming, mainly due to a complex underlying data format and no opportunity of automatic consistency checks. GernEdiT replaces the earlier development by a more user-friendly tool, which facilitates automatic checking of internal consistency and correctness of the linguistic resource. This paper presents all these core functionalities of GernEdiT along with details about its usage and usability.
2 0.89202225 103 acl-2010-Estimating Strictly Piecewise Distributions
Author: Jeffrey Heinz ; James Rogers
Abstract: Strictly Piecewise (SP) languages are a subclass of regular languages which encode certain kinds of long-distance dependencies that are found in natural languages. Like the classes in the Chomsky and Subregular hierarchies, there are many independently converging characterizations of the SP class (Rogers et al., to appear). Here we define SP distributions and show that they can be efficiently estimated from positive data.
3 0.73186404 220 acl-2010-Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Author: Jeff Mitchell ; Mirella Lapata ; Vera Demberg ; Frank Keller
Abstract: The analysis of reading times can provide insights into the processes that underlie language comprehension, with longer reading times indicating greater cognitive load. There is evidence that the language processor is highly predictive, such that prior context allows upcoming linguistic material to be anticipated. Previous work has investigated the contributions of semantic and syntactic contexts in isolation, essentially treating them as independent factors. In this paper we analyze reading times in terms of a single predictive measure which integrates a model of semantic composition with an incremental parser and a language model.
4 0.71043509 216 acl-2010-Starting from Scratch in Semantic Role Labeling
Author: Michael Connor ; Yael Gertner ; Cynthia Fisher ; Dan Roth
Abstract: A fundamental step in sentence comprehension involves assigning semantic roles to sentence constituents. To accomplish this, the listener must parse the sentence, find constituents that are candidate arguments, and assign semantic roles to those constituents. Each step depends on prior lexical and syntactic knowledge. Where do children learning their first languages begin in solving this problem? In this paper we focus on the parsing and argument-identification steps that precede Semantic Role Labeling (SRL) training. We combine a simplified SRL with an unsupervised HMM part of speech tagger, and experiment with psycholinguistically-motivated ways to label clusters resulting from the HMM so that they can be used to parse input for the SRL system. The results show that proposed shallow representations of sentence structure are robust to reductions in parsing accuracy, and that the contribution of alternative representations of sentence structure to successful semantic role labeling varies with the integrity of the parsing and argument-identification stages.
5 0.60783088 18 acl-2010-A Study of Information Retrieval Weighting Schemes for Sentiment Analysis
Author: Georgios Paltoglou ; Mike Thelwall
Abstract: Most sentiment analysis approaches use as baseline a support vector machines (SVM) classifier with binary unigram weights. In this paper, we explore whether more sophisticated feature weighting schemes from Information Retrieval can enhance classification accuracy. We show that variants of the classic tf.idf scheme adapted to sentiment analysis provide significant increases in accuracy, especially when using a sublinear function for term frequency weights and document frequency smoothing. The techniques are tested on a wide selection of data sets and produce the best accuracy to our knowledge.
6 0.55181956 136 acl-2010-How Many Words Is a Picture Worth? Automatic Caption Generation for News Images
7 0.48968685 65 acl-2010-Complexity Metrics in an Incremental Right-Corner Parser
8 0.47744137 59 acl-2010-Cognitively Plausible Models of Human Language Processing
9 0.44477174 217 acl-2010-String Extension Learning
10 0.40444165 13 acl-2010-A Rational Model of Eye Movement Control in Reading
11 0.38092339 66 acl-2010-Compositional Matrix-Space Models of Language
12 0.35370588 175 acl-2010-Models of Metaphor in NLP
13 0.35349429 67 acl-2010-Computing Weakest Readings
14 0.35294688 108 acl-2010-Expanding Verb Coverage in Cyc with VerbNet
15 0.34713298 177 acl-2010-Multilingual Pseudo-Relevance Feedback: Performance Study of Assisting Languages
16 0.34630531 41 acl-2010-Automatic Selectional Preference Acquisition for Latin Verbs
17 0.34030265 162 acl-2010-Learning Common Grammar from Multilingual Corpus
19 0.33940202 235 acl-2010-Tools for Multilingual Grammar-Based Translation on the Web
20 0.33385572 195 acl-2010-Phylogenetic Grammar Induction