acl acl2013 acl2013-265 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Marco Fossati ; Claudio Giuliano ; Sara Tonelli
Abstract: We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time.
Reference: text
sentIndex sentText sentNum sentScore
1 We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. [sent-2, score-0.217]
2 We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. [sent-3, score-1.353]
3 We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time. [sent-4, score-0.219]
4 Existing frame annotation tools, such as Salto (Burchardt et al. [sent-6, score-0.583]
5 , 2002) foresee this two-step approach, in which annotators first select a frame from a large repository of possible frames (1,162 frames are currently listed in the online version of the resource), and then assign the FE labels constrained by the chosen frame to LU dependents. [sent-8, score-1.301]
6 In this paper, we argue that such a workflow shows some redundancy, which can be addressed by radically changing the annotation methodology and performing it in a single step. [sent-9, score-0.256]
7 Our novel annotation approach is also more compliant with the definition of frames proposed in Fillmore (1976): in his seminal work, Fillmore postulated that the meanings of words can be understood on the basis of a semantic frame, i.e. [sent-10, score-0.281]
8 a description of a type of event or entity and the participants in it. [sent-12, score-0.043]
9 This implies that frames can be distinguished one from another on the basis of the participants involved, thus it seems more cognitively plausible to start from the FE annotation to identify the frame expressed in a sentence, and not the contrary. [sent-13, score-0.768]
10 The goal of our methodology is to provide full frame annotation in a single step and in a bottom-up fashion. [sent-14, score-0.646]
11 Instead of choosing the frame first, we focus on FEs and let the frame emerge based on the chosen FEs. [sent-15, score-0.995]
12 We believe this approach complies better with the cognitive activity performed by annotators, while the 2-step methodology is more artificial and introduces some redundancy because part of the annotators’ choices are replicated in the two steps (i.e. [sent-16, score-0.16]
13 in order to assign a frame, annotators implicitly identify the participants also in the first step, even if they are annotated later). [sent-18, score-0.133]
14 Another issue we investigate in this work is how semantic roles should be annotated in a crowdsourcing framework. [sent-19, score-0.168]
15 This task is particularly complex, therefore it is usually performed by expert annotators under the supervision of linguistic experts and lexicographers, as in the case of FrameNet. [sent-20, score-0.163]
16 In NLP, different annotation efforts for encoding semantic roles have been carried out, each applying its own methodology and annotation guidelines (see for instance Ruppenhofer et al. [sent-21, score-0.341]
17 In this work, we present a pilot study in which we assess to what extent role descriptions meant for ‘linguistics experts’ are also suitable for annotators from the crowd. [sent-24, score-0.12]
18 Moreover, we show how a simplified version of these descriptions, less bound to a specific linguistic theory, improves the annotation quality. [sent-25, score-0.188]
19 2 Related work The construction of annotation datasets for NLP tasks via non-expert contributors has been ap... (page 742; Proceedings of the 51st Annual Meeting [...], Sofia, Bulgaria, August 4-9, 2013) [sent-26, score-0.147]
20 , 2009) was meant to gather a corpus with coreference resolution annotations. [sent-33, score-0.057]
21 (2008) described design and evaluation guidelines for five natural language micro-tasks. [sent-35, score-0.036]
22 However, they explicitly chose a set of tasks that could be easily understood by nonexpert contributors, thus leaving the recruitment and training issues open. [sent-36, score-0.033]
23 The semantic role labeling problem has been recently addressed via crowdsourcing by Hong and Baker (2011). [sent-39, score-0.137]
24 Furthermore, Baker (2012) highlighted the crucial role of recruiting people from the crowd in order to bypass the need for linguistics expert annotations. [sent-40, score-0.115]
25 Nevertheless, Hong and Baker (2011) focused on the frame discrimination task, namely selecting the correct frame evoked by a given lemma. [sent-41, score-1.22]
26 Such a task is comparable to word sense disambiguation as per (Snow et al. [sent-42, score-0.058]
27 3 Experiments In this section, we describe the anatomy and discuss the results of the tasks we outsourced to the crowd via the CrowdFlower platform. [sent-44, score-0.068]
28 Cheating risk is minimized by adding gold units, namely data for which the requester already knows the answer. [sent-46, score-0.174]
29 If a worker misses too many gold answers within a given threshold, he or she will be flagged as untrusted and his or her judgments will be automatically discarded. [sent-47, score-0.447]
30 Worker switching effect Depending on their accuracy in providing answers to gold units, workers may switch from a trusted to an untrusted status and vice versa. [sent-48, score-0.647]
31 In practice, a worker submits his or her responses via a web page. [sent-49, score-0.164]
32 Each page contains one gold unit and a variable number of regular units that can be set by the requester during the calibration phase. [sent-50, score-0.284]
33 If a worker moves back to the trusted status, his or her previous contribution is added to the results as free extra judgments. [sent-53, score-0.246]
34 Such phenomenon typically occurs when the complexity of gold units is high enough to induce low agreement in workers’ answers. [sent-54, score-0.105]
35 Thus, the requester has to review gold units and, where appropriate, forgive workers who missed them. [sent-55, score-0.45]
36 This happened extensively in our experiments and is one of the main causes of the overall cost decrease and time increase. [sent-56, score-0.047]
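The gold-unit mechanism and the worker switching effect described above can be sketched as follows. This is a minimal illustration, not CrowdFlower's actual implementation: the `Worker` class, the `collect_results` helper, and the `min_gold_accuracy` threshold of 0.7 are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Tracks one worker's gold-unit record and all submitted judgments."""
    gold_seen: int = 0
    gold_correct: int = 0
    judgments: list = field(default_factory=list)

    def record_page(self, gold_is_correct, page_judgments):
        # Each page carries one gold unit plus some regular units.
        self.gold_seen += 1
        self.gold_correct += int(gold_is_correct)
        self.judgments.extend(page_judgments)

    def is_trusted(self, min_gold_accuracy=0.7):
        # Hypothetical rule: a worker is trusted while the share of correctly
        # answered gold units stays at or above the threshold.
        if self.gold_seen == 0:
            return True
        return self.gold_correct / self.gold_seen >= min_gold_accuracy


def collect_results(workers, min_gold_accuracy=0.7):
    """Keep judgments only from workers who are trusted at collection time."""
    kept = []
    for worker in workers:
        if worker.is_trusted(min_gold_accuracy):
            kept.extend(worker.judgments)
    return kept
```

Because trust is evaluated at collection time, a worker who drops below the threshold and later climbs back contributes all earlier judgments again, which mirrors the "free extra judgments" effect mentioned in the text.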
37 Cost calibration The total cost of a generic crowdsourcing task is naturally bound to a data unit. [sent-57, score-0.206]
38 This represents an issue in most of our experiments, as the number of questions per unit (i. [sent-58, score-0.123]
39 a sentence) varies according to the number of frames and FEs evoked by the LU contained in a sentence. [sent-60, score-0.232]
40 In order to enable cost comparison, for each experiment we need to use the average number of questions per sentence as a multiplier to a constant cost per sentence. [sent-61, score-0.304]
41 We set the payment per working page to 5 $ cents and the number of sentences per page to 3, resulting in 1. [sent-62, score-0.206]
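The cost normalization described above amounts to a single multiplication. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the numbers in the usage note are made-up values, not figures from the experiments.

```python
def average_questions(question_counts):
    """Mean number of questions asked per sentence in one experiment."""
    return sum(question_counts) / len(question_counts)


def comparable_cost(cost_per_sentence, question_counts):
    """Scale a constant per-sentence cost by the average number of questions
    per sentence, so that experiments with different question loads can be
    compared on the same footing."""
    return cost_per_sentence * average_questions(question_counts)
```

For instance, with an assumed constant cost of 2.0 cents per sentence and sentences that carry 1, 2 and 3 questions, `comparable_cost(2.0, [1, 2, 3])` yields 4.0 cents.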
42 1 Assessing task reproducibility and worker behavior change Since our overall goal is to compare the performance of FrameNet annotation using our novel workflow to the performance of the standard, 2-step approach, we first take into account past related works and try to reproduce them. [sent-65, score-0.303]
43 To our knowledge, the only attempt to annotate frame information through crowdsourcing is the one presented in Hong and Baker (2011), which however did not include FE annotation. [sent-66, score-0.588]
44 (a) Workers are invited to read a sentence where a LU is bolded. [sent-68, score-0.071]
45 is combined with the set of frames evoked by the given LU, as well as the None choice. [sent-70, score-0.232]
46 Finally, (c) workers must select the correct frame. [sent-71, score-0.273]
47 A set of example sentences corresponding to each possible frame is provided in the instructions to facilitate workers. [sent-72, score-0.477]
48 As a preliminary study, we wanted to assess to what extent the proposed task could be reproduced and if workers reacted in a comparable way over time. [sent-73, score-0.369]
49 Hong and Baker (2011) did not publish the input datasets, thus we do not know which sentences were used. Table 1: Comparison of the reproduced frame discrimination task as per (Hong and Baker, 2011). [sent-74, score-0.72]
50 Besides, the authors computed accuracy values directly from the results upon a majority vote ground truth. [sent-75, score-0.031]
51 5 expert-annotated sentences as gold-standard data for immediate accuracy computation. [sent-85, score-0.031]
52 For the latter, we only show accuracy values, as the number of sentences was set to a constant value of 18, 2 of which were gold. [sent-87, score-0.031]
53 If we assume that the crowd-based ground truth in 2011 experiments is approximately equivalent to the expert one, workers seem to have reacted in a similar manner compared to Hong and Baker’s values, except for rip. [sent-88, score-0.408]
54 2 General task setting We randomly chose the following LUs among the set of all verbal LUs in FrameNet evoking 2 frames each: disappear. [sent-91, score-0.168]
55 We considered verbal LUs as they usually have more overt arguments in a sentence, so that we were sure to provide workers with enough candidate FEs to annotate. [sent-96, score-0.273]
56 Linguistic tasks in crowdsourcing frameworks are usually decomposed to make them accessible to the crowd. [sent-97, score-0.111]
57 Hence, we set the polysemy of LUs to 2 to ensure that all experiments are executed using the smallest-scale subtask. [sent-98, score-0.031]
58 More frames can then be handled by just replicating the experiments. [sent-99, score-0.142]
59 3 2-step approach After observing that we were able to achieve similar results on the frame discrimination task as in previous work, we focused on the comparison between the 2-step and the 1-step frame annotation approaches. [sent-101, score-1.207]
60 We first set up experiments that emulate the former approach both in frame discrimination and FEs annotation. [sent-102, score-0.624]
61 Given the pipeline nature of the approach, errors in the frame discrimination step will affect FE recognition, thus impacting on the final accuracy. [sent-104, score-0.624]
62 The magnitude of such effect strictly depends on the number of FEs associated with the wrongly detected frame. [sent-105, score-0.033]
63 1 Frame discrimination Frame discrimination is the first phase of the 2-step annotation procedure. [sent-108, score-0.4]
64 The task is modeled as per Sec- Discussion Table 2 gives an insight into the results, which confirm the overall good accuracy as per the experiments discussed in Section 3. [sent-112, score-0.147]
65 2 Frame elements recognition We consider all sentences annotated in the previous subtask with the frame assigned by the workers, even if it is not correct. [sent-116, score-0.504]
66 (a) Workers are invited to read a sentence where a LU is bolded and the frame that was identified in the first step is provided as a title. [sent-118, score-0.574]
67 (b) A list of FE definitions is then shown together with the FEs text chunks. [sent-119, score-0.111]
68 Finally, (c) workers must match each definition with the proper FE. [sent-120, score-0.273]
69 FD stands for Frame Discrimination, FER for FEs Recognition. Original: The conscious entity, generally a person, that performs the intentional action that results in the damage to the Patient. [sent-124, score-0.164]
70 Manually simplified: This element describes the person that performs the intentional action resulting in the damage to another person or object. [sent-125, score-0.263]
71 Automatic system: What that performs the intentional action that results in the damage to the Patient? [sent-126, score-0.164]
72 complex definitions as [...] Avoid variability in FE definitions, try to make them homogeneous (e.g. [sent-131, score-0.111]
73 they should all start with “This element describes. [sent-133, score-0.029]
74 Although these changes (especially the last item) may make FE definitions less precise from one achieved a better accuracy and a lower number of untrusted annotators compared to the others. [sent-138, score-0.381]
75 Therefore, we use the simplified definitions in both the 2-step and the 1-step approach (Section 3. [sent-139, score-0.193]
76 The total number of answers differs from the total number of trusted judgments, since the average value of questions per sentence amounts to 1. [sent-142, score-0.244]
77 2 First of all, we notice an increase in the number of untrusted judgments. [sent-144, score-0.146]
78 This is caused by a generally low inter-worker agreement on gold sentences due to FE definitions, which still present a certain degree of complexity, even after simplification. [sent-145, score-0.046]
79 We inspected the full reports sentence by sentence and observed a propagation of incorrect judgments when a sentence involves an unclear FE definition. [sent-146, score-0.129]
80 As FE definitions may mutually include mentions of other FEs from the same frame, we believe this circularity generated confusion. [sent-147, score-0.111]
81 4 1-step approach Having set the LU polysemy to 2, in our case a sentence S always contains a LU with 2 possible frames (f1, f2), but only conveys one, e. [sent-149, score-0.199]
82 Furthermore, we allow workers to select the None answer. [sent-157, score-0.273]
83 In practice, we ask a total of |E1 ∪ E2| + 2 questions per sentence. [sent-158, score-0.122]
84 In this way, we let the frame directly emerge from the FEs. [sent-159, score-0.041]
85 If workers correctly answer None to a FE definition d ∈ E2, the probability that S evokes f1 increases. [sent-160, score-0.273]
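The question count |E1 ∪ E2| + 2 can be made concrete with a small sketch. It assumes that E1 and E2 are the sets of FE definitions of the two candidate frames; the FE labels in the usage note are hypothetical, and since this extract does not spell out what the two additional questions are, they appear only as the constant `extra`.

```python
def fe_questions(frame1_fes, frame2_fes):
    """FE definitions shown to the worker: the union of both frames' FEs,
    so an FE shared by the two frames is asked only once."""
    return sorted(set(frame1_fes) | set(frame2_fes))


def questions_per_sentence(frame1_fes, frame2_fes, extra=2):
    """Total number of questions for one sentence: |E1 ∪ E2| + extra."""
    return len(fe_questions(frame1_fes, frame2_fes)) + extra
```

With hypothetical sets `{"Agent", "Victim"}` and `{"Agent", "Theme", "Goal"}`, the union holds four definitions and the sentence receives six questions in total.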
86 Modeling Figure 1 displays a screenshot of the worker interface. [sent-161, score-0.191]
87 The sentence Karen threw her arms round my neck, spilling [...] [sent-166, score-0.052]
88 [...] CAUSE MOTION [...] annotate both core FEs, respectively as regular and cross-frame units. Figure 1: 1-step approach worker interface. [sent-172, score-0.164]
89 Instead of precision and recall, we are thus able to directly compute workers’ accuracy upon a majority vote. [sent-174, score-0.031]
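Computing a worker's accuracy against a majority vote can be sketched as below. The data layout (dictionaries keyed by a question identifier) and the tie-breaking behaviour are assumptions made for this illustration; the source does not describe how ties were handled.

```python
from collections import Counter


def majority_vote(answers):
    """Most frequent answer; ties resolve to the answer encountered first."""
    return Counter(answers).most_common(1)[0][0]


def worker_accuracy(worker_answers, all_answers):
    """Share of a worker's answers that agree with the majority vote.

    worker_answers: {question_id: answer given by this worker}
    all_answers:    {question_id: [answers from all workers]}
    """
    truth = {q: majority_vote(a) for q, a in all_answers.items() if a}
    scored = [q for q in worker_answers if q in truth]
    if not scored:
        return 0.0
    correct = sum(worker_answers[q] == truth[q] for q in scored)
    return correct / len(scored)
```

Since every question has exactly one majority answer, a single accuracy figure per worker is enough and no separate precision and recall are needed.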
90 We envision an improvement with respect to the 2-step methodology, as we avoid the proven risk of error propagation originating from wrongly annotated frames in the first step. [sent-175, score-0.255]
91 This demonstrates the hypothesis we stated in Section 1 on the cognitive plausibility of a bottom-up approach for frame annotation. [sent-178, score-0.477]
92 Nevertheless, the cost is considerably higher due to the higher number of questions that need to be addressed, on average 4. [sent-180, score-0.085]
93 The number of untrusted judgments grows considerably, mainly because of the cross-frame gold complexity. [sent-183, score-0.097]
94 Workers seem puzzled by the presence of None, which is a required answer for such units. [sent-184, score-0.056]
95 If we consider the English FrameNet annotation agreement values between experts reported by Padó and Lapata (2009) as the upper bound (i. [sent-185, score-0.159]
96 4 Conclusion In this work, we presented an approach to perform frame annotation with crowdsourcing techniques, based on a single annotation step and on manu- ally simplified FE definitions. [sent-190, score-0.882]
97 Since the results seem promising, we are currently running larger scale experiments with the full set of FrameNet 1. [sent-191, score-0.03]
98 Future work will include the investigation of a frame assignment strategy. [sent-195, score-0.477]
99 Hence, we need a confidence score to determine which frame emerges if workers selected contradictory answers in a subset of cross-frame FE definitions. [sent-197, score-0.79]
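One conceivable shape for such a confidence score is sketched below. This is purely an illustrative assumption and not the authors' strategy, which the text leaves to future work: each frame is scored by the share of non-None answers that fall on its FE definitions. All names and the example data are hypothetical.

```python
def frame_confidence(answers, frame_fes):
    """Score each candidate frame by its share of non-None FE answers.

    answers:   {fe_definition: selected text chunk, or None}
    frame_fes: {frame_name: set of FE definitions belonging to that frame}
    An FE definition shared by several frames counts for each of them.
    """
    votes = {
        frame: sum(1 for fe in fes if answers.get(fe) is not None)
        for frame, fes in frame_fes.items()
    }
    total = sum(votes.values())
    if total == 0:
        # No FE was filled, so no frame can be preferred.
        return {frame: 0.0 for frame in frame_fes}
    return {frame: count / total for frame, count in votes.items()}
```

With hypothetical input where two FEs of `f1` and one FE of `f2` received a text chunk, `f1` scores 2/3 and `f2` scores 1/3, so `f1` would be the frame that emerges.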
100 Divide and conquer: crowdsourcing the creation of cross-lingual textual entailment corpora. [sent-245, score-0.111]
wordName wordTfidf (topN-words)
[('frame', 0.477), ('fes', 0.336), ('workers', 0.273), ('fe', 0.215), ('framenet', 0.176), ('worker', 0.164), ('baker', 0.163), ('chamberlain', 0.155), ('discrimination', 0.147), ('untrusted', 0.146), ('frames', 0.142), ('lus', 0.129), ('hong', 0.122), ('crowdsourcing', 0.111), ('definitions', 0.111), ('annotation', 0.106), ('evoked', 0.09), ('fillmore', 0.09), ('lu', 0.084), ('simplified', 0.082), ('trusted', 0.082), ('burchardt', 0.077), ('intentional', 0.072), ('requester', 0.072), ('negri', 0.072), ('crowd', 0.068), ('annotators', 0.063), ('methodology', 0.063), ('damage', 0.059), ('units', 0.059), ('reacted', 0.058), ('salto', 0.058), ('per', 0.058), ('collin', 0.057), ('ruppenhofer', 0.057), ('meant', 0.057), ('snow', 0.056), ('experts', 0.053), ('heilman', 0.052), ('judgments', 0.051), ('von', 0.05), ('ahn', 0.05), ('sb', 0.048), ('verbosity', 0.048), ('detectives', 0.048), ('calibration', 0.048), ('expert', 0.047), ('cost', 0.047), ('gold', 0.046), ('udo', 0.045), ('invited', 0.045), ('none', 0.044), ('participants', 0.043), ('pad', 0.043), ('replicated', 0.043), ('contributors', 0.041), ('emerge', 0.041), ('answers', 0.04), ('reproduced', 0.038), ('questions', 0.038), ('guidelines', 0.036), ('person', 0.035), ('palmer', 0.034), ('action', 0.033), ('understood', 0.033), ('wrongly', 0.033), ('workflow', 0.033), ('page', 0.032), ('polysemy', 0.031), ('accuracy', 0.031), ('seem', 0.03), ('roles', 0.03), ('ber', 0.03), ('massimo', 0.03), ('simplification', 0.03), ('namely', 0.029), ('jon', 0.029), ('status', 0.029), ('element', 0.029), ('assessing', 0.028), ('redundancy', 0.028), ('judgment', 0.027), ('displays', 0.027), ('annotated', 0.027), ('unit', 0.027), ('risk', 0.027), ('addressed', 0.026), ('sentence', 0.026), ('evoking', 0.026), ('envision', 0.026), ('puzzled', 0.026), ('tchh', 0.026), ('petruck', 0.026), ('mantics', 0.026), ('complies', 0.026), ('cents', 0.026), ('bolded', 0.026), ('palmas', 0.026), ('kowalski', 0.026), ('spi', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000008 265 acl-2013-Outsourcing FrameNet to the Crowd
Author: Marco Fossati ; Claudio Giuliano ; Sara Tonelli
Abstract: We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time.
2 0.24104737 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
Author: Silvana Hartmann ; Iryna Gurevych
Abstract: We present a new bilingual FrameNet lexicon for English and German. It is created through a simple, but powerful approach to construct a FrameNet in any language using Wiktionary as an interlingual representation. Our approach is based on a sense alignment of FrameNet and Wiktionary, and subsequent translation disambiguation into the target language. We perform a detailed evaluation of the created resource and a discussion of Wiktionary as an interlingual connection for the cross-language transfer of lexical-semantic resources. The created resource is publicly available at http://www.ukp.tu-darmstadt.de/fnwkde/.
3 0.22135451 355 acl-2013-TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain
Author: Anoop Kunchukuttan ; Rajen Chatterjee ; Shourya Roy ; Abhijit Mishra ; Pushpak Bhattacharyya
Abstract: Large amount of parallel corpora is required for building Statistical Machine Translation (SMT) systems. We describe the TransDoop system for gathering translations to create parallel corpora from online crowd workforce who have familiarity with multiple languages but are not expert translators. Our system uses a Map-Reduce-like approach to translation crowdsourcing where sentence translation is decomposed into the following smaller tasks: (a) translation of constituent phrases of the sentence; (b) validation of quality of the phrase translations; and (c) composition of complete sentence translations from phrase translations. TransDoop incorporates quality control mechanisms and easy-to-use worker user interfaces designed to address issues with translation crowdsourcing. We have evaluated the crowd’s output using the METEOR metric. For a complex domain like judicial proceedings, the higher scores obtained by the map-reduce based approach compared to complete sentence translation establishes the efficacy of our work.
4 0.21454532 310 acl-2013-Semantic Frames to Predict Stock Price Movement
Author: Boyi Xie ; Rebecca J. Passonneau ; Leon Wu ; German G. Creamer
Abstract: Semantic frames are a rich linguistic resource. There has been much work on semantic frame parsers, but less that applies them to general NLP problems. We address a task to predict change in stock price from financial news. Semantic frames help to generalize from specific sentences to scenarios, and to detect the (positive or negative) roles of specific companies. We introduce a novel tree representation, and use it to train predictive models with tree kernels using support vector machines. Our experiments test multiple text representations on two binary classification tasks, change of price and polarity. Experiments show that features derived from semantic frame parsing have significantly better performance across years on the polarity task.
5 0.13566341 224 acl-2013-Learning to Extract International Relations from Political Context
Author: Brendan O'Connor ; Brandon M. Stewart ; Noah A. Smith
Abstract: We describe a new probabilistic model for extracting events between major political actors from news corpora. Our unsupervised model brings together familiar components in natural language processing (like parsers and topic models) with contextual political information— temporal and dyad dependence—to infer latent event classes. We quantitatively evaluate the model’s performance on political science benchmarks: recovering expert-assigned event class valences, and detecting real-world conflict. We also conduct a small case study based on our model’s inferences. A supplementary appendix, and replication software/data are available online, at: http://brenocon.com/irevents
6 0.11543331 119 acl-2013-Diathesis alternation approximation for verb clustering
7 0.10941148 195 acl-2013-Improving machine translation by training against an automatic semantic frame based evaluation metric
8 0.09783677 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
9 0.093438171 267 acl-2013-PARMA: A Predicate Argument Aligner
10 0.0785117 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models
11 0.068993069 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
12 0.068682708 175 acl-2013-Grounded Language Learning from Video Described with Sentences
13 0.05984921 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
14 0.059412606 314 acl-2013-Semantic Roles for String to Tree Machine Translation
15 0.059374247 385 acl-2013-WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations
16 0.056632586 99 acl-2013-Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation
17 0.05627048 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
18 0.055443291 145 acl-2013-Exploiting Qualitative Information from Automatic Word Alignment for Cross-lingual NLP Tasks
19 0.054533593 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
20 0.05255799 52 acl-2013-Annotating named entities in clinical text by combining pre-annotation and active learning
topicId topicWeight
[(0, 0.15), (1, 0.042), (2, 0.008), (3, -0.091), (4, -0.044), (5, -0.01), (6, -0.022), (7, 0.025), (8, 0.092), (9, 0.022), (10, -0.044), (11, 0.033), (12, -0.104), (13, 0.029), (14, -0.046), (15, -0.078), (16, -0.002), (17, 0.033), (18, 0.177), (19, -0.014), (20, -0.005), (21, -0.078), (22, -0.12), (23, -0.046), (24, -0.024), (25, -0.07), (26, -0.093), (27, -0.04), (28, 0.049), (29, 0.089), (30, -0.031), (31, 0.109), (32, 0.085), (33, -0.06), (34, -0.121), (35, -0.092), (36, -0.114), (37, -0.094), (38, 0.072), (39, 0.047), (40, 0.25), (41, -0.013), (42, 0.0), (43, 0.064), (44, -0.164), (45, 0.161), (46, -0.19), (47, -0.043), (48, 0.089), (49, 0.045)]
simIndex simValue paperId paperTitle
same-paper 1 0.94859296 265 acl-2013-Outsourcing FrameNet to the Crowd
Author: Marco Fossati ; Claudio Giuliano ; Sara Tonelli
Abstract: We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time.
2 0.64241135 310 acl-2013-Semantic Frames to Predict Stock Price Movement
Author: Boyi Xie ; Rebecca J. Passonneau ; Leon Wu ; German G. Creamer
Abstract: Semantic frames are a rich linguistic resource. There has been much work on semantic frame parsers, but less that applies them to general NLP problems. We address a task to predict change in stock price from financial news. Semantic frames help to generalize from specific sentences to scenarios, and to detect the (positive or negative) roles of specific companies. We introduce a novel tree representation, and use it to train predictive models with tree kernels using support vector machines. Our experiments test multiple text representations on two binary classification tasks, change of price and polarity. Experiments show that features derived from semantic frame parsing have significantly better performance across years on the polarity task.
3 0.57805353 162 acl-2013-FrameNet on the Way to Babel: Creating a Bilingual FrameNet Using Wiktionary as Interlingual Connection
Author: Silvana Hartmann ; Iryna Gurevych
Abstract: We present a new bilingual FrameNet lexicon for English and German. It is created through a simple, but powerful approach to construct a FrameNet in any language using Wiktionary as an interlingual representation. Our approach is based on a sense alignment of FrameNet and Wiktionary, and subsequent translation disambiguation into the target language. We perform a detailed evaluation of the created resource and a discussion of Wiktionary as an interlingual connection for the cross-language transfer of lexical-semantic resources. The created resource is publicly available at http://www.ukp.tu-darmstadt.de/fnwkde/.
4 0.55136967 119 acl-2013-Diathesis alternation approximation for verb clustering
Author: Lin Sun ; Diana McCarthy ; Anna Korhonen
Abstract: Although diathesis alternations have been used as features for manual verb classification, and there is recent work on incorporating such features in computational models of human language acquisition, work on large scale verb classification has yet to examine the potential for using diathesis alternations as input features to the clustering process. This paper proposes a method for approximating diathesis alternation behaviour in corpus data and shows, using a state-of-the-art verb clustering system, that features based on alternation approximation outperform those based on independent subcategorization frames. Our alternation-based approach is particularly adept at leveraging information from less frequent data.
5 0.53107613 100 acl-2013-Crowdsourcing Interaction Logs to Understand Text Reuse from the Web
Author: Martin Potthast ; Matthias Hagen ; Michael Volske ; Benno Stein
Abstract: unknown-abstract
6 0.4751215 224 acl-2013-Learning to Extract International Relations from Political Context
7 0.44185123 367 acl-2013-Universal Conceptual Cognitive Annotation (UCCA)
8 0.41449457 355 acl-2013-TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain
9 0.41161233 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
10 0.4067834 324 acl-2013-Smatch: an Evaluation Metric for Semantic Feature Structures
11 0.37511408 192 acl-2013-Improved Lexical Acquisition through DPP-based Verb Clustering
12 0.37125769 385 acl-2013-WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations
13 0.35895386 344 acl-2013-The Effects of Lexical Resource Quality on Preference Violation Detection
14 0.35551405 195 acl-2013-Improving machine translation by training against an automatic semantic frame based evaluation metric
15 0.33884582 213 acl-2013-Language Acquisition and Probabilistic Models: keeping it simple
16 0.33067316 234 acl-2013-Linking and Extending an Open Multilingual Wordnet
17 0.33021954 267 acl-2013-PARMA: A Predicate Argument Aligner
18 0.32363495 349 acl-2013-The mathematics of language learning
19 0.32077485 161 acl-2013-Fluid Construction Grammar for Historical and Evolutionary Linguistics
20 0.30683684 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
topicId topicWeight
[(0, 0.052), (6, 0.058), (11, 0.07), (15, 0.024), (24, 0.032), (26, 0.045), (35, 0.091), (38, 0.191), (42, 0.05), (48, 0.039), (64, 0.065), (70, 0.049), (88, 0.026), (90, 0.026), (95, 0.109)]
simIndex simValue paperId paperTitle
Author: Simone Paolo Ponzetto ; Andrea Zielinski
Abstract: unknown-abstract
same-paper 2 0.86273813 265 acl-2013-Outsourcing FrameNet to the Crowd
Author: Marco Fossati ; Claudio Giuliano ; Sara Tonelli
Abstract: We present the first attempt to perform full FrameNet annotation with crowdsourcing techniques. We compare two approaches: the first one is the standard annotation methodology of lexical units and frame elements in two steps, while the second is a novel approach aimed at acquiring frames in a bottom-up fashion, starting from frame element annotation. We show that our methodology, relying on a single annotation step and on simplified role definitions, outperforms the standard one both in terms of accuracy and time.
3 0.72361946 228 acl-2013-Leveraging Domain-Independent Information in Semantic Parsing
Author: Dan Goldwasser ; Dan Roth
Abstract: Semantic parsing is a domain-dependent process by nature, as its output is defined over a set of domain symbols. Motivated by the observation that interpretation can be decomposed into domain-dependent and independent components, we suggest a novel interpretation model, which augments a domain dependent model with abstract information that can be shared by multiple domains. Our experiments show that this type of information is useful and can reduce the annotation effort significantly when moving between domains.
4 0.71865356 15 acl-2013-A Novel Graph-based Compact Representation of Word Alignment
Author: Qun Liu ; Zhaopeng Tu ; Shouxun Lin
Abstract: In this paper, we propose a novel compact representation called weighted bipartite hypergraph to exploit the fertility model, which plays a critical role in word alignment. However, estimating the probabilities of rules extracted from hypergraphs is an NP-complete problem, which is computationally infeasible. Therefore, we propose a divide-and-conquer strategy by decomposing a hypergraph into a set of independent subhypergraphs. The experiments show that our approach outperforms both 1-best and n-best alignments.
Author: Guido Boella ; Luigi Di Caro
Abstract: In this paper we present a technique to reveal definitions and hypernym relations from text. Instead of using pattern matching methods that rely on lexico-syntactic patterns, we propose a technique which only uses syntactic dependencies between terms extracted with a syntactic parser. The assumption is that syntactic information are more robust than patterns when coping with length and complexity of the sentences. Afterwards, we transform such syntactic contexts in abstract representations, that are then fed into a Support Vector Machine classifier. The results on an annotated dataset of definitional sentences demonstrate the validity of our approach overtaking current state-of-the-art techniques.
6 0.70960891 355 acl-2013-TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain
7 0.69484532 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
8 0.69444746 6 acl-2013-A Java Framework for Multilingual Definition and Hypernym Extraction
9 0.69132841 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation
10 0.69076484 207 acl-2013-Joint Inference for Fine-grained Opinion Extraction
11 0.69014597 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
12 0.68971509 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
13 0.68819129 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
14 0.68734276 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
15 0.68666512 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
16 0.68436098 250 acl-2013-Models of Translation Competitions
17 0.68396205 174 acl-2013-Graph Propagation for Paraphrasing Out-of-Vocabulary Words in Statistical Machine Translation
18 0.68387264 8 acl-2013-A Learner Corpus-based Approach to Verb Suggestion for ESL
19 0.68303943 333 acl-2013-Summarization Through Submodularity and Dispersion
20 0.68241876 240 acl-2013-Microblogs as Parallel Corpora