acl acl2013 acl2013-206 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Qi Li ; Heng Ji ; Liang Huang
Abstract: Traditional approaches to the task of ACE event extraction usually rely on sequential pipelines with multiple stages, which suffer from error propagation since event triggers and arguments are predicted in isolation by independent local classifiers. By contrast, we propose a joint framework based on structured prediction which extracts triggers and arguments together so that the local predictions can be mutually improved. In addition, we propose to incorporate global features which explicitly capture the dependencies of multiple triggers and arguments. Experimental results show that our joint approach with local features outperforms the pipelined baseline, and adding global features further improves the performance significantly. Our approach advances state-of-the-art sentence-level event extraction, and even outperforms previous argument labeling methods which use external knowledge from other sentences and documents.
Reference: text
sentIndex sentText sentNum sentScore
1 By contrast, we propose a joint framework based on structured prediction which extracts triggers and arguments together so that the local predictions can be mutually improved. [sent-2, score-0.495]
2 In addition, we propose to incorporate global features which explicitly capture the dependencies of multiple triggers and arguments. [sent-3, score-0.382]
3 Experimental results show that our joint approach with local features outperforms the pipelined baseline, and adding global features further improves the performance significantly. [sent-4, score-0.393]
4 Our approach advances state-of-the-art sentence-level event extraction, and even outperforms previous argument labeling methods which use external knowledge from other sentences and documents. [sent-5, score-0.828]
5 1 Introduction Event extraction is an important and challenging task in Information Extraction (IE), which aims to discover event triggers with specific types and their arguments. [sent-6, score-0.711]
6 , 2011) use sequential pipelines as building blocks, which break down the whole task into separate subtasks, such as trigger identification/classification and argument identification/classification. [sent-8, score-0.68]
7 For example, consider the following sentences with an ambiguous word “fired”: (1) In Baghdad, a cameraman died when an American tank fired on the Palestine Hotel. [sent-14, score-0.352]
8 In sentence (1), “fired” is a trigger of type Attack. [sent-16, score-0.367]
9 Because of the ambiguity, a local classifier may miss it or mislabel it as a trigger of End-Position. [sent-17, score-0.452]
10 However, knowing that “tank” is very likely to be an Instrument argument of Attack events, the correct event subtype assignment of “fired” is obviously Attack. [sent-18, score-0.857]
11 Likewise, in sentence (2), “air defense chief” is a job title, hence the argument classifier is likely to label it as an Entity argument for End-Position trigger. [sent-19, score-0.651]
12 In addition, the local classifiers are incapable of capturing inter-dependencies among multiple event triggers and arguments. [sent-20, score-0.734]
13 Figure 1 depicts the corresponding event triggers and arguments. [sent-22, score-0.683]
14 By using global features, we can propagate the Victim argument of the Die event to the Target argument of the Attack event. [sent-24, score-1.174]
15 As another example, knowing that an Attack event usually only has one Attacker argument, we could penalize assignments in which one trigger has more than one Attacker. [sent-25, score-0.797]
16 We propose a novel joint event extraction algorithm to predict the triggers and arguments simultaneously, and use the structured perceptron (Collins, 2002) to train the joint model. [sent-28, score-1.049]
17 This way we can capture the dependencies between triggers and arguments as well as explore [sent-29, score-0.532]
18 There are two event mentions that share three arguments, namely the Die event mention triggered by “died”, and the Attack event mention triggered by “fired”. [sent-33, score-1.485]
19 Therefore we employ beam search in decoding, and train the model using the early-update perceptron variant tailored for beam search (Collins and Roark, 2004; Huang et al. [sent-36, score-0.633]
20 Unlike traditional pipeline approaches, we present a novel framework for sentence-level event extraction, which predicts triggers and their arguments jointly (Section 3). [sent-39, score-0.815]
21 We develop a rich set of features for event extraction which yield promising performance even with the traditional pipeline (Section 3. [sent-41, score-0.537]
22 We introduce various global features to exploit dependencies among multiple triggers and arguments (Section 3. [sent-46, score-0.476]
23 2 Event Extraction Task In this paper we focus on the event extraction task defined in Automatic Content Extraction (ACE) evaluation. [sent-50, score-0.492]
24 1 The task defines 8 event types and 33 subtypes such as Attack, End-Position etc. [sent-51, score-0.479]
25 We introduce the terminology of ACE event extraction used in this paper: 1 http://projects. [sent-52, score-0.492]
26 edu/ace/ • Event mention: an occurrence of an event with a particular type and subtype. [sent-55, score-0.568]
27 Job-Title) that serves as a participant or attribute with a specific role in an event mention. [sent-59, score-0.457]
28 Event mention: an instance that includes one event trigger and some arguments that appear within the same sentence. [sent-60, score-0.913]
29 Given an English text document, an event extraction system should predict event triggers with specific subtypes and their arguments from each sentence. [sent-61, score-1.284]
30 Figure 1 depicts the event triggers and their arguments of sentence (1) in Section 1. [sent-62, score-0.777]
31 The outcome of the entire sentence can be considered a graph in which each argument role is represented as a typed edge from a trigger to its argument. [sent-63, score-0.751]
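The graph view described above can be sketched as a small data structure. This is only an illustration, not the authors' implementation: the class name `EventGraph`, its methods, and the entity identifiers are hypothetical, and the example uses just the triggers and roles that the extracted sentences mention for sentence (1).

```python
from dataclasses import dataclass, field


@dataclass
class EventGraph:
    """Sentence-level event structure: triggers are nodes, roles are typed edges."""
    tokens: list
    triggers: dict = field(default_factory=dict)  # token index -> event subtype
    edges: dict = field(default_factory=dict)     # (token index, entity id) -> role

    def add_trigger(self, i, subtype):
        self.triggers[i] = subtype

    def add_argument(self, i, entity_id, role):
        # An edge always starts at a token that has been labeled as a trigger.
        if i not in self.triggers:
            raise ValueError("an argument edge must start at a labeled trigger")
        self.edges[(i, entity_id)] = role

    def arguments_of(self, i):
        return {e: r for (t, e), r in self.edges.items() if t == i}


tokens = ("In Baghdad , a cameraman died when an American tank "
          "fired on the Palestine Hotel .").split()
graph = EventGraph(tokens)
graph.add_trigger(tokens.index("died"), "Die")
graph.add_trigger(tokens.index("fired"), "Attack")
# The same entity can be attached to both triggers with different roles.
graph.add_argument(tokens.index("died"), "cameraman", "Victim")
graph.add_argument(tokens.index("fired"), "cameraman", "Target")
graph.add_argument(tokens.index("fired"), "tank", "Instrument")
```

Keying edges by (trigger index, entity id) makes it explicit that one entity may fill roles in several event mentions, which is why the structure is a graph rather than a tree.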
32 In this work, we assume that argument candidates such as entities are part of the input to the event extraction, and can be from either gold standard or IE system output. [sent-64, score-0.772]
33 3 Joint Framework for Event Extraction Based on the hypothesis that facts are interdependent, we propose to use structured perceptron with inexact search to jointly extract triggers and arguments that co-occur in the same sentence. [sent-65, score-0.617]
34 1 Structured perceptron with beam search Structured perceptron is an extension to the standard linear perceptron for structured prediction, which was proposed in (Collins, 2002). [sent-68, score-0.71]
35 Unfortunately, it is intractable to perform the exact search in our framework because: (1) by jointly modeling the trigger labeling and argument labeling, the search space becomes much more complex. [sent-77, score-0.853]
36 Figure 2 describes the skeleton of perceptron training algorithm with beam search. [sent-82, score-0.356]
37 In each step of the beam search, if the prefix of oracle assignment y falls out from the beam, then the top result in the beam is returned for early update. [sent-83, score-0.465]
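The early-update rule can be illustrated with a toy sequence tagger. This is a minimal sketch under simplifying assumptions, not the algorithm of Figure 2: it labels tokens only (no argument edges), it omits weight averaging, and the feature templates, function names, and the two training sentences are hypothetical.

```python
from collections import defaultdict


def features(x, y_prefix):
    """Indicator feature counts for a (possibly partial) labeling of x."""
    feats = defaultdict(float)
    prev_tok, prev_lab = "<s>", "<s>"
    for tok, lab in zip(x, y_prefix):
        feats[("emit", tok, lab)] += 1.0
        feats[("prev", prev_tok, lab)] += 1.0
        feats[("trans", prev_lab, lab)] += 1.0
        prev_tok, prev_lab = tok, lab
    return feats


def score(w, x, y_prefix):
    return sum(w.get(f, 0.0) * v for f, v in features(x, y_prefix).items())


def beam_search(w, x, labels, k, gold=None):
    """Return the top full labeling. If gold is given, stop at the first step
    where the gold prefix falls out of the beam and return the top prefix."""
    beam = [()]
    for i in range(len(x)):
        candidates = [prefix + (lab,) for prefix in beam for lab in labels]
        candidates.sort(key=lambda p: score(w, x, p), reverse=True)
        beam = candidates[:k]
        if gold is not None and tuple(gold[: i + 1]) not in beam:
            return beam[0]  # early update on this prefix
    return beam[0]


def train(data, labels, k=2, epochs=5):
    w = defaultdict(float)
    for _ in range(epochs):
        for x, gold in data:
            z = beam_search(w, x, labels, k, gold=gold)
            gold_prefix = tuple(gold[: len(z)])
            if z != gold_prefix:
                # Reward the gold prefix, penalize the predicted prefix.
                for f, v in features(x, gold_prefix).items():
                    w[f] += v
                for f, v in features(x, z).items():
                    w[f] -= v
    return w


LABELS = ["O", "Attack", "End-Position"]
DATA = [
    (["tank", "fired", "on", "hotel"], ["O", "Attack", "O", "O"]),
    (["chief", "was", "fired"], ["O", "O", "End-Position"]),
]
weights = train(DATA, LABELS, k=2, epochs=10)
```

Because the update is made on prefixes of equal length, the model is only ever penalized for a prefix that actually outscored or displaced the gold prefix, which is the property that makes the update valid under inexact search.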
38 5 we will show that the standard perceptron introduces many invalid updates especially with smaller beam sizes, also observed by Huang et al. [sent-89, score-0.522]
39 Label sets Here we introduce the label sets for trigger and argument in the model. [sent-103, score-0.705]
40 We use L ∪ {⊥} to denote the trigger label alphabet, where L represents the 33 event subtypes, and ⊥ indicates that the token is not a trigger. [sent-104, score-0.367]
41 Similarly, R ∪ {⊥} denotes the argument label sets, where R is the set of possible argument roles, and ⊥ means that the argument candidate is not an argument for the current trigger. [sent-105, score-1.252]
42 For example, the Attacker argument for an Attack event can only be one of PER, ORG and GPE (Geo-political Entity). [sent-108, score-0.743]
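Entity-type constraints of this kind can be applied as a filter on candidate roles before scoring. The sketch below is an assumption about how such a filter might look: only the Attack/Attacker row comes from the text above, the Instrument row is an illustrative guess, and treating unlisted (subtype, role) pairs as disallowed is a design choice of this example.

```python
NONE = "NONE"  # stands for the "not an argument" label

# Hypothetical constraint table: (event subtype, role) -> allowed entity types.
ALLOWED_TYPES = {
    ("Attack", "Attacker"): {"PER", "ORG", "GPE"},  # stated in the text
    ("Attack", "Instrument"): {"WEA", "VEH"},       # illustrative assumption
}


def candidate_roles(subtype, entity_type):
    """Roles that may label the edge from a trigger of `subtype` to an entity
    of `entity_type`. The NONE label is always available."""
    roles = [NONE]
    for (sub, role), allowed in ALLOWED_TYPES.items():
        if sub == subtype and entity_type in allowed:
            roles.append(role)
    return roles
```

Pruning roles this way shrinks the search space in the argument labeling sub-step without changing the model itself.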
43 , as,m) to denote the corresponding gold standard structure, where ti represents the trigger assignment for the token xi, and ai,k represents the argument role label for the edge between xi and argument candidate ek. [sent-123, score-1.161]
44 During each step with token i, there are two sub-steps: • Trigger labeling: We enumerate all possible trigger labels for the current token. [sent-134, score-0.456]
45 • Argument labeling: After the trigger labeling step, we traverse all configurations in the beam. [sent-138, score-0.493]
46 Once a trigger label for xi is found in the beam, the decoder searches through the argument candidates E to label the edges between each argument candidate and the trigger. [sent-139, score-1.105]
47 After labeling each argument candidate, we again score each partial assignment and select the K-best results to the beam. [sent-140, score-0.411]
48 After the second step, the rank of different trigger assignments can be changed because of the argument edges. [sent-141, score-0.68]
49 Likewise, the decision on later argument candidates may be affected by earlier argument assignments. [sent-142, score-0.655]
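The two sub-steps can be sketched as a small beam decoder. This is a simplified illustration under stated assumptions: the scoring function, the hand-set weights in `TRIGGER_W` and `EDGE_W`, and all names are hypothetical, and a real system would score configurations with learned feature weights.

```python
NONE = "NONE"  # stands for "not a trigger" / "not an argument"

# Hypothetical hand-set weights; unseen label assignments get a penalty.
TRIGGER_W = {("fired", "End-Position"): 10, ("fired", "Attack"): 8}
EDGE_W = {("Attack", "tank", "Instrument"): 5}


def toy_score(tokens, config):
    trigs, edges = config
    total = 0
    for i, t in enumerate(trigs):
        if t != NONE:
            total += TRIGGER_W.get((tokens[i], t), -10)
    for i, entity, role in edges:
        total += EDGE_W.get((trigs[i], entity, role), -10)
    return total


def decode(tokens, entities, trigger_labels, roles, score, k):
    """Token-by-token beam search with trigger and argument sub-steps."""
    def k_best(configs):
        return sorted(configs, key=lambda c: score(tokens, c), reverse=True)[:k]

    beam = [((), ())]  # (trigger labels so far, argument edges so far)
    for i in range(len(tokens)):
        # Sub-step 1: extend every configuration with each trigger label.
        beam = k_best([(trigs + (t,), edges)
                       for trigs, edges in beam
                       for t in [NONE] + trigger_labels])
        # Sub-step 2: if token i is a trigger, label the edge to each entity.
        for entity in entities:
            expanded = []
            for trigs, edges in beam:
                if trigs[i] == NONE:
                    expanded.append((trigs, edges))
                    continue
                for r in [NONE] + roles:
                    new_edges = edges if r == NONE else edges + ((i, entity, r),)
                    expanded.append((trigs, new_edges))
            beam = k_best(expanded)
    return beam[0]


tokens = ["tank", "fired"]
wide = decode(tokens, ["tank"], ["Attack", "End-Position"], ["Instrument"], toy_score, k=2)
narrow = decode(tokens, ["tank"], ["Attack", "End-Position"], ["Instrument"], toy_score, k=1)
```

With these toy weights the trigger sub-step alone prefers End-Position for "fired", but the Instrument edge lifts the Attack configuration to the top when the beam keeps two entries. With a beam of one, the Attack hypothesis is pruned before the edge can be scored, which illustrates why small beams cause search errors.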
50 Local features are only related to predictions on an individual trigger or argument. [sent-164, score-0.367]
51 In the case of unigram tagging for trigger labeling, each local feature takes the form of $f(x, i, y_{g(i)})$, where $i$ denotes the index of the current token, and $y_{g(i)}$ is its trigger label. [sent-165, score-0.877]
52 Formally, each global feature function takes the form of f(x, i, k, y), where i and k denote the indices of the current token and argument candidate in decoding, respectively. [sent-167, score-0.515]
53 In practice, we define two versions of $q(y_{g(i)})$: $q_0(y_{g(i)}) = y_{g(i)}$ (event subtype) and $q_1(y_{g(i)}) =$ event type of $y_{g(i)}$. $q_1(y_{g(i)})$ is a backoff version of the standard unigram feature. [sent-171, score-0.43]
54 Some text features for the same event type may share a certain distributional similarity regardless of the subtypes. [sent-172, score-0.475]
55 Argument features Similarly, the local feature function for argument labeling can be represented as $f(x, i, k, y_{g(i)}, y_{h(i,k)}) = p(x, i, k) \circ q(y_{g(i)}, y_{h(i,k)})$, where $y_{h(i,k)}$ denotes the argument assignment for the edge between trigger word $i$ and argument candidate $e_k$. [sent-174, score-1.592]
56 We define two versions of $q(y_{g(i)}, y_{h(i,k)})$: $q_0(y_{g(i)}, y_{h(i,k)}) = y_{h(i,k)}$ if $y_{h(i,k)}$ is Place, Time or None, and $y_{g(i)} \circ y_{h(i,k)}$ otherwise; $q_1(y_{g(i)}, y_{h(i,k)}) = 1$ if $y_{h(i,k)} \neq$ None, and $0$ otherwise. It is notable that Place and Time arguments are applicable and behave similarly to all event subtypes. [sent-175, score-0.524]
57 Therefore features for these arguments are not conjoined with trigger labels. [sent-176, score-0.506]
58 q1 (yh(i,k) ) can be considered as a backoff version of q0(yh(i,k)) , which does not discriminate different argument roles but only focuses on argument identification. [sent-177, score-0.626]
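The conjunction of text features with label functions can be sketched as follows. This follows the prose above (Place and Time are not conjoined with the trigger label, and the backoff only records whether an argument is present); the function names, the tuple encoding of features, and the example text feature are assumptions of this sketch.

```python
SHARED_ROLES = ("Place", "Time", "None")


def q0(trigger_label, role):
    """Full label part: role alone for shared roles, trigger+role otherwise."""
    if role in SHARED_ROLES:
        return (role,)
    return (trigger_label, role)


def q1(trigger_label, role):
    """Backoff label part: 1 if the candidate is an argument at all, else 0."""
    return 1 if role != "None" else 0


def local_argument_features(text_feats, trigger_label, role):
    """Conjoin each text feature p(x, i, k) with both label functions."""
    feats = []
    for p in text_feats:
        feats.append((p, "q0") + q0(trigger_label, role))
        feats.append((p, "q1", q1(trigger_label, role)))
    return feats


example = local_argument_features(["dep=nsubj"], "Attack", "Attacker")
```

The backoff feature lets evidence for "this is some argument" be shared across all roles, while the full feature keeps role-specific evidence separate.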
59 Table 1 summarizes the text features about the input for trigger and argument labeling. [sent-178, score-0.725]
60 02 Table 3: Top 5 event subtypes that co-occur with Attack event in the same sentence. [sent-194, score-0.909]
61 Trigger global feature This type of feature captures the dependencies between two triggers within the same sentence. [sent-195, score-0.409]
62 For instance: feature (1) captures the co-occurrence of trigger types. [sent-196, score-0.403]
63 This kind of feature is motivated by the fact that two event mentions in the same sentence tend to be semantically coherent. [sent-197, score-0.491]
64 As an example, from Table 3 we can see that Attack event often co-occur with Die event in the same sentence, but rarely co-occur with Start-Position event. [sent-198, score-0.86]
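A co-occurrence feature over trigger subtypes might be computed as below. This is an illustrative sketch only: the feature name `cooccur`, the use of sorted pairs, and the `NONE` placeholder for non-trigger tokens are assumptions, not details from the paper.

```python
from collections import Counter
from itertools import combinations

NONE = "NONE"  # stands for "not a trigger"


def trigger_pair_features(trigger_labels):
    """Count unordered pairs of event subtypes that co-occur in one sentence."""
    subtypes = sorted(t for t in trigger_labels if t != NONE)
    return Counter(("cooccur", a, b) for a, b in combinations(subtypes, 2))


pairs = trigger_pair_features([NONE, "Die", NONE, "Attack"])
```

A learned positive weight on the Attack/Die pair would then favor labelings in which the two subtypes appear together, while a negative weight on a rarely seen pair would discourage it.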
65 A simple example is whether an Attack trigger and a Die trigger are linked by the dependency relation conj and. [sent-201, score-0.801]
66 The Transport event mention “transport” has two Artifact arguments, “women ” and “children ”. [sent-204, score-0.515]
67 The dependency edge conj and between “women ” and “children ” indicates that they should play the same role in the event mention. [sent-205, score-0.546]
68 In this example, an entity mention is Victim argument to Die event and Target argument to Attack event, and the two event triggers are connected by the typed dependency advcl. [sent-209, score-1.865]
69 The feature in the triangle shape can be considered as a soft constraint such that if a Job-Title mention is a Position argument to an End-Position trigger, then the Organization mention which appears at the end of it should be labeled as Entity argument for the same trigger. [sent-214, score-0.857]
70 For instance, in many cases, a trigger can only have one Place argument. [sent-216, score-0.367]
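Such a soft constraint can be expressed as a feature that records how many arguments of each role a trigger has. The sketch below is hypothetical in its names and edge encoding; the model would learn a weight for each count, so that, for example, two Place arguments on one trigger can be penalized without being forbidden.

```python
from collections import Counter


def role_count_features(edges):
    """edges: iterable of (trigger_index, event_subtype, entity_id, role).
    Emits one feature per (subtype, role, count) observed on a trigger."""
    per_trigger = Counter((i, sub, role) for i, sub, _, role in edges)
    feats = Counter()
    for (_, sub, role), n in per_trigger.items():
        feats[("role_count", sub, role, n)] += 1
    return feats


edges = [
    (10, "Attack", "e1", "Attacker"),
    (10, "Attack", "e2", "Attacker"),
    (10, "Attack", "e3", "Place"),
]
counts = role_count_features(edges)
```

Because the feature depends on all edges of a trigger at once, it cannot be scored by an independent local classifier, which is what makes it a global feature.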
71 , 2011), we use the following criteria to determine the correctness of a predicted event mention: • A trigger is correct if its event subtype and offsets match those of a reference trigger. [sent-223, score-1.306]
72 • An argument is correctly identified if its event subtype and offsets match those of any of the reference argument mentions. [sent-224, score-1.135]
73 • An argument is correctly identified and classified if its event subtype, offsets and argument role match those of any of the reference argument mentions. [sent-225, score-1.396]
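These matching criteria translate directly into set comparisons. The sketch below assumes each argument is encoded as a tuple (event subtype, start offset, end offset, role); that encoding, the function names, and the example predictions are hypothetical.

```python
def prf(pred, gold):
    """Precision, recall and F1 over exact matches between two collections."""
    pred, gold = set(pred), set(gold)
    correct = len(pred & gold)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def argument_scores(pred_args, gold_args):
    """Identification ignores the role; classification also requires it."""
    strip = lambda args: [(sub, start, end) for sub, start, end, _ in args]
    return {
        "identification": prf(strip(pred_args), strip(gold_args)),
        "classification": prf(pred_args, gold_args),
    }


gold_args = [("Attack", 4, 5, "Target"), ("Attack", 9, 10, "Instrument")]
pred_args = [("Attack", 4, 5, "Victim"), ("Attack", 9, 10, "Instrument")]
scores = argument_scores(pred_args, gold_args)
```

In this example both predicted spans are identified correctly, but one carries the wrong role, so identification is perfect while classification precision and recall both drop to one half. Triggers can be scored with the same `prf` function on (subtype, start, end) tuples.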
74 To compare our proposed method with the previous pipelined approaches, we implemented two Maximum Entropy (MaxEnt) classifiers for trigger labeling and argument labeling respectively. [sent-229, score-0.853]
75 Figure 6 shows the training curves of the averaged perceptron with respect to the performance on the development set when the beam size is 4. [sent-234, score-0.417]
76 4 Impact of beam size The beam size is an important hyperparameter in both training and testing. [sent-238, score-0.482]
77 Larger beam size will increase the computational cost while smaller beam size may reduce the performance. [sent-239, score-0.482]
78 When beam size = 4, the algorithm achieved the highest performance on the development set with trigger F1 = 67. [sent-241, score-0.608]
79 The model with only local features made much smaller numbers of invalid updates, which suggests that the use of global features makes the search problem much harder. [sent-269, score-0.422]
80 When the beam size is increased (b = 4), the gap becomes smaller as the ratio of invalid updates is reduced. [sent-273, score-0.407]
81 The proposed joint framework with local features achieves comparable performance for triggers and outperforms the staged baseline especially on arguments. [sent-278, score-0.503]
82 3% for arguments, our approach with global features achieves even better performance on argument labeling although we only used sentence- level information. [sent-286, score-0.539]
83 Our approach aims to tackle the problem of sentence-level event extraction, thereby only used intra-sentential evidence. [sent-295, score-0.43]
84 5 Related Work Most recent studies about ACE event extraction rely on a staged pipeline that consists of separate local classifiers for trigger labeling and argument labeling (Grishman et al. [sent-297, score-1.459]
85 To the best of our knowledge, our work is the first attempt to jointly model these two ACE event subtasks. [sent-301, score-0.453]
86 For the Message Understanding Conference (MUC) and FAS Program for Monitoring Emerging Diseases (ProMED) event extraction tasks, Patwardhan and Riloff (2009) proposed a probabilistic framework to extract event role fillers conditioned on the sentential event occurrence. [sent-335, score-1.435]
87 Besides having different task definitions, the key difference from our approach is that their role filler recognizer and sentential event recognizer are trained independently but combined in the test stage. [sent-336, score-0.532]
88 They cast the problem of biomedical event extraction as a dependency parsing problem. [sent-343, score-0.565]
89 The key assumption that event structure can be considered as trees is incompatible with ACE event extraction. [sent-344, score-0.86]
90 In addition, they used a separate classifier to predict the event triggers before applying the parser, while we extract the triggers and argument jointly. [sent-345, score-1.181]
91 In comparison, our approach is a unified framework based on beam search, which allows us to exploit arbitrary global features efficiently. [sent-348, score-0.403]
92 6 Conclusions and Future Work We presented a joint framework for ACE event extraction based on structured perceptron with inexact search. [sent-349, score-0.798]
93 The experiments proved that the perceptron with local features outperforms the staged baseline and the global features further improve the performance significantly, surpassing the current state-of-the-art by a large margin. [sent-351, score-0.554]
94 To improve the accuracy of end-to-end IE systems, we plan to develop a complete joint framework to recognize entities together with event mentions for future work. [sent-353, score-0.511]
95 Language specific issue and feature exploration in Chinese event extraction. [sent-383, score-0.466]
96 Joint modeling for Chinese event extraction with rich linguistic features. [sent-387, score-0.492]
97 Employing compositional semantics and discourse consistency in Chinese event extraction. [sent-420, score-0.43]
98 Using document level cross-event inference to improve event extraction. [sent-428, score-0.43]
99 Fast and robust joint models for biomedical event extraction. [sent-449, score-0.51]
100 Robust biomedical event extraction with dual decomposition and minimal domain adaptation. [sent-453, score-0.541]
wordName wordTfidf (topN-words)
[('event', 0.43), ('trigger', 0.367), ('argument', 0.313), ('yg', 0.26), ('triggers', 0.219), ('beam', 0.215), ('grishman', 0.167), ('fired', 0.15), ('perceptron', 0.141), ('ace', 0.138), ('buf', 0.132), ('attack', 0.125), ('global', 0.118), ('liao', 0.107), ('yh', 0.101), ('invalid', 0.098), ('arguments', 0.094), ('died', 0.086), ('local', 0.085), ('mention', 0.085), ('subtype', 0.079), ('ji', 0.078), ('die', 0.078), ('staged', 0.076), ('transport', 0.076), ('vivendi', 0.075), ('inexact', 0.068), ('updates', 0.068), ('tank', 0.066), ('labeling', 0.063), ('extraction', 0.062), ('heng', 0.058), ('riedel', 0.055), ('hong', 0.055), ('configuration', 0.05), ('cameraman', 0.05), ('biomedical', 0.049), ('subtypes', 0.049), ('decoding', 0.049), ('pipelined', 0.047), ('women', 0.047), ('features', 0.045), ('huang', 0.043), ('conj', 0.043), ('ralph', 0.041), ('structured', 0.041), ('award', 0.039), ('baghdad', 0.037), ('ibf', 0.037), ('standardupdate', 0.037), ('trig', 0.037), ('collins', 0.036), ('ie', 0.036), ('feature', 0.036), ('assignment', 0.035), ('curves', 0.035), ('depicts', 0.034), ('xi', 0.033), ('attacker', 0.033), ('qiaoming', 0.033), ('bbu', 0.033), ('timex', 0.033), ('sentential', 0.031), ('mcclosky', 0.031), ('joint', 0.031), ('search', 0.031), ('palestine', 0.031), ('harmonic', 0.03), ('entity', 0.029), ('candidates', 0.029), ('universal', 0.028), ('entertainment', 0.027), ('queens', 0.027), ('cuny', 0.027), ('role', 0.027), ('chen', 0.027), ('artifact', 0.026), ('violation', 0.026), ('size', 0.026), ('token', 0.026), ('label', 0.025), ('attained', 0.025), ('triangle', 0.025), ('framework', 0.025), ('mentions', 0.025), ('jobs', 0.024), ('sentencelevel', 0.024), ('victim', 0.024), ('dependency', 0.024), ('apple', 0.023), ('guodong', 0.023), ('jointly', 0.023), ('instance', 0.022), ('encourages', 0.022), ('recognizer', 0.022), ('edge', 0.022), ('org', 0.022), ('current', 0.022), ('typed', 0.022), ('outperforms', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000004 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features
2 0.63796324 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
Author: Peifeng Li ; Qiaoming Zhu ; Guodong Zhou
Abstract: As a paratactic language, sentence-level argument extraction in Chinese suffers much from the frequent occurrence of ellipsis with regard to inter-sentence arguments. To resolve this problem, this paper proposes a novel global argument inference model to explore specific relationships, such as Coreference, Sequence and Parallel, among relevant event mentions to recover those inter-sentence arguments in the sentence, discourse and document layers which represent the cohesion of an event or a topic. Evaluation on the ACE 2005 Chinese corpus justifies the effectiveness of our global argument inference model over a state-of-the-art baseline.
3 0.21602362 386 acl-2013-What causes a causal relation? Detecting Causal Triggers in Biomedical Scientific Discourse
Author: Claudiu Mihaila ; Sophia Ananiadou
Abstract: Current domain-specific information extraction systems represent an important resource for biomedical researchers, who need to process vaster amounts of knowledge in short times. Automatic discourse causality recognition can further improve their workload by suggesting possible causal connections and aiding in the curation of pathway models. We here describe an approach to the automatic identification of discourse causality triggers in the biomedical domain using machine learning. We create several baselines and experiment with various parameter settings for three algorithms, i.e., Conditional Random Fields (CRF), Support Vector Machines (SVM) and Random Forests (RF). Also, we evaluate the impact of lexical, syntactic and semantic features on each of the algorithms and look at errors. The best performance of 79.35% F-score is achieved by CRFs when using all three feature types.
4 0.2072743 296 acl-2013-Recognizing Identical Events with Graph Kernels
Author: Goran Glavas ; Jan Snajder
Abstract: Identifying news stories that discuss the same real-world events is important for news tracking and retrieval. Most existing approaches rely on the traditional vector space model. We propose an approach for recognizing identical real-world events based on a structured, event-oriented document representation. We structure documents as graphs of event mentions and use graph kernels to measure the similarity between document pairs. Our experiments indicate that the proposed graph-based approach can outperform the traditional vector space model, and is especially suitable for distinguishing between topically similar, yet non-identical events.
5 0.19842003 153 acl-2013-Extracting Events with Informal Temporal References in Personal Histories in Online Communities
Author: Miaomiao Wen ; Zeyu Zheng ; Hyeju Jang ; Guang Xiang ; Carolyn Penstein Rose
Abstract: We present a system for extracting the dates of illness events (year and month of the event occurrence) from posting histories in the context of an online medical support community. A temporal tagger retrieves and normalizes dates mentioned informally in social media to actual month and year referents. Building on this, an event date extraction system learns to integrate the likelihood of candidate dates extracted from time-rich sentences with temporal constraints extracted from event-related sentences. Our integrated model achieves 89.7% of the maximum performance given the performance of the temporal expression retrieval step.
6 0.19253148 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
7 0.13815221 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation
8 0.13317581 224 acl-2013-Learning to Extract International Relations from Political Context
9 0.13160309 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
10 0.12879317 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data
11 0.12683964 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
12 0.12664269 358 acl-2013-Transition-based Dependency Parsing with Selectional Branching
13 0.1239128 267 acl-2013-PARMA: A Predicate Argument Aligner
14 0.11526078 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling
15 0.10278322 98 acl-2013-Cross-lingual Transfer of Semantic Role Labeling Models
16 0.10023125 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
17 0.097303778 27 acl-2013-A Two Level Model for Context Sensitive Inference Rules
18 0.092705861 134 acl-2013-Embedding Semantic Similarity in Tree Kernels for Domain Adaptation of Relation Extraction
19 0.085294351 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
20 0.084714077 247 acl-2013-Modeling of term-distance and term-occurrence information for improving n-gram language model performance
topicId topicWeight
[(0, 0.205), (1, 0.009), (2, -0.134), (3, -0.132), (4, 0.002), (5, 0.366), (6, 0.005), (7, 0.247), (8, -0.064), (9, 0.098), (10, 0.057), (11, -0.067), (12, 0.029), (13, 0.048), (14, 0.179), (15, -0.061), (16, -0.122), (17, -0.146), (18, 0.182), (19, 0.053), (20, 0.074), (21, -0.167), (22, -0.007), (23, 0.013), (24, 0.024), (25, -0.126), (26, -0.059), (27, -0.067), (28, -0.022), (29, -0.056), (30, -0.058), (31, -0.051), (32, -0.17), (33, 0.125), (34, -0.085), (35, 0.031), (36, 0.072), (37, 0.205), (38, 0.003), (39, -0.053), (40, 0.076), (41, 0.026), (42, -0.014), (43, 0.009), (44, 0.014), (45, -0.055), (46, 0.066), (47, 0.005), (48, -0.071), (49, 0.024)]
simIndex simValue paperId paperTitle
same-paper 1 0.97016954 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features
2 0.91320556 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
Author: Peifeng Li ; Qiaoming Zhu ; Guodong Zhou
Abstract: As a paratactic language, sentence-level argument extraction in Chinese suffers much from the frequent occurrence of ellipsis with regard to inter-sentence arguments. To resolve this problem, this paper proposes a novel global argument inference model to explore specific relationships, such as Coreference, Sequence and Parallel, among relevant event mentions to recover those inter-sentence arguments in the sentence, discourse and document layers which represent the cohesion of an event or a topic. Evaluation on the ACE 2005 Chinese corpus justifies the effectiveness of our global argument inference model over a state-of-the-art baseline.
3 0.62040633 296 acl-2013-Recognizing Identical Events with Graph Kernels
Author: Goran Glavas ; Jan Snajder
Abstract: Identifying news stories that discuss the same real-world events is important for news tracking and retrieval. Most existing approaches rely on the traditional vector space model. We propose an approach for recognizing identical real-world events based on a structured, event-oriented document representation. We structure documents as graphs of event mentions and use graph kernels to measure the similarity between document pairs. Our experiments indicate that the proposed graph-based approach can outperform the traditional vector space model, and is especially suitable for distinguishing between topically similar, yet non-identical events.
4 0.58636618 153 acl-2013-Extracting Events with Informal Temporal References in Personal Histories in Online Communities
Author: Miaomiao Wen ; Zeyu Zheng ; Hyeju Jang ; Guang Xiang ; Carolyn Penstein Rose
Abstract: We present a system for extracting the dates of illness events (year and month of the event occurrence) from posting histories in the context of an online medical support community. A temporal tagger retrieves and normalizes dates mentioned informally in social media to actual month and year referents. Building on this, an event date extraction system learns to integrate the likelihood of candidate dates extracted from time-rich sentences with temporal constraints extracted from event-related sentences. Our integrated model achieves 89.7% of the maximum performance given the performance of the temporal expression retrieval step.
5 0.5609737 386 acl-2013-What causes a causal relation? Detecting Causal Triggers in Biomedical Scientific Discourse
Author: Claudiu Mihaila ; Sophia Ananiadou
Abstract: Current domain-specific information extraction systems represent an important resource for biomedical researchers, who need to process vaster amounts of knowledge in short times. Automatic discourse causality recognition can further improve their workload by suggesting possible causal connections and aiding in the curation of pathway models. We here describe an approach to the automatic identification of discourse causality triggers in the biomedical domain using machine learning. We create several baselines and experiment with various parameter settings for three algorithms, i.e., Conditional Random Fields (CRF), Support Vector Machines (SVM) and Random Forests (RF). Also, we evaluate the impact of lexical, syntactic and semantic features on each of the algorithms and look at errors. The best performance of 79.35% F-score is achieved by CRFs when using all three feature types.
6 0.5126062 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation
7 0.5058009 224 acl-2013-Learning to Extract International Relations from Political Context
8 0.48283544 22 acl-2013-A Structured Distributional Semantic Model for Event Co-reference
9 0.44946206 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
10 0.44566953 267 acl-2013-PARMA: A Predicate Argument Aligner
11 0.41104284 247 acl-2013-Modeling of term-distance and term-occurrence information for improving n-gram language model performance
12 0.36746907 189 acl-2013-ImpAr: A Deterministic Algorithm for Implicit Semantic Role Labelling
13 0.36197644 178 acl-2013-HEADY: News headline abstraction through event pattern clustering
14 0.32723951 132 acl-2013-Easy-First POS Tagging and Dependency Parsing with Beam Search
15 0.31841332 376 acl-2013-Using Lexical Expansion to Learn Inference Rules from Sparse Data
16 0.30455753 57 acl-2013-Arguments and Modifiers from the Learner's Perspective
17 0.30208051 387 acl-2013-Why-Question Answering using Intra- and Inter-Sentential Causal Relations
18 0.29990512 17 acl-2013-A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation
19 0.29418215 175 acl-2013-Grounded Language Learning from Video Described with Sentences
20 0.27059078 306 acl-2013-SPred: Large-scale Harvesting of Semantic Predicates
topicId topicWeight
[(0, 0.035), (6, 0.034), (11, 0.048), (15, 0.017), (24, 0.025), (26, 0.049), (35, 0.051), (42, 0.468), (48, 0.037), (70, 0.039), (88, 0.021), (90, 0.025), (95, 0.059)]
simIndex simValue paperId paperTitle
Author: Sina Zarriess ; Jonas Kuhn
Abstract: We suggest a generation task that integrates discourse-level referring expression generation and sentence-level surface realization. We present a data set of German articles annotated with deep syntax and referents, including some types of implicit referents. Our experiments compare several architectures varying the order of a set of trainable modules. The results suggest that a revision-based pipeline, with intermediate linearization, significantly outperforms standard pipelines or a parallel architecture.
2 0.98490262 125 acl-2013-Distortion Model Considering Rich Context for Statistical Machine Translation
Author: Isao Goto ; Masao Utiyama ; Eiichiro Sumita ; Akihiro Tamura ; Sadao Kurohashi
Abstract: This paper proposes new distortion models for phrase-based SMT. In decoding, a distortion model estimates the source word position to be translated next (NP) given the last translated source word position (CP). We propose a distortion model that can consider the word at the CP, a word at an NP candidate, and the context of the CP and the NP candidate simultaneously. Moreover, we propose a further improved model that considers richer context by discriminating label sequences that specify spans from the CP to NP candidates. It enables our model to learn the effect of relative word order among NP candidates as well as to learn the effect of distances from the training data. In our experiments, our model improved 2.9 BLEU points for Japanese-English and 2.6 BLEU points for Chinese-English translation compared to the lexical reordering models.
3 0.98203683 372 acl-2013-Using CCG categories to improve Hindi dependency parsing
Author: Bharat Ram Ambati ; Tejaswini Deoskar ; Mark Steedman
Abstract: We show that informative lexical categories from a strongly lexicalised formalism such as Combinatory Categorial Grammar (CCG) can improve dependency parsing of Hindi, a free word order language. We first describe a novel way to obtain a CCG lexicon and treebank from an existing dependency treebank, using a CCG parser. We use the output of a supertagger trained on the CCGbank as a feature for a state-of-the-art Hindi dependency parser (Malt). Our results show that using CCG categories improves the accuracy of Malt on long distance dependencies, for which it is known to have weak rates of recovery.
4 0.96341771 64 acl-2013-Automatically Predicting Sentence Translation Difficulty
Author: Abhijit Mishra ; Pushpak Bhattacharyya ; Michael Carl
Abstract: In this paper we introduce Translation Difficulty Index (TDI), a measure of difficulty in text translation. We first define and quantify translation difficulty in terms of TDI. We realize that any measure of TDI based on direct input by translators is fraught with subjectivity and adhocism. We, rather, rely on cognitive evidences from eye tracking. TDI is measured as the sum of fixation (gaze) and saccade (rapid eye movement) times of the eye. We then establish that TDI is correlated with three properties of the input sentence, viz. length (L), degree of polysemy (DP) and structural complexity (SC). We train a Support Vector Regression (SVR) system to predict TDIs for new sentences using these features as input. The prediction done by our framework is well correlated with the empirical gold standard data, which is a repository of < L, DP, SC > and TDI pairs for a set of sentences. The primary use of our work is a way of “binning” sentences (to be translated) in “easy”, “medium” and “hard” categories as per their predicted TDI. This can decide pricing of any translation task, especially useful in a scenario where parallel corpora for Machine Translation are built through translation crowdsourcing/outsourcing. This can also provide a way of monitoring progress of second language learners.
5 0.95566517 11 acl-2013-A Multi-Domain Translation Model Framework for Statistical Machine Translation
Author: Rico Sennrich ; Holger Schwenk ; Walid Aransa
Abstract: While domain adaptation techniques for SMT have proven to be effective at improving translation quality, their practicality for a multi-domain environment is often limited because of the computational and human costs of developing and maintaining multiple systems adapted to different domains. We present an architecture that delays the computation of translation model features until decoding, allowing for the application of mixture-modeling techniques at decoding time. We also describe a method for unsupervised adaptation with development and test data from multiple domains. Experimental results on two language pairs demonstrate the effectiveness of both our translation model architecture and automatic clustering, with gains of up to 1 BLEU over unadapted systems and single-domain adaptation.
same-paper 6 0.94042915 206 acl-2013-Joint Event Extraction via Structured Prediction with Global Features
7 0.93933469 302 acl-2013-Robust Automated Natural Language Processing with Multiword Expressions and Collocations
8 0.93512774 40 acl-2013-Advancements in Reordering Models for Statistical Machine Translation
9 0.76820177 166 acl-2013-Generalized Reordering Rules for Improved SMT
10 0.75349545 77 acl-2013-Can Markov Models Over Minimal Translation Units Help Phrase-Based SMT?
11 0.7451883 281 acl-2013-Post-Retrieval Clustering Using Third-Order Similarity Measures
12 0.73703641 56 acl-2013-Argument Inference from Relevant Event Mentions in Chinese Argument Extraction
13 0.72829217 38 acl-2013-Additive Neural Networks for Statistical Machine Translation
14 0.71371078 199 acl-2013-Integrating Multiple Dependency Corpora for Inducing Wide-coverage Japanese CCG Resources
15 0.70265347 127 acl-2013-Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation
16 0.6988312 69 acl-2013-Bilingual Lexical Cohesion Trigger Model for Document-Level Machine Translation
17 0.68270719 68 acl-2013-Bilingual Data Cleaning for SMT using Graph-based Random Walk
18 0.67755485 363 acl-2013-Two-Neighbor Orientation Model with Cross-Boundary Global Contexts
19 0.67637479 181 acl-2013-Hierarchical Phrase Table Combination for Machine Translation
20 0.67063022 101 acl-2013-Cut the noise: Mutually reinforcing reordering and alignments for improved machine translation