emnlp emnlp2011 emnlp2011-59 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sebastian Riedel ; Andrew McCallum
Abstract: Extracting biomedical events from literature has attracted much recent attention. The best-performing systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches is cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-the-art. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011 Infectious Diseases task.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract: Extracting biomedical events from literature has attracted much recent attention. [sent-3, score-0.35]
2 The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. [sent-7, score-0.543]
3 To increase coverage of such databases, and to keep up with the rate of publishing, we need to automatically extract structured representations from biomedical text—a process often referred to as biomedical text mining. [sent-17, score-0.278]
4 However, in recent years there has also been an increasing interest in the extraction of biomedical events and their causal relations. [sent-19, score-0.382]
5 This gave rise to the BioNLP 2009 and 2011 shared tasks which challenged participants to gather such events from biomedical text (Kim et al. [sent-20, score-0.4]
6 Notably, these events can be complex and recursive: they may have several arguments, and some of the arguments may be events themselves. [sent-23, score-0.536]
7 Current state-of-the-art event extractors follow the same architectural blueprint and divide the extraction process into a pipeline of three stages (Björne et al. [sent-24, score-0.295]
8 First they predict a set of candidate event trigger words (say, tokens 2, 5 and 6 in figure 1), then argument mentions are attached to these triggers (say, token 4 for trigger 2). [sent-27, score-1.244]
9 The final stage decides how arguments are shared between events—compare how one event subsumes all arguments of trigger 6 in figure 1, while two events share the three arguments of trigger 4 in figure 2. [sent-28, score-1.664]
10 This architecture is prone to cascading errors: if we miss a trigger in the first stage, we will never be able to extract the full event. [sent-29, score-0.687]
11 Example sentence from figure 1: "the phosphorylation of TRAF2 inhibits binding to the CD40 cytoplasmic domain". Figure 1: (a) sentence with target event structure to extract; (b) projection to a set of labelled graphs over tokens. [sent-36, score-0.593]
12 We present a family of event extraction models that address the aforementioned problems. [sent-52, score-0.295]
13 Notably, the highest scoring event structure under this model can be found efficiently in O (mn) time where m is the number of trigger candidates, and n the number of argument candidates. [sent-54, score-0.755]
14 This is only slightly slower than the O(m′n) runtime of a pipeline, where m′ is the number of trigger candidates as filtered by the first stage. [sent-55, score-0.458]
15 We achieve these guarantees through a novel algorithm that jointly picks best trigger label and arguments on a per-token basis. [sent-56, score-0.547]
16 The second model enforces additional constraints that ensure consistency between events in hierarchical regulation structures. [sent-58, score-0.5]
17 While inference in this model is more complicated, we show how dual decomposition (Komodakis et al. [sent-59, score-0.242]
18 The second-best results are achieved with Model 3 as is (Riedel and McCallum, 2011), the best results when using Stanford event predictions as input features (Riedel et al. [sent-71, score-0.263]
19 In the following we will first introduce biomedical event extraction and our notation. [sent-76, score-0.434]
20 Example sentence from figure 2: "Grb2 can be coimmunoprecipitated with Sos1 and Sos2". Figure 2: Two binding events with identical trigger. [sent-79, score-0.483]
21 2 Biomedical Event Extraction By bio-molecular event we mean a change of state of one or more bio-molecules. [sent-81, score-0.263]
22 We see a snippet of text from a biomedical abstract, and the three events that can be extracted from it. [sent-84, score-0.35]
23 We will use these to characterize the types of events we ought to extract, as defined by the 2009 BioNLP shared task. [sent-85, score-0.261]
24 The event E1 in the figure refers to a Phosphorylation of the TRAF2 protein. [sent-87, score-0.263]
25 It is an instance of a set of simple events that describe changes to a single gene or gene product. [sent-88, score-0.267]
26 Each of these events has to have exactly one theme, the protein of which a state change is described. [sent-90, score-0.327]
27 Binding events are particular in that they may have more than one theme, as there can be several biomolecules associated in a binding structure. [sent-93, score-0.483]
28 In the top-center of figure 1a) we see the Regulation event E2. [sent-95, score-0.263]
29 Regulations may also have zero or one cause arguments that denote events or proteins which trigger the regulation. [sent-99, score-0.799]
30 In the BioNLP shared task, we are also asked to find a trigger (or clue) token for each event. [sent-100, score-0.482]
31 This token grounds the event in text and allows users to quickly validate extracted events. [sent-101, score-0.296]
32 For example, the trigger for event E2 is “inhibit”, as indicated by a dashed line. [sent-102, score-0.662]
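To make the recursive event structure described above concrete, the following is a minimal sketch; the nesting of E2's Theme and Cause, and any token indices not named in the text, are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of a recursive event structure; the concrete argument fillers
# for E2 and the CD40 token index are assumptions for illustration.
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass
class Protein:
    name: str          # e.g. "TRAF2"
    head_token: int    # syntactic head token index of the protein mention

@dataclass
class Event:
    event_type: str    # e.g. "Phosphorylation", "Binding", "Regulation"
    trigger_token: int # clue token grounding the event in text
    # each argument is a (role, filler) pair; fillers may themselves be events
    arguments: List[Tuple[str, Union["Event", Protein]]] = field(default_factory=list)

# An illustrative reading of the figure 1 sentence (assumed, not from the source):
traf2 = Protein("TRAF2", head_token=4)
cd40 = Protein("CD40", head_token=11)   # assumed index
e1 = Event("Phosphorylation", trigger_token=2, arguments=[("Theme", traf2)])
e3 = Event("Binding", trigger_token=6, arguments=[("Theme", traf2), ("Theme", cd40)])
e2 = Event("Regulation", trigger_token=5, arguments=[("Theme", e3), ("Cause", e1)])
```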
33 Event Projection: To formulate the search for event structures of the form shown in figure 1a) as an optimization problem, it will be convenient to represent them through a set of binary variables. [sent-104, score-0.263]
34 Consider sentence x and a set of candidate trigger tokens, denoted by Trig (x). [sent-108, score-0.399]
35 We label each candidate i with the event type it is a trigger for, or None if it is not a trigger. [sent-109, score-0.696]
36 This decision is represented through a set of binary variables ei,t, one for each possible event type t. [sent-110, score-0.306]
37 The set of possible event types will be denoted as T, the regulation event types as TReg =def {PosReg, NegReg, Reg}, and its complement T \ TReg as T¬Reg. [sent-112, score-0.676]
38 as T¬reg For each candidate trigger iwe consider the arguments of all events that have ias trigger. [sent-113, score-0.724]
39 Each argument a will either be an event itself, or a protein. [sent-114, score-0.329]
40 For events we add a labelled edge between i and the trigger j of a. [sent-115, score-0.765]
41 For proteins we add an edge between i and the syntactic head j of the protein mention. [sent-116, score-0.317]
42 , 2009) and hence shares their main shortcoming: we cannot differentiate between two (or more) binding events with the same trigger but different arguments, or one binding event with several arguments. [sent-124, score-1.417]
43 Consider, for example, the arguments of trigger 6 in figure 1b) that are all subsumed in a single event. [sent-125, score-0.513]
44 By contrast, the arguments of trigger 4 shown in figure 2 are split between two events. [sent-126, score-0.513]
45 We propose to augment the graph representation through edges between pairs of proteins that are themes in the same binding event. [sent-130, score-0.397]
46 However, while they group binding arguments according to ad-hoc rules based on dependency paths from trigger to argument, we simply query the variables bp,q. [sent-141, score-0.828]
47 We denote the set of protein head tokens with Prot(x); the set of possible targets for outgoing edges from a trigger is Cand(x) =def Trig(x) ∪ Prot(x). [sent-143, score-0.664]
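A small sketch of the binary-variable projection just described, assuming plain dictionary-backed variables; the token indices follow the figure 1 example where given and are otherwise illustrative.

```python
# Sketch of the projection to binary variables; dictionary-backed variables and
# the concrete index values are assumptions for illustration.
event_types = ["Phosphorylation", "Binding", "Reg", "PosReg", "NegReg", "None"]
roles = ["Theme", "Cause", "None"]

trig = [2, 5, 6]       # candidate trigger tokens, Trig(x), from the figure 1 example
prot = [4]             # protein head tokens, Prot(x)
cand = trig + prot     # Cand(x) = Trig(x) ∪ Prot(x)

# e[i, t] = 1 labels trigger candidate i with event type t (or None)
e = {(i, t): 0 for i in trig for t in event_types}
# a[i, j, r] = 1 adds a role-r edge from trigger i to argument head j
a = {(i, j, r): 0 for i in trig for j in cand for r in roles}
# b[p, q] = 1 marks proteins p and q as Themes of the same binding event
b = {(p, q): 0 for p in prot for q in prot if p < q}

# activate the example phosphorylation event: trigger at token 2, Theme edge to token 4
e[(2, "Phosphorylation")] = 1
a[(2, 4, "Theme")] = 1
```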
48 We can use the scoring function sm and the set of legal structures Ym(x) to predict the event hm(x) for a given sentence x according to hm(x) =def argmax_{y ∈ Ym(x)} sm(y; x, w). [sent-153, score-0.29]
49 Model 1: Model 1 performs a simple version of joint trigger and argument extraction. [sent-156, score-0.543]
50 It independently scores trigger labels and argument roles: s1(e, a) =def Σ_{i,t} e_{i,t} sT(i, t) + Σ_{i,j,r} a_{i,j,r} sR(i, j, r). [sent-157, score-0.465]
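As a direct reading of the factored score above, a minimal sketch that sums the local trigger and edge scores over the active variables; sT and sR are assumed callables.

```python
# Factored Model 1 score: sum local scores of all active trigger and edge variables.
def s1(e, a, sT, sR):
    trigger_score = sum(sT(i, t) for (i, t), value in e.items() if value == 1)
    edge_score = sum(sR(i, j, r) for (i, j, r), value in a.items() if value == 1)
    return trigger_score + edge_score
```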
51 stems from enforcing consistency between the trigger label of i and its outgoing edges. [sent-161, score-0.694]
52 By consistency we mean that: (a) there is at least one Theme whenever there is an event at i; (b) only regulation events are allowed to have Cause arguments; (c) all arguments of a None trigger must have the None role. [sent-162, score-1.194]
53 It also ensures that when we see an "obvious" argument edge i →r j with high score sR(i, j, r) there is pressure to extract a trigger at i, even if the fact that i is a trigger may not be as obvious. [sent-165, score-0.934]
54 It takes as input a vector of trigger and edge penalties c that are added to the local scores of the sT and sR functions. [sent-171, score-0.594]
55 The bestOut(c) routine exploits the fact that the constraints of Model 1 only act on the label for trigger i and its outgoing edges. [sent-176, score-0.634]
56 In particular, enforcing consistency between e_{i,t} and outgoing edges a_{i,j,r} has no effect on consistency between e_{i′,t} and a_{i′,j,r} for any other trigger i′ ≠ i. [sent-177, score-0.711]
57 Moreover, for a given trigger the constraints only differentiate between three cases: (a) regulation event, (b) non-regulation event and (c) no event. [sent-178, score-0.812]
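The three-case structure makes the per-trigger subproblem easy to sketch. The following is a hedged illustration in the spirit of bestOut(c), not the authors' implementation: for each trigger candidate it compares the best regulation label, the best non-regulation label and the no-event case, enforcing at least one Theme and restricting Cause edges to regulation labels. sT, sR, the type sets and the candidate set are assumed interfaces.

```python
# Per-trigger joint choice of label and outgoing edges under the Model 1
# consistency constraints (a sketch under assumed interfaces).
def best_out_for_trigger(i, cand, sT, sR, reg_types, nonreg_types, c_trig, c_edge):
    """Return (label, outgoing_edges, score) for trigger candidate i."""

    def best_edges(allowed_roles):
        chosen = {}           # j -> (role, penalized score); only positive scores kept
        best_theme = None     # (j, penalized Theme score) with the highest Theme score
        for j in cand:
            scores = {r: sR(i, j, r) + c_edge.get((i, j, r), 0.0) for r in allowed_roles}
            if "Theme" in scores and (best_theme is None or scores["Theme"] > best_theme[1]):
                best_theme = (j, scores["Theme"])
            r_best = max(scores, key=scores.get)
            if scores[r_best] > 0.0:
                chosen[j] = (r_best, scores[r_best])
        # consistency (a): an event must have at least one Theme argument
        if not any(r == "Theme" for r, _ in chosen.values()) and best_theme is not None:
            j, s = best_theme
            chosen[j] = ("Theme", s)
        edges = {j: r for j, (r, _) in chosen.items()}
        total = sum(s for _, s in chosen.values())
        return edges, total

    options = [("None", {}, 0.0)]            # case (c): no event at i, all edges None
    for t in nonreg_types:                   # case (b): only Theme arguments allowed
        edges, s = best_edges(["Theme"])
        options.append((t, edges, sT(i, t) + c_trig.get((i, t), 0.0) + s))
    for t in reg_types:                      # case (a): Theme and Cause arguments allowed
        edges, s = best_edges(["Theme", "Cause"])
        options.append((t, edges, sT(i, t) + c_trig.get((i, t), 0.0) + s))
    return max(options, key=lambda o: o[2])
```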
58 We constrain that every active edge must either end at a protein, or at an active event trigger. [sent-190, score-0.363]
59 This is a requirement on the label of a trigger and the assignment of roles for its incoming edges. [sent-191, score-0.52]
60 Hence, using I to denote the set of assignments with consistent trigger labels and incoming edges, we get Y2 =def Y1 ∩ I and s2(y) =def s1(y). [sent-193, score-0.486]
61 This means that when predicting an outgoing edge from trigger i to trigger l, there is a high-scoring event at l. [sent-198, score-1.23]
62 (2010) and solve this problem in the framework of dual decomposition. Algorithm 1: Sub-procedures for inference in Models 1, 2 and 3. [sent-200, score-0.225]
63 best label and outgoing edges for all triggers under penalties c bestOut (c) : ∀i y0 ← emp? [sent-201, score-0.392]
64 1 Dual decomposition solves a Linear Programming (LP) relaxation of M2 (that allows fractional values for all binary variables) through subgradient descent on a particular dual of M2. [sent-218, score-0.287]
65 It maintains the dual variables λ that will appear as local penalties in the subproblems to be solved. [sent-223, score-0.301]
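A generic dual-decomposition loop of the kind described, assuming two tractable subproblems coupled by agreement constraints on shared variables; solve_out and solve_in stand in for bestOut/bestIn-style oracles and are assumptions, not the paper's interfaces.

```python
# Projected subgradient descent on the dual: each subproblem sees the current
# dual variables as local penalties; agreement yields a certificate of optimality.
def dual_decomposition(solve_out, solve_in, shared_keys, iterations=100, step0=1.0):
    lam = {k: 0.0 for k in shared_keys}        # dual variables / local penalties
    for t in range(1, iterations + 1):
        # the two subproblems receive the penalties with opposite signs
        y_out = solve_out({k: +lam[k] for k in shared_keys})
        y_in = solve_in({k: -lam[k] for k in shared_keys})
        disagreements = [k for k in shared_keys if y_out[k] != y_in[k]]
        if not disagreements:                  # subproblems agree: exact solution found
            return y_out
        step = step0 / t                       # standard diminishing step size
        for k in disagreements:                # subgradient update on the dual
            lam[k] -= step * (y_out[k] - y_in[k])
    return y_out                               # may not have converged; return one solution
```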
66 It chooses, for each trigger candidate, the best label and incoming set of arguments together with the best outgoing edges to proteins. [sent-249, score-0.755]
67 to all incoming edges, but also greedily picks outgoing protein edges (as done within bestIn(·)). [sent-265, score-0.324]
68 Model 3 fixes this by (a) adding binding variables bp,q into the objective, and (b) enforcing that the binding assignment b is consistent with the trigger and argument assignments e and a. [sent-268, score-1.129]
69 We will also enforce that the same pair of entities p, q cannot be arguments in more than one event together. [sent-269, score-0.424]
70 When a ti,p,q is active, we enforce that there is a binding trigger at i with proteins p and q as Theme arguments. [sent-274, score-0.793]
71 The latter means that an active bp,q requires a trigger i to point to p and q. [sent-278, score-0.429]
72 Or in other words, ti,p,q = 1 for exactly one trigger i. [sent-279, score-0.399]
73 We exploit this by performing dual decomposition with a dual objective that has multipliers λ for the coupling constraints and multipliers µ for the constraints which enforce (e, a, t) ∈ T. [sent-286, score-0.589]
74 It groups together two proteins p, q if their score plus the penalty of the best possible trigger i exceeds 0. [sent-296, score-0.474]
75 In this case, or if there is at least one trigger with positive penalty ci,p,q > 0, we activate the set of triggers I(p, q) with maximal score. [sent-297, score-0.483]
76 Note that when several triggers i maximize the score, we assign them all the same fractional value 1/|I(p, q)|. This enforces the constraint that at most one binding event can point to both p and q and also means that we are solving an LP relaxation. [sent-298, score-0.655]
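A hedged sketch of a bestBind(c)-style subroutine following the description above; sB is an assumed pair-scoring function and none of this is the authors' actual code.

```python
# Group protein pairs and distribute fractional trigger activations over ties,
# following the description in the surrounding text (a sketch, not the paper's code).
def best_bind(protein_pairs, triggers, sB, c):
    b, t_frac = {}, {}
    for (p, q) in protein_pairs:
        penalties = {i: c.get((i, p, q), 0.0) for i in triggers}
        best_pen = max(penalties.values()) if penalties else 0.0
        activate_pair = sB(p, q) + best_pen > 0.0       # pair score plus best trigger penalty
        some_positive = any(v > 0.0 for v in penalties.values())
        b[(p, q)] = 1 if activate_pair else 0
        if activate_pair or some_positive:
            maximizers = [i for i, v in penalties.items() if v == best_pen]   # I(p, q)
            for i in maximizers:                        # fractional assignment over ties
                t_frac[(i, p, q)] = 1.0 / len(maximizers)
    return b, t_frac
```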
77 The penalties for bestBind(c) are derived from the dual multipliers µ by setting the binding penalties c^bind_{i,p,q}(µ). [sent-300, score-0.258]
78 By using dual decomposition instead, we can exploit tractable substructure and achieve quadratic (Model 2) and cubic (Model 3) runtime guarantees. [sent-310, score-0.255]
79 (2011b) cast event extraction as a dependency parsing task. [sent-316, score-0.295]
80 Their model assumes that event structures are trees, an assumption that is frequently violated in practice. [sent-317, score-0.263]
81 Finally, all previous joint approaches use heuristics to decide whether binding arguments are part of the same event, while we capture these decisions in the joint model. [sent-318, score-0.45]
82 While tailored towards (biomedical) event extraction, we believe that our models can also be effective in a more general Semantic Role Labeling (SRL) context. [sent-325, score-0.263]
83 Meza-Ruiz and Riedel (2009) showed that inducing pressure on arguments to be attached to at least one predicate is helpful; this is a soft incoming edge constraint. [sent-328, score-0.243]
84 Based on trigger words collected from the training set, a set of candidate trigger tokens Trig (x) is generated for each sentence x. [sent-336, score-0.798]
85 Features: The feature function fT(i, t) extracts a per-trigger feature vector for trigger i and type t ∈ T. [sent-338, score-0.399]
86 (2010c) and contains: labelled and unlabelled n-gram dependency paths; edge and vertex walk features; argument and trigger modifiers and heads; and words in between. [sent-351, score-0.564]
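A minimal sketch of per-trigger and per-edge feature extraction in the spirit of fT(i, t) and the argument features listed above; the sentence and token accessors are assumed helpers, not the authors' feature code.

```python
# Illustrative feature templates; `sentence` is assumed to expose tokens with
# word/lemma/pos fields and a dependency_path(i, j) helper returning edge labels.
def trigger_features(sentence, i, t):
    tok = sentence.tokens[i]
    return {
        f"word={tok.word}|type={t}": 1.0,
        f"lemma={tok.lemma}|type={t}": 1.0,
        f"pos={tok.pos}|type={t}": 1.0,
    }

def argument_features(sentence, i, j, r):
    path = sentence.dependency_path(i, j)                   # assumed helper
    feats = {
        f"deppath={'-'.join(path)}|role={r}": 1.0,          # labelled dependency path
        f"pathlen={len(path)}|role={r}": 1.0,               # unlabelled/abstracted variant
        f"trigword={sentence.tokens[i].word}|argword={sentence.tokens[j].word}|role={r}": 1.0,
    }
    for k in range(min(i, j) + 1, max(i, j)):               # words in between
        feats[f"between={sentence.tokens[k].word}|role={r}"] = 1.0
    return feats
```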
87 Presented is F1 score for all events (TOT), regulation events (REG), binding events (BIND) and simple events (SVT). [sent-367, score-1.266]
88 This is achieved without careful tuning of thresholds that control flow of information between trigger and argument extraction. [sent-370, score-0.465]
89 With Model 3 our focus is extended to binding events, improving F1 for such events by at least 5 F1. [sent-376, score-0.483]
90 This also has a positive effect on regulation events, as regulations of binding events can now be more accurately extracted. [sent-377, score-0.708]
91 Model 2 improves F1 for regulations, while Model 3 again increases F1 for both regulations and binding events. [sent-387, score-0.347]
92 This yields the best binding event results reported so far. [sent-388, score-0.535]
93 Notably, not only are we able to resolve binding ambiguity better. [sent-389, score-0.272]
94 Relative to Genia it provides less data and introduces more types of entities as well as the biological process event type. [sent-421, score-0.263]
95 6 Conclusion We presented three joint models for biomedical event extraction. [sent-445, score-0.434]
96 By explicitly capturing regulation events (Model 2), and binding events (Model 3) we achieve the best results reported so far on several event extraction tasks. [sent-447, score-1.139]
97 We also show how dual decomposition can be used for constraints that go beyond coupling equalities. [sent-449, score-0.299]
98 A comparative study of syntactic parsers for event extraction. [sent-517, score-0.263]
99 Event extraction with complex event classification using rich features. [sent-528, score-0.295]
100 Robust biomedical event extraction with dual decomposition and minimal domain adaptation. [sent-550, score-0.63]
wordName wordTfidf (topN-words)
[('trigger', 0.399), ('bionlp', 0.338), ('miwa', 0.291), ('binding', 0.272), ('event', 0.263), ('events', 0.211), ('regulation', 0.15), ('riedel', 0.144), ('biomedical', 0.139), ('dual', 0.133), ('rne', 0.131), ('penalties', 0.125), ('protein', 0.116), ('def', 0.115), ('arguments', 0.114), ('bestout', 0.102), ('outgoing', 0.099), ('poon', 0.099), ('bj', 0.095), ('mcclosky', 0.089), ('trig', 0.087), ('triggers', 0.084), ('theme', 0.08), ('vanderwende', 0.079), ('proteins', 0.075), ('genia', 0.075), ('regulations', 0.075), ('feats', 0.073), ('edge', 0.07), ('argument', 0.066), ('subgradient', 0.063), ('decomposition', 0.063), ('runtime', 0.059), ('incoming', 0.059), ('prot', 0.058), ('coupling', 0.057), ('biomedicine', 0.057), ('consistency', 0.057), ('iand', 0.056), ('rush', 0.051), ('cand', 0.05), ('reg', 0.05), ('edges', 0.05), ('shared', 0.05), ('enforcing', 0.049), ('enforce', 0.047), ('ichi', 0.047), ('constraints', 0.046), ('inference', 0.046), ('pyysalo', 0.045), ('sampo', 0.045), ('certificates', 0.045), ('diseases', 0.044), ('bestin', 0.044), ('treg', 0.044), ('variables', 0.043), ('sebastian', 0.039), ('infectious', 0.038), ('sr', 0.037), ('enforces', 0.036), ('jun', 0.035), ('label', 0.034), ('mihai', 0.034), ('makoto', 0.034), ('sb', 0.034), ('ei', 0.033), ('token', 0.033), ('joint', 0.032), ('notably', 0.032), ('extraction', 0.032), ('multipliers', 0.032), ('active', 0.03), ('none', 0.029), ('labelled', 0.029), ('bestbind', 0.029), ('compatibilities', 0.029), ('phosphorylation', 0.029), ('rnm', 0.029), ('tadayoshi', 0.029), ('integer', 0.028), ('roles', 0.028), ('projected', 0.028), ('descent', 0.028), ('gene', 0.028), ('assignments', 0.028), ('scoring', 0.027), ('kim', 0.027), ('optimality', 0.027), ('arg', 0.026), ('surdeanu', 0.025), ('databases', 0.025), ('tapio', 0.025), ('fulfill', 0.025), ('decoupled', 0.025), ('dualize', 0.025), ('rune', 0.025), ('bestperforming', 0.025), ('cascading', 0.025), ('cores', 0.025), ('enju', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
Author: Sebastian Riedel ; Andrew McCallum
Abstract: Extracting biomedical events from literature has attracted much recent attention. The bestperforming systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches are cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-theart. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011Infectious Diseases task.
2 0.14747876 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts
Author: Bryan Rink ; Sanda Harabagiu
Abstract: This paper presents a generative model for the automatic discovery of relations between entities in electronic medical records. The model discovers relation instances and their types by determining which context tokens express the relation. Additionally, the valid semantic classes for each type of relation are determined. We show that the model produces clusters of relation trigger words which better correspond with manually annotated relations than several existing clustering techniques. The discovered relations reveal some of the implicit semantic structure present in patient records.
3 0.14723738 92 emnlp-2011-Minimally Supervised Event Causality Identification
Author: Quang Do ; Yee Seng Chan ; Dan Roth
Abstract: This paper develops a minimally supervised approach, based on focused distributional similarity methods and discourse connectives, for identifying causality relations between events in context. While it has been shown that distributional similarity can help identifying causality, we observe that discourse connectives and the particular discourse relation they evoke in context provide additional information towards determining causality between events. We show that combining discourse relation predictions and distributional similarity methods in a global inference procedure provides additional improvements towards determining event causality.
4 0.11863629 45 emnlp-2011-Dual Decomposition with Many Overlapping Components
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Dual decomposition has been recently proposed as a way of combining complementary models, with a boost in predictive power. However, in cases where lightweight decompositions are not readily available (e.g., due to the presence of rich features or logical constraints), the original subgradient algorithm is inefficient. We sidestep that difficulty by adopting an augmented Lagrangian method that accelerates model consensus by regularizing towards the averaged votes. We show how first-order logical constraints can be handled efficiently, even though the corresponding subproblems are no longer combinatorial, and report experiments in dependency parsing, with state-of-the-art results. 1
5 0.11209669 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation
Author: Yin-Wen Chang ; Michael Collins
Abstract: This paper describes an algorithm for exact decoding of phrase-based translation models, based on Lagrangian relaxation. The method recovers exact solutions, with certificates of optimality, on over 99% of test examples. The method is much more efficient than approaches based on linear programming (LP) or integer linear programming (ILP) solvers: these methods are not feasible for anything other than short sentences. We compare our method to MOSES (Koehn et al., 2007), and give precise estimates of the number and magnitude of search errors that MOSES makes.
6 0.10710447 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling
7 0.1055638 128 emnlp-2011-Structured Relation Discovery using Generative Models
8 0.081083998 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
9 0.077194951 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
10 0.074808367 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
11 0.069530688 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
12 0.065723188 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
13 0.06375479 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing
14 0.05171359 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
15 0.049286466 129 emnlp-2011-Structured Sparsity in Structured Prediction
16 0.04819037 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
17 0.047947824 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser
18 0.047720049 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
19 0.047434799 96 emnlp-2011-Multilayer Sequence Labeling
20 0.044337243 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests
topicId topicWeight
[(0, 0.179), (1, -0.054), (2, -0.11), (3, 0.071), (4, -0.007), (5, -0.1), (6, 0.067), (7, -0.197), (8, -0.051), (9, -0.064), (10, -0.038), (11, -0.026), (12, 0.18), (13, -0.176), (14, 0.067), (15, -0.019), (16, -0.113), (17, 0.021), (18, -0.056), (19, 0.011), (20, -0.132), (21, 0.011), (22, 0.06), (23, -0.065), (24, 0.011), (25, -0.16), (26, 0.063), (27, -0.059), (28, 0.141), (29, -0.12), (30, 0.042), (31, 0.062), (32, -0.073), (33, -0.006), (34, -0.096), (35, 0.197), (36, 0.119), (37, 0.046), (38, 0.15), (39, -0.082), (40, -0.084), (41, 0.067), (42, 0.161), (43, -0.056), (44, -0.025), (45, 0.052), (46, 0.01), (47, 0.108), (48, 0.104), (49, 0.069)]
simIndex simValue paperId paperTitle
same-paper 1 0.94377393 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
Author: Sebastian Riedel ; Andrew McCallum
Abstract: Extracting biomedical events from literature has attracted much recent attention. The best-performing systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches is cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-the-art. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011 Infectious Diseases task.
2 0.48748505 45 emnlp-2011-Dual Decomposition with Many Overlapping Components
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Dual decomposition has been recently proposed as a way of combining complementary models, with a boost in predictive power. However, in cases where lightweight decompositions are not readily available (e.g., due to the presence of rich features or logical constraints), the original subgradient algorithm is inefficient. We sidestep that difficulty by adopting an augmented Lagrangian method that accelerates model consensus by regularizing towards the averaged votes. We show how first-order logical constraints can be handled efficiently, even though the corresponding subproblems are no longer combinatorial, and report experiments in dependency parsing, with state-of-the-art results. 1
3 0.46021718 82 emnlp-2011-Learning Local Content Shift Detectors from Document-level Information
Author: Richard Farkas
Abstract: Information-oriented document labeling is a special document multi-labeling task where the target labels refer to specific information instead of the topic of the whole document. These kinds of tasks are usually solved by looking up indicator phrases and analyzing their local context to filter false positive matches. Here, we introduce an approach for machine learning local content shifters which detects irrelevant local contexts using just the original document-level training labels. We handle content shifters in general, instead of learning a particular language phenomenon detector (e.g. negation or hedging), and form a single system for document labeling and content shift detection. Our empirical results achieved 24% error reduction compared to supervised baseline methods on three document labeling tasks.
4 0.44268256 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts
Author: Bryan Rink ; Sanda Harabagiu
Abstract: This paper presents a generative model for the automatic discovery of relations between entities in electronic medical records. The model discovers relation instances and their types by determining which context tokens express the relation. Additionally, the valid semantic classes for each type of relation are determined. We show that the model produces clusters of relation trigger words which better correspond with manually annotated relations than several existing clustering techniques. The discovered relations reveal some of the implicit semantic structure present in patient records.
5 0.44225922 92 emnlp-2011-Minimally Supervised Event Causality Identification
Author: Quang Do ; Yee Seng Chan ; Dan Roth
Abstract: This paper develops a minimally supervised approach, based on focused distributional similarity methods and discourse connectives, for identifying causality relations between events in context. While it has been shown that distributional similarity can help identifying causality, we observe that discourse connectives and the particular discourse relation they evoke in context provide additional information towards determining causality between events. We show that combining discourse relation predictions and distributional similarity methods in a global inference procedure provides additional improvements towards determining event causality.
6 0.41314483 11 emnlp-2011-A Simple Word Trigger Method for Social Tag Suggestion
7 0.40714991 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling
8 0.36422756 51 emnlp-2011-Exact Decoding of Phrase-Based Translation Models through Lagrangian Relaxation
9 0.29729071 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
10 0.27514401 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
11 0.26973426 49 emnlp-2011-Entire Relaxation Path for Maximum Entropy Problems
12 0.25896886 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions
13 0.25687972 128 emnlp-2011-Structured Relation Discovery using Generative Models
14 0.24474865 114 emnlp-2011-Relation Extraction with Relation Topics
15 0.23709765 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing
16 0.23277718 111 emnlp-2011-Reducing Grounded Learning Tasks To Grammatical Inference
17 0.23045605 129 emnlp-2011-Structured Sparsity in Structured Prediction
18 0.22120374 61 emnlp-2011-Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
19 0.20738021 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
20 0.20166998 139 emnlp-2011-Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
topicId topicWeight
[(3, 0.012), (11, 0.011), (15, 0.016), (23, 0.12), (28, 0.025), (36, 0.028), (37, 0.03), (45, 0.046), (53, 0.016), (54, 0.023), (57, 0.019), (62, 0.051), (64, 0.09), (66, 0.026), (67, 0.167), (69, 0.038), (79, 0.04), (82, 0.036), (87, 0.013), (90, 0.017), (96, 0.063), (98, 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.82754636 59 emnlp-2011-Fast and Robust Joint Models for Biomedical Event Extraction
Author: Sebastian Riedel ; Andrew McCallum
Abstract: Extracting biomedical events from literature has attracted much recent attention. The best-performing systems so far have been pipelines of simple subtask-specific local classifiers. A natural drawback of such approaches is cascading errors introduced in early stages of the pipeline. We present three joint models of increasing complexity designed to overcome this problem. The first model performs joint trigger and argument extraction, and lends itself to a simple, efficient and exact inference algorithm. The second model captures correlations between events, while the third model ensures consistency between arguments of the same event. Inference in these models is kept tractable through dual decomposition. The first two models outperform the previous best joint approaches and are very competitive with respect to the current state-of-the-art. The third model yields the best results reported so far on the BioNLP 2009 shared task, the BioNLP 2011 Genia task and the BioNLP 2011 Infectious Diseases task.
2 0.69556695 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
Author: Kevin Gimpel ; Noah A. Smith
Abstract: We present a quasi-synchronous dependency grammar (Smith and Eisner, 2006) for machine translation in which the leaves of the tree are phrases rather than words as in previous work (Gimpel and Smith, 2009). This formulation allows us to combine structural components of phrase-based and syntax-based MT in a single model. We describe a method of extracting phrase dependencies from parallel text using a target-side dependency parser. For decoding, we describe a coarse-to-fine approach based on lattice dependency parsing of phrase lattices. We demonstrate performance improvements for Chinese-English and UrduEnglish translation over a phrase-based baseline. We also investigate the use of unsupervised dependency parsers, reporting encouraging preliminary results.
3 0.68802786 146 emnlp-2011-Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
Author: Shay B. Cohen ; Dipanjan Das ; Noah A. Smith
Abstract: We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts are unavailable. We obtain state-of-the-art performance for two tasks of structure prediction: unsupervised part-of-speech tagging and unsupervised dependency parsing.
4 0.67858791 45 emnlp-2011-Dual Decomposition with Many Overlapping Components
Author: Andre Martins ; Noah Smith ; Mario Figueiredo ; Pedro Aguiar
Abstract: Dual decomposition has been recently proposed as a way of combining complementary models, with a boost in predictive power. However, in cases where lightweight decompositions are not readily available (e.g., due to the presence of rich features or logical constraints), the original subgradient algorithm is inefficient. We sidestep that difficulty by adopting an augmented Lagrangian method that accelerates model consensus by regularizing towards the averaged votes. We show how first-order logical constraints can be handled efficiently, even though the corresponding subproblems are no longer combinatorial, and report experiments in dependency parsing, with state-of-the-art results. 1
5 0.67749226 95 emnlp-2011-Multi-Source Transfer of Delexicalized Dependency Parsers
Author: Ryan McDonald ; Slav Petrov ; Keith Hall
Abstract: We present a simple method for transferring dependency parsers from source languages with labeled training data to target languages without labeled training data. We first demonstrate that delexicalized parsers can be directly transferred between languages, producing significantly higher accuracies than unsupervised parsers. We then use a constraint driven learning algorithm where constraints are drawn from parallel corpora to project the final parser. Unlike previous work on projecting syntactic resources, we show that simple methods for introducing multiple source languages can significantly improve the overall quality of the resulting parsers. The projected parsers from our system result in state-of-the-art performance when compared to previously studied unsupervised and projected parsing systems across eight different languages.
6 0.66709638 136 emnlp-2011-Training a Parser for Machine Translation Reordering
7 0.66425556 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
8 0.6635105 134 emnlp-2011-Third-order Variational Reranking on Packed-Shared Dependency Forests
9 0.6583156 128 emnlp-2011-Structured Relation Discovery using Generative Models
10 0.65629441 20 emnlp-2011-Augmenting String-to-Tree Translation Models with Fuzzy Use of Source-side Syntax
11 0.65428597 137 emnlp-2011-Training dependency parsers by jointly optimizing multiple objectives
12 0.65119731 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
13 0.65040892 79 emnlp-2011-Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction
14 0.64923871 126 emnlp-2011-Structural Opinion Mining for Graph-based Sentiment Representation
15 0.64830375 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
16 0.64566219 58 emnlp-2011-Fast Generation of Translation Forest for Large-Scale SMT Discriminative Training
17 0.64519167 22 emnlp-2011-Better Evaluation Metrics Lead to Better Machine Translation
18 0.64323229 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
19 0.64278555 66 emnlp-2011-Hierarchical Phrase-based Translation Representations
20 0.64060229 122 emnlp-2011-Simple Effective Decipherment via Combinatorial Optimization