Author: Aria Haghighi ; Dan Klein
Abstract: We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%.
1 Abstract We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. [sent-3, score-0.475]
2 Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. [sent-4, score-0.402]
3 On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%. [sent-5, score-0.404]
4 1 Introduction Template-filling information extraction (IE) systems must merge information across multiple sentences to identify all role fillers of interest. [sent-6, score-0.222]
5 For instance, in the MUC4 terrorism event extraction task, the entity filling the individual perpetrator role often occurs multiple times, variously as proper, nominal, or pronominal mentions. [sent-7, score-0.579]
6 However, most template-filling systems (Freitag and McCallum, 2000; Patwardhan and Riloff, 2007) assign roles to individual textual mentions using only local context as evidence, leaving aggregation for post-processing. [sent-8, score-0.558]
7 While prior work has acknowledged that coreference resolution and discourse analysis are integral to accurate role identification, to our knowledge no model has been proposed which jointly models these phenomena. [sent-9, score-0.403]
8 Our model jointly merges surface mentions into underlying entities (coreference resolution) and assigns roles to those discovered entities. [sent-11, score-0.821]
9 In the generative process proposed here, document entities are generated for each template role, along with a set of non-template entities. [sent-12, score-0.47]
10 These entities then generate mentions in a process sensitive to both lexical and structural properties of the mention. [sent-13, score-0.586]
11 Figure 1: Example of the corporate acquisitions role-filling task. [sent-20, score-0.273]
12 In (a), an example template specifying the entities playing each domain role. [sent-21, score-0.353]
13 In (b), an example document with coreferent mentions sharing the same role label. [sent-22, score-0.628]
14 Note that pronoun mentions provide direct clues to entity roles. [sent-23, score-0.655]
15 can naturally incorporate unannotated data, which further increases accuracy. [sent-24, score-0.13]
16 2 Problem Setting Figure 1(a) shows an example template-filling task from the corporate acquisitions domain (Freitag, 1998). [sent-25, score-0.273]
17 1 We have a template of K roles (PURCHASER, AMOUNT, etc. [sent-26, score-0.345]
18 ) and we must identify which entity (if any) fills each role (CSR Limited, etc. [sent-27, score-0.412]
19 Often such problems are modeled at the mention level, directly labeling individual mentions as in Figure 1(b). [sent-29, score-0.818]
20 Indeed, in this data set, the mention-level perspective is evident in the gold annotations, which ignore pronominal references. [sent-30, score-0.146]
21 However, roles in this domain appear in several locations throughout the document, with pronominal mentions often carrying the critical information for template filling. [sent-31, score-0.845]
22 Therefore, Section 3 presents a model in which entities are explicitly modeled, naturally merging information across all mention types and explicitly representing latent structure very much like the entity-level template structure from Figure 1(a). [sent-32, score-0.853]
23 Also we ignore the status field as it doesn’t apply to entities and its meaning is not consistent. [sent-34, score-0.185]
24 © 2010 Association for Computational Linguistics, Conference Short Papers, pages 291–295. Figure 2: Graphical model depiction of our generative model described in Section 3. [sent-37, score-0.134]
25 3 Model We describe our generative model for a document, which has many similarities to the coreference-only model of Haghighi and Klein (2010), but which integrally models template role-fillers. [sent-39, score-0.34]
26 Mentions: A mention is an observed textual reference to a latent real-world entity. [sent-41, score-0.466]
27 Mentions are associated with nodes in a parse tree and are typically realized as NPs. [sent-42, score-0.113]
28 There are three basic forms of mentions: proper (NAM), nominal (NOM), and pronominal (PRO). [sent-43, score-0.218]
29 Each mention M is represented as a collection of key-value pairs. [sent-44, score-0.383]
30 Mention types are trivially determined from the POS tag of the mention head. [sent-48, score-0.435]
31 Entities: An entity is a specific individual or object in the world. [sent-50, score-0.221]
32 Where a mention has a single word for each property, an entity has a list of signature words. [sent-52, score-0.604]
33 Formally, entities are mappings from properties r ∈ R to lists Lr of “canonical” words which that entity uses for that property. [sent-53, score-0.453]
34 Our model performs role-filling by assuming that each entity is drawn from an underlying role. [sent-55, score-0.37]
35 These roles include the K template roles as well as ‘junk’ roles to represent entities which do not fill a template role (see Section 5. [sent-56, score-1.198]
36 Each role R is represented as a mapping between properties r and pairs of multinomials (θr, fr). [sent-58, score-0.287]
37 θr is a unigram distribution of words for property r that are semantically licensed for the role (e. [sent-59, score-0.298]
38 fr is a “fertility” distribution over the integers that characterizes entity list lengths. [sent-62, score-0.296]
39 Together, these distributions control the lists Lr for entities which instantiate the role. [sent-63, score-0.218]
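The way a role's pair (θr, fr) controls an entity's word list can be illustrated with a minimal Python sketch. It assumes that the list length is drawn from the fertility distribution and that each word is then drawn independently, with replacement, from θr; the source does not spell out these details, and the function name, the dictionary representation, and the toy distributions are all hypothetical.

```python
import random

def sample_entity_list(theta, fertility, rng):
    """Draw one word list L_r for an entity from a role's (theta_r, f_r).

    theta:     dict mapping word -> probability (unigram distribution theta_r)
    fertility: dict mapping list length -> probability (distribution f_r)
    rng:       a random.Random instance
    """
    # Draw the list length from the fertility distribution f_r.
    lengths = list(fertility.keys())
    n = rng.choices(lengths, weights=[fertility[k] for k in lengths])[0]
    # Assumption: words are drawn independently from theta_r.
    words = list(theta.keys())
    weights = [theta[w] for w in words]
    return [rng.choices(words, weights=weights)[0] for _ in range(n)]

# Toy example for a hypothetical nominal-head property of a PURCHASER-like role.
rng = random.Random(0)
theta_nom = {"company": 0.6, "firm": 0.3, "buyer": 0.1}
fertility_nom = {1: 0.5, 2: 0.5}
nominal_heads = sample_entity_list(theta_nom, fertility_nom, rng)
```

Each property of an entity would get its own list in this way, which is what lets the role shape the entity's vocabulary before any mention is generated.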
40 We temporarily assume that all mentions belong to a template role-filling entity; we lift this restriction in Section 5. [sent-65, score-0.598]
41 First, a semantic component generates a sequence of entities E = (E1, . [sent-67, score-0.191]
42 , EK), where each Ei is generated from a corresponding role Ri. [sent-70, score-0.191]
43 , RK) to denote the vector of template role parameters. [sent-74, score-0.397]
44 Note that this work assumes that there is a one-to-one mapping between entities and roles; in particular, at most one entity can fill each role. [sent-75, score-0.399]
45 Once entities have been generated, a discourse component generates which entities will be evoked in each of the n mention positions. [sent-77, score-0.721]
46 We represent these choices using entity indicators denoted by Z = (Z1, . [sent-78, score-0.249]
47 , K indicating the entity number (and thereby the role) underlying the ith mention position. [sent-86, score-0.672]
48 Finally, a mention generation component renders each mention conditioned on the underlying entity and role. [sent-87, score-1.128]
49 The antecedent position is selected according to the distribution $P(j' \mid j) \propto \exp\{-\gamma \, \mathrm{TREEDIST}(j', j)\}$, where $\mathrm{TREEDIST}(j', j)$ represents the tree distance between the parse nodes for $M_j$ and $M_{j'}$. [sent-92, score-0.139]
50 (Footnote 2) There is one exception: the sizes of the proper and nominal head property lists are jointly generated, but their word lists are still independently populated. [sent-93, score-0.353]
51 (Footnote 4) Sentence parse trees are merged into a right-branching document parse tree. [sent-95, score-0.141]
52 Mass is restricted to antecedent mention positions $j'$ which occur earlier in the same sentence or in the previous sentence. [sent-97, score-0.436]
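As a rough illustration of the antecedent distribution, the Python sketch below normalizes exp(−γ · TREEDIST) over a set of candidate positions. It assumes the candidates have already been restricted to the same or previous sentence and that tree distances are precomputed; the function name, the input format, and the value of γ used in the example are hypothetical (the source's value for γ is truncated).

```python
import math

def antecedent_distribution(tree_dist, gamma):
    """Return P(j' | j) for each candidate antecedent position j'.

    tree_dist: dict mapping candidate position j' -> tree distance to position j
               (assumed already restricted to the same or previous sentence)
    gamma:     decay parameter
    """
    if not tree_dist:
        raise ValueError("at least one candidate antecedent is required")
    # Unnormalized scores exp(-gamma * TREEDIST(j', j)).
    scores = {jp: math.exp(-gamma * d) for jp, d in tree_dist.items()}
    total = sum(scores.values())
    # Normalize so the probabilities sum to one.
    return {jp: s / total for jp, s in scores.items()}

# Hypothetical distances from mention position 5 to three earlier positions.
probs = antecedent_distribution({2: 6.0, 3: 4.0, 4: 1.0}, gamma=0.5)
```

Closer parse nodes receive more mass, so with these toy distances position 4 is the most likely antecedent.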
53 3 Mention Generation Once the entity indicator has been drawn, we generate the words associated with the mention conditioned on the underlying entity E and role R. [sent-99, score-1.194]
54 For each mention property r associated with the mention, a word w is drawn utilizing E’s word list Lr as well as the multinomials (fr, θr) from role R. [sent-100, score-0.771]
55 The word w is drawn according to $P(w \mid E, R) = (1 - \alpha_r) \frac{\mathbb{1}[w \in L_r]}{\mathrm{len}(L_r)} + \alpha_r P(w \mid \theta_r)$. For each property r, there is a hyper-parameter αr which interpolates between selecting a word uniformly from the entity list Lr and drawing from the underlying role distribution θr. [sent-101, score-0.637]
56 Intuitively, a small αr indicates that an entity prefers to re-use a small number of words for property r. [sent-102, score-0.292]
57 This is typically the case for proper and nominal heads as well as modifiers. [sent-103, score-0.139]
58 At the other extreme, setting αr to 1 indicates the property isn’t particular to the entity itself, but rather always drawn from the underlying role distribution. [sent-104, score-0.638]
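The interpolation between the entity list and the role distribution can be written as a few lines of Python. This is a minimal sketch of the formula for P(w|E,R) only; the function name and the dictionary representation of θr are assumptions, and an empty entity list is treated as contributing zero mass, a choice the source does not discuss.

```python
def word_probability(w, entity_list, theta, alpha):
    """P(w | E, R) = (1 - alpha_r) * 1[w in L_r] / len(L_r) + alpha_r * P(w | theta_r).

    entity_list: the entity's word list L_r for this property
    theta:       dict mapping word -> probability under the role distribution theta_r
    alpha:       interpolation hyper-parameter alpha_r in [0, 1]
    """
    # Entity term: indicator that w is in L_r, divided by the list length.
    # Assumption: an empty list contributes no mass.
    list_term = (1.0 / len(entity_list)) if (entity_list and w in entity_list) else 0.0
    # Role term: probability of w under theta_r (zero if unseen).
    role_term = theta.get(w, 0.0)
    return (1.0 - alpha) * list_term + alpha * role_term

# With a small alpha, the entity's own words dominate.
p = word_probability("CSR", ["CSR", "Limited"], {"company": 1.0}, alpha=0.2)
```

Setting alpha to 1 makes the entity list irrelevant, which matches the case where a property is always drawn from the role distribution.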
59 4 Learning and Inference Since we will make use of unannotated data (see Section 5), we utilize a variational EM algorithm to learn parameters R and φ. [sent-106, score-0.215]
60 We approximate the true posterior $P(E, Z \mid M, R, \phi)$ using a surrogate variational distribution of the following factored form: $Q(E, Z) = \left( \prod_{i=1}^{K} q_i(E_i) \right) \left( \prod_{j=1}^{n} r_j(Z_j) \right)$ [sent-108, score-0.179]
61 Each $r_j(Z_j)$ is a distribution over the entity indicator for mention $M_j$, which approximates the true posterior of $Z_j$. [sent-109, score-0.986]
62 Similarly, qi (Ei) approximates the posterior over entity Ei which is associated with role Ri. [sent-110, score-0.631]
63 As is standard, we iteratively update each component distribution to minimize KL-divergence, fixing all other distributions: $q_i \leftarrow \arg\min_{q_i} \mathrm{KL}\big(Q(E, Z) \,\|\, P(E, Z \mid M, R, \phi)\big) \propto \exp\{\mathbb{E}_{Q/q_i} \ln P(E, Z \mid M, R, \phi)\}$. (Footnote 5) The sole parameter γ is fixed at 0. [sent-111, score-0.269]
64 Table 1: Results on corporate acquisition tasks with given role mention boundaries. [sent-122, score-0.705]
65 We report mention role accuracy and entity role accuracy (correctly labeling all entity mentions). [sent-123, score-1.25]
66 The update for the variational entity distribution is given by: $\ln q_i(e_i) \propto \mathbb{E}_{Q/q_i} \ln P(E, Z, M \mid R, \phi) \propto \mathbb{E}_{\{r_j\}} \ln \Big[ P(e_i \mid R_i) \prod_{j : Z_j = i} P(M_j \mid e_i, R_i) \Big] = \ln P(e_i \mid R_i) + \sum_{j} r_j(i) \ln P(M_j \mid e_i, R_i)$. It is intractable to enumerate all possible entities $e_i$ (each consisting of several sets of words). [sent-125, score-0.994]
67 We obtain entity samples by sampling mention entity indicators according to rj. [sent-127, score-0.853]
68 For a given sample, we assume that Ei consists of the non-pronominal head words and modifiers of mentions such that Zj has sampled value i. [sent-128, score-0.444]
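The sampling step described above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it collapses an entity to a single flat word list rather than one list per property, and the mention format (a dict with hypothetical keys `type` and `words`) and the function name are assumptions.

```python
import random

def sample_entity_words(mentions, r, role_index, rng):
    """Draw one sample of entity E_i's words from the variational factors r_j.

    mentions:   list of dicts with keys 'type' ('NAM', 'NOM', or 'PRO') and
                'words' (head word and modifiers of the mention)
    r:          list of per-mention distributions; r[j][i] approximates P(Z_j = i)
    role_index: the index i of the entity being sampled
    rng:        a random.Random instance
    """
    words = []
    for mention, probs in zip(mentions, r):
        # Sample the entity indicator Z_j according to r_j.
        z = rng.choices(range(len(probs)), weights=probs)[0]
        # Keep only non-pronominal mentions whose sampled indicator equals i.
        if z == role_index and mention["type"] != "PRO":
            words.extend(mention["words"])
    return words

# Toy document with two candidate entities (indices 0 and 1).
mentions = [
    {"type": "NAM", "words": ["CSR", "Limited"]},
    {"type": "NOM", "words": ["company"]},
    {"type": "PRO", "words": ["it"]},
]
r = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]
sample = sample_entity_words(mentions, r, role_index=0, rng=random.Random(0))
```

Repeating the draw several times would give a set of candidate entities for the entity factor, avoiding enumeration over all possible word sets.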
69 During the E-Step, we perform 5 iterations of updating each variational factor, which results in an approximate posterior distribution. [sent-129, score-0.17]
70 The role parameters Ri are computed from the qi(ei) and rj (z) distributions, and the global role prior φ from the non-pronominal components of rj (z). [sent-131, score-0.78]
71 5 Experiments We present results on the corporate acquisitions task, which consists of 600 annotated documents split into a 300/300 train/test split. [sent-132, score-0.302]
72 documents, proper and (usually) nominal mentions are annotated with roles, while pronouns are not. [sent-135, score-0.502]
73 , 2006), and extract mention properties from parse trees and the Stanford Dependency Extractor (de Marneffe et al. [sent-137, score-0.478]
74 1 Gold Role Boundaries We first consider the simplified task where role mention boundaries are given. [sent-140, score-0.619]
75 We map each labeled token span in training and test data to a parse tree node that shares the same head. [sent-141, score-0.114]
76 In this setting, the role-filling task is a collective classification problem, since we know each mention is filling some role. [sent-142, score-0.414]
77 It uses features as similar as possible to the generative model (and more), including the head word, typed dependencies of the head, various tree features, governing word, and several conjunctions of these features as well as coarser versions of lexicalized features. [sent-144, score-0.225]
78 The primary difficulty in classification is the disambiguation amongst the acquired, seller, and purchaser roles, which have similar internal structure, and differ primarily in their semantic contexts. [sent-147, score-0.152]
79 Our entity-centered model, JOINT in Table 1, has no latent variables at training time in this setting, since each role maps to a unique entity. [sent-148, score-0.247]
80 7 During development, we noted that often the most direct evidence of the role of an entity was associated with pronoun usage (see the first “it” in Figure 1). [sent-151, score-0.481]
81 Training our model with pronominal mentions, whose roles are latent variables at training time, improves accuracy to 5. [sent-152, score-0.334]
82 Full Task We now consider the more difficult setting where role mention boundaries are not provided at test time. [sent-155, score-0.656]
83 In this setting, we automatically extract mentions from a parse tree using a heuristic approach. (Footnote 7) We use the mode of the variational posteriors rj(Zj) to make predictions (see Section 4). [sent-156, score-0.841]
84 Systems must determine which mentions are template role-fillers as well as label them. [sent-165, score-0.598]
85 ROLE ID only evaluates the binary decision of whether a mention is a template role-filler or not. [sent-166, score-0.589]
86 Our BEST system, see Section 5, adds extra unannotated data to our JOINT+PRO system. [sent-168, score-0.1]
87 Our mention extraction procedure yields 95% recall over annotated role mentions and 45% precision. [sent-170, score-1.06]
88 9 Using extracted mentions as input, our task is to label some subset of the mentions with template roles. [sent-171, score-0.99]
89 Since systems can label mentions as non-role bearing, only recall is critical to mention extraction. [sent-172, score-0.775]
90 The baseline then classifies mentions which pass this first phase as before. [sent-174, score-0.425]
91 We add ‘junk’ roles to our model to flexibly model entities that do not correspond to annotated template roles. [sent-175, score-0.554]
92 During training, extracted mentions which are not matched in the labeled data have posteriors which are constrained to be amongst the ‘junk’ roles. [sent-176, score-0.523]
93 We first evaluate role identification (ROLE ID in Table 2), the task of identifying mentions which play some role in the template. [sent-177, score-0.774]
94 On the task of identifying and correctly labeling role mentions, our model outperforms INDEP as well (OVERALL in Table 2). [sent-182, score-0.265]
95 As our model is generative, it is straightforward to utilize totally unannotated data. [sent-183, score-0.131]
96 We added 700 fully unannotated documents from the mergers and acquisitions portion of the Reuters-21578 corpus. [sent-184, score-0.299]
97 (Footnote 9) Following Patwardhan and Riloff (2009), we match extracted mentions to labeled spans if the head of the mention matches the labeled span. [sent-190, score-0.883]
98 6 Conclusion We have presented a joint generative model of coreference resolution and role-filling information extraction. [sent-192, score-0.279]
99 This model makes role decisions at the entity, rather than at the mention level. [sent-193, score-0.605]
100 This approach naturally aggregates information across multiple mentions, incorporates unannotated data, and yields strong performance. [sent-194, score-0.193]
same-paper 1 0.9999994 28 acl-2010-An Entity-Level Approach to Information Extraction
Author: Aria Haghighi ; Dan Klein
Abstract: We present a generative model of template-filling in which coreference resolution and role assignment are jointly determined. Underlying template roles first generate abstract entities, which in turn generate concrete textual mentions. On the standard corporate acquisitions dataset, joint resolution in our entity-level model reduces error over a mention-level discriminative approach by up to 20%.
2 0.24104145 233 acl-2010-The Same-Head Heuristic for Coreference
Author: Micha Elsner ; Eugene Charniak
Abstract: We investigate coreference relationships between NPs with the same head noun. It is relatively common in unsupervised work to assume that such pairs are coreferent, but this is not always true, especially if realistic mention detection is used. We describe the distribution of noncoreferent same-head pairs in news text, and present an unsupervised generative model which learns not to link some same-head NPs using syntactic features, improving precision.
3 0.22882025 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
4 0.21456851 72 acl-2010-Coreference Resolution across Corpora: Languages, Coding Schemes, and Preprocessing Information
Author: Marta Recasens ; Eduard Hovy
Abstract: This paper explores the effect that different corpus configurations have on the performance of a coreference resolution system, as measured by MUC, B3, and CEAF. By varying separately three parameters (language, annotation scheme, and preprocessing information) and applying the same coreference resolution system, the strong bonds between system and corpus are demonstrated. The experiments reveal problems in coreference resolution evaluation relating to task definition, coding schemes, and features. They also expose systematic biases in the coreference evaluation metrics. We show that system comparison is only possible when corpus parameters are in exact agreement.
5 0.14105923 125 acl-2010-Generating Templates of Entity Summaries with an Entity-Aspect Model and Pattern Mining
Author: Peng Li ; Jing Jiang ; Yinglin Wang
Abstract: In this paper, we propose a novel approach to automatic generation of summary templates from given collections of summary articles. This kind of summary templates can be useful in various applications. We first develop an entity-aspect LDA model to simultaneously cluster both sentences and words into aspects. We then apply frequent subtree pattern mining on the dependency parse trees of the clustered and labeled sentences to discover sentence patterns that well represent the aspects. Key features of our method include automatic grouping of semantically related sentence patterns and automatic identification of template slots that need to be filled in. We apply our method on five Wikipedia entity categories and compare our method with two baseline methods. Both quantitative evaluation based on human judgment and qualitative comparison demonstrate the effectiveness and advantages of our method.
7 0.12047432 219 acl-2010-Supervised Noun Phrase Coreference Research: The First Fifteen Years
8 0.11506322 38 acl-2010-Automatic Evaluation of Linguistic Quality in Multi-Document Summarization
9 0.10020723 49 acl-2010-Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates
10 0.10018893 101 acl-2010-Entity-Based Local Coherence Modelling Using Topological Fields
11 0.09062323 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
12 0.088747621 33 acl-2010-Assessing the Role of Discourse References in Entailment Inference
13 0.080123156 73 acl-2010-Coreference Resolution with Reconcile
14 0.080051728 149 acl-2010-Incorporating Extra-Linguistic Information into Reference Resolution in Collaborative Task Dialogue
15 0.07892888 229 acl-2010-The Influence of Discourse on Syntax: A Psycholinguistic Model of Sentence Processing
16 0.075172096 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
17 0.07219816 238 acl-2010-Towards Open-Domain Semantic Role Labeling
18 0.071185432 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
19 0.07093136 4 acl-2010-A Cognitive Cost Model of Annotations Based on Eye-Tracking Data
20 0.0701041 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
