emnlp emnlp2012 emnlp2012-71 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
Reference: text
sentIndex sentText sentNum sentScore
1 We introduce a novel coreference resolution system that models entities and events jointly. [sent-2, score-1.001]
2 Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. [sent-3, score-1.241]
3 As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. [sent-4, score-0.968]
4 Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. [sent-5, score-1.384]
5 In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately. [sent-6, score-1.001]
6 1 Introduction Most coreference resolution systems focus on entities and tacitly assume a correspondence between entities and noun phrases (NPs). [sent-7, score-0.845]
7 Focusing on NPs is a way to restrict the challenging problem of coreference resolution, but misses coreference relations like the one between hanged and his suicide in (1), and between placed and put in (2). [sent-8, score-1.158]
8 Since arguments play a key role in describing an event, knowing that two arguments corefer is useful for finding coreference relations between events, and knowing that two events corefer is useful for finding coreference relations between entities. [sent-21, score-1.54]
9 In (1), the coreference relation between One of the key suspected Mafia bosses arrested yesterday and Lo Presti can be found by knowing that their predicates (i. [sent-22, score-0.767]
10 On the other hand, the coreference relations between the arguments Saints and Bush in (2) help to determine the coreference relation between their predicates placed and put. [sent-25, score-1.175]
11 We annotate a corpus with cross-document coreference relations for nominal and verbal mentions. [sent-27, score-0.82]
12 We focus on both intra and inter-document coreference because this scenario is at the same time more challenging and more relevant to real-world applications such as news aggregation. [sent-28, score-0.549]
13 Our approach is an iterative algorithm that cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. [sent-33, score-1.241]
14 We evaluate our cross-document coreference resolution system on this corpus and show that our joint approach significantly outperforms two strong baselines that resolve entities and events separately. [sent-37, score-0.761]
15 Related Work Entity coreference resolution is a well studied problem with many successful techniques for identifying mention clusters (Ponzetto and Strube, 2006; Haghighi and Klein, 2009; Stoyanov et al. [sent-38, score-1.148]
16 Prior work showed that models that jointly resolve mentions across multiple entities result in better performance than simply resolving mentions in a pairwise fashion (Denis and Baldridge, 2007; Poon and Domingos, 2008; Wick et al. [sent-42, score-0.478]
17 A natural extension is to perform coreference jointly across both entities and events. [sent-45, score-0.634]
18 We confirm that such features are useful but also show that the complementary features for verbal mentions lead to even better performance, especially when event and entity clusters are jointly modeled. [sent-50, score-1.059]
19 Compared to the extensive work on entity coreference, the related problem of event coreference remains relatively under-explored, with minimal work on how entity and event coreference can be considered jointly on an open domain. [sent-51, score-1.883]
20 Early work on event coreference for MUC (Humphreys et al. [sent-52, score-0.766]
21 More recently, there have been approaches that looked at event coreference for wider domains. [sent-54, score-0.766]
22 To our knowledge, the only previous work that considered entity and event coreference resolution jointly is He (2007), but limited to the medical domain and focused on just five semantic categories. [sent-60, score-1.196]
23 3 Architecture Following the intuition introduced in Section 1, our approach iteratively builds clusters of event and entity mentions jointly. [sent-61, score-0.862]
24 , finding out that two verbal mentions have arguments that belong to the same entity cluster), the features of both entity and event mentions are re-generated, which prompts future clustering operations. [sent-64, score-1.251]
25 Our model follows a cautious (or “baby steps”) approach, which we previously showed to be successful for entity coreference resolution (Raghunathan et al. [sent-65, score-0.92]
26 However, unlike our previous work, which used deterministic rules, in this paper we learn a coreference resolution model using linear regression. [sent-68, score-0.805]
27 1 Document Clustering Our approach starts with several steps that reduce the search space for the actual coreference resolution task. [sent-73, score-0.759]
28 For example, without document clustering, our algorithm may decide to cluster two mentions of the verb hit, but knowing that one belongs to a cluster containing earthquake reports and the other to a cluster with reports on criminal activities, this decision can be avoided. [sent-78, score-0.635]
29 We extract nominal and pronominal mentions using the mention identification component in the publicly downloadable Stanford coreference resolution system (Raghunathan et al. [sent-88, score-1.143]
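The way document clustering prunes the search space can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Mention` record, its `doc_cluster` field, and the `candidate_pairs` helper are hypothetical names, and the document cluster ids are assumed to come from an earlier clustering step.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    text: str
    doc_cluster: int  # hypothetical: id of the document cluster this mention's document belongs to


def candidate_pairs(mentions):
    """Yield only mention pairs whose documents fall in the same document cluster."""
    for a, b in combinations(mentions, 2):
        if a.doc_cluster == b.doc_cluster:
            yield a, b


# Toy data: two mentions of "hit" from different topics are never compared.
mentions = [
    Mention("hit", 0),     # from an earthquake report
    Mention("hit", 1),     # from a crime report
    Mention("struck", 0),  # from an earthquake report
]
pairs = list(candidate_pairs(mentions))
```

In this toy run only the two mentions from document cluster 0 remain as a candidate pair, so the two unrelated uses of hit are never scored against each other.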
30 Crucially, we do not make a formal distinction between entity and event mentions. [sent-93, score-0.408]
31 , is the noun earthquake an entity or an event mention? [sent-96, score-0.438]
32 ) and an imperfect classification would negatively affect the following coreference resolution. [sent-97, score-0.519]
33 3 High-precision Entity Resolution Sieves To further reduce the problem’s search space, in step 6 of Algorithm 1 we apply a set of high-precision filters from the Stanford coreference resolution system. [sent-103, score-0.759]
34 This system is a collection of deterministic models (or “sieves”) for entity coreference resolution that incorporate lexical, syntactic, semantic, and discourse information. [sent-104, score-0.966]
35 As clusters are built, information such as mention gender and number is propagated across mentions in the same cluster, which helps subsequent decisions. [sent-106, score-0.617]
36 The Stanford system obtained the highest score at the CoNLL-2011 shared task on English coreference resolution. [sent-107, score-0.519]
37 For this step, we selected all the sieves from the Stanford system with the exception of the pronoun resolution sieve. [sent-108, score-0.468]
38 , High-precision sieves Discourse processing sieve Exact string match sieve Relaxed string match sieve Precise constructs sieve (e. [sent-111, score-0.879]
39 , appositives) Strict head match sieves Proper head noun match sieve Relaxed head matching sieve Table 1: Deterministic sieves in step 6 of Algorithm 1. [sent-113, score-0.846]
40 one sieve clusters together two entity mentions only when they have the same head word. [sent-114, score-0.83]
41 That is, all verbal mentions are still in singleton clusters after this step. [sent-118, score-0.672]
42 Furthermore, none of these sieves use features that facilitate the joint resolution of nominal and verbal mentions (e. [sent-119, score-0.924]
43 4 Iterative Entity/Event Resolution In this stage (steps 7-9 in Algorithm 1), we construct entity and event clusters using a cautious or “baby steps” approach. [sent-124, score-0.666]
44 We use a single linear regressor (Θ) to model cluster merge operations between both verbal and nominal clusters. [sent-125, score-0.606]
45 Once two clusters are merged (step 9) we regenerate all the mention features to reflect the current clusters. [sent-132, score-0.434]
46 This iterative procedure is the core of our joint coreference resolution approach. [sent-134, score-0.807]
47 This algorithm transparently merges both entity and event mentions and, importantly, allows information to flow between clusters of both types as merge operations take place. [sent-135, score-1.086]
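The iterative procedure described above can be sketched roughly as follows. This is an illustrative toy, not the paper's Algorithm 1: the `iterative_merge` function, the `0.5` stopping threshold, the `toy_score` function, and the hand-written `BASE` and `PREDICATE_OF` tables are all assumptions standing in for the trained regressor and its features. The point it shows is that scores are recomputed against the current clusters after every merge, so an event merge can raise the score of a later entity merge.

```python
from itertools import combinations

# Hypothetical stand-ins for learned feature contributions.
BASE = {
    frozenset(("hanged", "suicide")): 0.8,  # e.g. lexical/semantic similarity
    frozenset(("Lo Presti", "boss")): 0.2,  # weak evidence on its own
}
# Hypothetical semantic-role links: nominal mention -> its predicate.
PREDICATE_OF = {"Lo Presti": "hanged", "boss": "suicide"}


def toy_score(a, b, clusters):
    """Score a candidate merge using the current clustering state."""
    cluster_of = {m: c for c in clusters for m in c}
    best = 0.0
    for x in a:
        for y in b:
            s = BASE.get(frozenset((x, y)), 0.0)
            px, py = PREDICATE_OF.get(x), PREDICATE_OF.get(y)
            # Bonus when the two mentions' predicates already share a cluster.
            if px and py and cluster_of[px] == cluster_of[py]:
                s += 0.5
            best = max(best, s)
    return best


def iterative_merge(clusters, score, threshold=0.5):
    """Greedily merge the best-scoring cluster pair until none exceeds the threshold."""
    clusters = [frozenset(c) for c in clusters]
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2),
                   key=lambda p: score(p[0], p[1], clusters))
        if score(a, b, clusters) <= threshold:
            break
        # Scores are recomputed from the new clustering on the next pass,
        # which plays the role of feature regeneration here.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


result = iterative_merge(
    [{"hanged"}, {"suicide"}, {"Lo Presti"}, {"boss"}, {"earthquake"}],
    toy_score,
)
```

In the toy run the two event mentions merge first; only after that does the pair of nominal mentions score above the threshold, which mirrors the information flow between event and entity clusters described in the text.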
48 Because of this merge, in iteration i+1 the nominal mentions Lo Presti and One of the key suspected Mafia bosses have the same semantic role for verbs assigned to the same cluster. [sent-137, score-0.491]
49 5 Pronoun Sieve Our approach concludes with the pronominal coreference resolution sieve from the Stanford system. [sent-141, score-0.989]
50 This sieve is necessary because our current resolution algorithm ignores mention ordering and distance (i. [sent-142, score-0.544]
51 , in step 7 we compare all clusters regardless of where their mentions appear in the text). [sent-144, score-0.454]
52 As previous work has proved, the structure of the text is crucial for pronominal coreference (Hobbs, 1978). [sent-145, score-0.576]
53 The algorithm uses gold coreference labels to train a linear regressor that models the quality of the clusters produced by merge operations. [sent-153, score-0.984]
54 , not present in either one of the clusters to be merged) that are correct: $q = \frac{\mathit{links}_{correct}}{\mathit{links}_{correct} + \mathit{links}_{incorrect}}$ (1) where $\mathit{links}_{(in)correct}$ is the number of newly introduced (in)correct pairwise mention links when two clusters are merged. [sent-156, score-0.675]
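As a worked illustration of Equation (1), the sketch below computes the quality of a candidate merge from gold labels. The function name `merge_quality` and the `gold_id` mapping (mention to gold cluster id) are hypothetical; the only links counted are the new ones created between the two clusters, as the definition states.

```python
from itertools import product


def merge_quality(cluster_a, cluster_b, gold_id):
    """q = correct new links / all new links introduced by merging the two clusters."""
    new_links = list(product(cluster_a, cluster_b))  # links not present in either cluster
    correct = sum(1 for x, y in new_links if gold_id[x] == gold_id[y])
    return correct / len(new_links)


# Toy gold annotation: m2 belongs to a different gold cluster than the rest.
gold_id = {"m1": "E1", "m2": "E2", "m3": "E1", "m4": "E1"}

q_clean = merge_quality(["m1"], ["m3", "m4"], gold_id)        # both new links correct
q_mixed = merge_quality(["m1", "m2"], ["m3", "m4"], gold_id)  # 2 of 4 new links correct
```

Values of this kind would serve as regression targets for the merge model, so a merge that introduces only correct links gets 1.0 and one that introduces only wrong links gets 0.0.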
55 2 We skip the pronoun sieve here because it does not affect the decisions taken during the iterative resolution steps. [sent-166, score-0.502]
56 Since these deterministic models address only nominal clusters, at the end we generate training data for events by inspecting all the pairs of singleton verbal clusters. [sent-177, score-0.553]
57 For example, when comparing the event clusters {bought} and {acquired}, [sent-202, score-0.505]
58 In addition to PropBank-style roles, for event mentions we also include the closest left and right entity mentions in order to capture any arguments missed by the SRL system. [sent-206, score-0.875]
59 Indicator feature set to 1 if the two clusters have at least one coreferent argument in a given role. [sent-207, score-0.441]
60 Indicator feature set to 1 if the two clusters have at least one coreferent predicate for a given role. [sent-212, score-0.452]
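The coreferent-argument indicator can be sketched as below. This is a minimal illustration, assuming a simple data layout that is not taken from the paper: `args` maps each event mention to its role-labelled argument mentions, and `entity_cluster_of` maps each argument mention to its current entity cluster id. The mention ids and role assignments in the toy data are made up for the example.

```python
def coreferent_argument(cluster_a, cluster_b, role, args, entity_cluster_of):
    """Return 1 if some mention pair across the clusters has coreferent arguments in `role`."""
    for x in cluster_a:
        for y in cluster_b:
            ax = args.get(x, {}).get(role)
            ay = args.get(y, {}).get(role)
            if ax is None or ay is None:
                continue  # one of the mentions has no argument in this role
            cx, cy = entity_cluster_of.get(ax), entity_cluster_of.get(ay)
            if cx is not None and cx == cy:
                return 1
    return 0


# Hypothetical role assignments for two singleton event clusters.
args = {
    "placed": {"Arg0": "Saints#1", "Arg1": "Bush#1"},
    "put": {"Arg0": "Saints#2", "Arg1": "Bush#2"},
}
# Current entity clustering: the Saints mentions are merged, the Bush mentions are not yet.
entity_cluster_of = {"Saints#1": 0, "Saints#2": 0, "Bush#1": 1, "Bush#2": 2}

f_arg0 = coreferent_argument(["placed"], ["put"], "Arg0", args, entity_cluster_of)
f_arg1 = coreferent_argument(["placed"], ["put"], "Arg1", args, entity_cluster_of)
```

Because the feature reads the current entity clustering, its value changes as entity clusters are merged. The complementary coreferent-predicate feature could be written the same way with the roles of events and entities swapped.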
61 If any of the two clusters contains a verbal mention we consider the merge an operation between event (V) clusters; otherwise it is a merge between entity (E) clusters. [sent-222, score-1.203]
62 1 Corpus The training and test data sets were derived from the EventCorefBank (ECB) corpus created by Bejan and Harabagiu (2010) to study event coreference since standard corpora such as OntoNotes (Pradhan et al. [sent-233, score-0.766]
63 The reason for including comparable documents was to increase the number of cross-document coreference relations. [sent-236, score-0.519]
64 For the purpose of our study, we extended the original corpus in two directions: (i) fully annotated sentences, and (ii) entity coreference relations. [sent-238, score-0.711]
65 In addition, we removed relations other than coreference (e. [sent-239, score-0.519]
66 2 Evaluation We use five coreference evaluation metrics widely used in the literature: MUC (Vilain et al. [sent-328, score-0.519]
67 BLANC (Recasens and Hovy, 2011) Metric based on the Rand index (Rand, 1971) that considers both coreference and non-coreference links to address the imbalance between singleton and coreferent mentions. [sent-333, score-0.749]
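For orientation, the plain Rand index that BLANC builds on can be sketched as follows. This is only the underlying index, not BLANC itself, which handles coreference and non-coreference links separately; the function name and the dictionary layout (mention to cluster id) are assumptions for the example.

```python
from itertools import combinations


def rand_index(gold, pred):
    """Fraction of mention pairs on which two clusterings agree (same vs. different cluster)."""
    pairs = list(combinations(sorted(gold), 2))
    agree = sum(
        1 for x, y in pairs
        if (gold[x] == gold[y]) == (pred[x] == pred[y])
    )
    return agree / len(pairs)


gold = {"a": 0, "b": 0, "c": 1, "d": 1}
pred = {"a": 0, "b": 0, "c": 0, "d": 1}  # the system wrongly attaches c to the first cluster

score = rand_index(gold, pred)
```

With many singleton mentions, most pairs are non-coreferent in both clusterings, so the plain index is dominated by those easy agreements. That imbalance is the motivation given above for a metric that considers both link types.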
68 Note that the gold corpus separates clusters into entity and event clusters (see Table 3), but our system does not make this distinction at runtime. We report scores for entity clusters and event clusters. [sent-341, score-1.618]
69 , all spurious mentions that our system includes in a cluster with a gold entity mention are considered for the entity score, regardless of their gold type (event or entity). [sent-348, score-0.831]
70 Baseline 1 uses a modified Stanford coreference resolution system after our document clustering and mention identification steps. [sent-351, score-0.937]
71 Because the original Stanford system implements only entity coreference, we extended it with an extra sieve that implements lemma matching for events. [sent-352, score-0.442]
72 , clusters that contain at least one verbal mention) or a verbal and a nominal cluster when at least two lemmas of mention head words are the same between clusters, e. [sent-355, score-1.056]
73 Both these sieves model entity and event contextual information using semantic roles. [sent-359, score-0.595]
74 The first sieve merges two nominal clusters when two mentions in the respective clusters have the same head words and two mentions (possibly with different heads) modify with the same role label two predicates that have the same lemma. [sent-360, score-1.438]
75 The second sieve implements the complementary action for event clusters. [sent-362, score-0.449]
76 That is, it merges two verbal clusters when at least two mentions have the same lemma and at least two mentions have semantic arguments with the same role label and the same lemma. [sent-363, score-1.063]
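One possible reading of the second baseline sieve is sketched below. The interpretation is an assumption: the condition is taken to mean that some mention in each cluster shares a head lemma, and some mention in each cluster has an argument with the same role label and the same lemma. The mention layout (`lemma`, `args`) and the function name are hypothetical, and the toy data is invented.

```python
def should_merge_verbal(cluster_a, cluster_b):
    """Merge test for two verbal clusters: shared head lemma plus a shared (role, argument lemma)."""
    same_lemma = any(
        x["lemma"] == y["lemma"] for x in cluster_a for y in cluster_b
    )
    same_arg = any(
        y["args"].get(role) == arg_lemma
        for x in cluster_a
        for y in cluster_b
        for role, arg_lemma in x["args"].items()
    )
    return same_lemma and same_arg


# Toy mentions: head lemma plus role -> argument lemma.
buy_1 = {"lemma": "buy", "args": {"Arg0": "amd", "Arg1": "ati"}}
buy_2 = {"lemma": "buy", "args": {"Arg1": "ati"}}
buy_3 = {"lemma": "buy", "args": {"Arg1": "house"}}

merge_same_event = should_merge_verbal([buy_1], [buy_2])
merge_other_event = should_merge_verbal([buy_1], [buy_3])
```

Requiring the argument match keeps two uses of the same verb with different participants apart, which is the kind of contextual check lemma matching alone cannot make.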
77 This demonstrates that local syntactico-semantic context is important for coreference resolution even in a cross-document setting and that the current state-of-the-art in SRL can model this context accurately. [sent-369, score-0.759]
78 This demonstrates that a holistic approach to coreference resolution improves the resolution of both entities and events more than models that address aspects of the task separately. [sent-376, score-1.273]
79 For example, the “Coreferent Argument for Arg1” feature is triggered when two event clusters have Arg1 arguments that already belong to the same entity cluster. [sent-390, score-0.741]
80 This allows information from previous entity coreference operations to impact future merges of event clusters. [sent-391, score-1.003]
81 This is the crux of our iterative approach to joint coreference resolution. [sent-392, score-0.567]
82 This work demonstrates that an approach that jointly models entities and events is better for cross-document coreference resolution. [sent-396, score-0.82]
83 For example, document clustering and coreference resolution can be solved jointly, which we expect would improve both tasks. [sent-398, score-0.806]
84 Furthermore, our iterative coreference resolution procedure (Algorithm 1) could be modified to account for mention ordering and distance, which would allow us to include pronominal resolution in our joint model, rather than addressing it with a separate deterministic sieve. [sent-399, score-1.281]
85 8 Conclusion We have presented a holistic model for cross-document coreference resolution that jointly solves references to events and entities by handling both nominal and verbal mentions. [sent-400, score-1.393]
86 Our joint resolution algorithm allows event coreference to help improve entity coreference, and vice versa. [sent-401, score-1.167]
87 In addition, our iterative procedure, based on a linear regressor that models the quality of cluster merges, allows each Error Type (Ratio) Description Example Pronoun resolution (36%) The pronoun is incorrectly resolved by the pronominal sieve of the Stanford deterministic entity system. [sent-402, score-0.952]
88 Semantics beyond role frames The semantics of the coreference relation cannot be captured by role frames or WordNet. [sent-405, score-0.607]
89 Initial high-precision sieves (6%) Phrasal verbs (6%) Linear regression (4%) An error made by the initial high-precision entity resolution sieves is propagated to our model. [sent-426, score-0.869]
90 merging state to benefit from the previous merged entity and event mentions. [sent-453, score-0.488]
91 This approach allows us to start with a set of high-precision coreference relations and gradually add new ones to increase recall. [sent-454, score-0.519]
92 This is noteworthy since each measure has been shown to place primary emphasis in evaluating a different aspect of the coreference resolution task. [sent-456, score-0.759]
93 Our system is tailored for cross-document coreference resolution on a corpus that contains news articles that repeatedly report on a smaller number of topics. [sent-457, score-0.759]
94 Joint determination of anaphoricity and coreference resolution using integer programming. [sent-488, score-0.759]
95 Simple coreference resolution with rich syntactic and semantic features. [sent-496, score-0.759]
96 Stanford’s multi-pass sieve coreference resolution system at the CoNLL-2011 shared task. [sent-521, score-0.932]
97 Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution. [sent-548, score-0.563]
98 CoNLL-2011 shared task: Modeling unrestricted coreference in OntoNotes. [sent-561, score-0.519]
99 Conundrums in noun phrase coreference resolution: Making sense of the state-of-the-art. [sent-582, score-0.519]
100 A unified approach for schema matching, coreference and canonicalization. [sent-604, score-0.519]
wordName wordTfidf (topN-words)
[('coreference', 0.519), ('clusters', 0.258), ('event', 0.247), ('resolution', 0.24), ('mentions', 0.196), ('sieves', 0.187), ('sieve', 0.173), ('verbal', 0.168), ('entity', 0.161), ('events', 0.156), ('coreferent', 0.152), ('nominal', 0.133), ('mention', 0.131), ('cluster', 0.126), ('hanged', 0.12), ('merge', 0.119), ('entities', 0.086), ('merges', 0.076), ('bejan', 0.075), ('arguments', 0.075), ('strike', 0.065), ('obama', 0.064), ('predicates', 0.062), ('amd', 0.06), ('mafia', 0.06), ('regressor', 0.06), ('stanford', 0.06), ('pronominal', 0.057), ('regression', 0.056), ('surdeanu', 0.056), ('srl', 0.052), ('pradhan', 0.051), ('lemma', 0.05), ('singleton', 0.05), ('iterative', 0.048), ('clustering', 0.047), ('raghunathan', 0.047), ('deterministic', 0.046), ('arrested', 0.045), ('bosses', 0.045), ('corefer', 0.045), ('ecb', 0.045), ('humphreys', 0.045), ('presti', 0.045), ('saints', 0.045), ('timmons', 0.045), ('merged', 0.045), ('role', 0.044), ('hit', 0.043), ('head', 0.042), ('predicate', 0.042), ('pronoun', 0.041), ('muc', 0.04), ('ati', 0.04), ('heeyoung', 0.04), ('blanc', 0.039), ('recasens', 0.039), ('verbs', 0.038), ('nps', 0.037), ('mihai', 0.037), ('helped', 0.036), ('suspected', 0.035), ('rand', 0.035), ('tbl', 0.035), ('police', 0.035), ('bush', 0.035), ('bought', 0.035), ('merging', 0.035), ('president', 0.035), ('heads', 0.033), ('bagga', 0.033), ('conll', 0.033), ('holistic', 0.032), ('gender', 0.032), ('argument', 0.031), ('haghighi', 0.031), ('knowing', 0.031), ('annotated', 0.031), ('earthquake', 0.03), ('lemmas', 0.03), ('israeli', 0.03), ('cautiously', 0.03), ('clusterdocuments', 0.03), ('cosmin', 0.03), ('crossdocument', 0.03), ('dead', 0.03), ('hevent', 0.03), ('hospital', 0.03), ('intra', 0.03), ('logan', 0.03), ('roper', 0.03), ('swirl', 0.03), ('yesterday', 0.03), ('jointly', 0.029), ('topics', 0.029), ('flow', 0.029), ('implements', 0.029), ('rahman', 0.029), ('gold', 0.028), ('links', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
2 0.35611707 72 emnlp-2012-Joint Inference for Event Timeline Construction
Author: Quang Do ; Wei Lu ; Dan Roth
Abstract: This paper addresses the task of constructing a timeline of events mentioned in a given text. To accomplish that, we present a novel representation of the temporal structure of a news article based on time intervals. We then present an algorithmic approach that jointly optimizes the temporal structure by coupling local classifiers that predict associations and temporal relations between pairs of temporal entities with global constraints. Moreover, we present ways to leverage knowledge provided by event coreference to further improve the system performance. Overall, our experiments show that the joint inference model significantly outperformed the local classifiers by 9.2% of relative improvement in F1. The experiments also suggest that good event coreference could make remarkable contribution to a robust event timeline construction system.
3 0.33188197 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge
Author: Lev Ratinov ; Dan Roth
Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on non-transcript portion of the ACE 2004 dataset.
4 0.32930198 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic
Author: Yang Song ; Jing Jiang ; Wayne Xin Zhao ; Sujian Li ; Houfeng Wang
Abstract: Pairwise coreference resolution models must merge pairwise coreference decisions to generate final outputs. Traditional merging methods adopt different strategies such as the best-first method and enforcing the transitivity constraint, but most of these methods are used independently of the pairwise learning methods as an isolated inference procedure at the end. We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic. Experimental results show that our joint learning system outperforms independent learning systems. Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. Compared with the best system from CoNLL-2011, which employs a rule-based method, our system shows competitive performance.
5 0.21893592 112 emnlp-2012-Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge
Author: Altaf Rahman ; Vincent Ng
Abstract: We examine the task of resolving complex cases of definite pronouns, specifically those for which traditional linguistic constraints on coreference (e.g., Binding Constraints, gender and number agreement) as well as commonly-used resolution heuristics (e.g., string-matching facilities, syntactic salience) are not useful. Being able to solve this task has broader implications in artificial intelligence: a restricted version of it, sometimes referred to as the Winograd Schema Challenge, has been suggested as a conceptually and practically appealing alternative to the Turing Test. We employ a knowledge-rich approach to this task, which yields a pronoun resolver that outperforms state-of-the-art resolvers by nearly 18 points in accuracy on our dataset.
6 0.20800823 36 emnlp-2012-Domain Adaptation for Coreference Resolution: An Adaptive Ensemble Approach
7 0.16463064 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling
8 0.15275745 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
9 0.13553201 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
10 0.13157649 19 emnlp-2012-An Entity-Topic Model for Entity Linking
11 0.13085893 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction
12 0.11839724 113 emnlp-2012-Resolving This-issue Anaphora
13 0.10859459 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities
14 0.098211333 38 emnlp-2012-Employing Compositional Semantics and Discourse Consistency in Chinese Event Extraction
15 0.076125823 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
16 0.075969741 84 emnlp-2012-Linking Named Entities to Any Database
17 0.075008824 41 emnlp-2012-Entity based QA Retrieval
18 0.068816319 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
19 0.068208851 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
20 0.065529615 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings
topicId topicWeight
[(0, 0.274), (1, 0.393), (2, -0.086), (3, -0.428), (4, 0.031), (5, -0.062), (6, -0.159), (7, -0.273), (8, 0.118), (9, 0.001), (10, -0.036), (11, -0.022), (12, 0.072), (13, 0.053), (14, -0.009), (15, -0.062), (16, 0.045), (17, -0.052), (18, 0.015), (19, -0.052), (20, 0.051), (21, -0.024), (22, -0.039), (23, 0.085), (24, -0.028), (25, 0.006), (26, -0.028), (27, -0.017), (28, 0.057), (29, 0.075), (30, -0.088), (31, 0.024), (32, -0.01), (33, -0.041), (34, -0.009), (35, -0.033), (36, 0.069), (37, -0.037), (38, -0.012), (39, 0.002), (40, -0.03), (41, -0.091), (42, 0.027), (43, -0.02), (44, 0.001), (45, -0.001), (46, -0.048), (47, 0.005), (48, -0.014), (49, 0.037)]
simIndex simValue paperId paperTitle
same-paper 1 0.97961611 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
2 0.91210067 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic
Author: Yang Song ; Jing Jiang ; Wayne Xin Zhao ; Sujian Li ; Houfeng Wang
Abstract: Pairwise coreference resolution models must merge pairwise coreference decisions to generate final outputs. Traditional merging methods adopt different strategies such as the best-first method and enforcing the transitivity constraint, but most of these methods are used independently of the pairwise learning methods as an isolated inference procedure at the end. We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic. Experimental results show that our joint learning system outperforms independent learning systems. Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. Compared with the best system from CoNLL-2011, which employs a rule-based method, our system shows competitive performance.
3 0.74400431 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge
Author: Lev Ratinov ; Dan Roth
Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on non-transcript portion of the ACE 2004 dataset.
4 0.70955193 72 emnlp-2012-Joint Inference for Event Timeline Construction
Author: Quang Do ; Wei Lu ; Dan Roth
Abstract: This paper addresses the task of constructing a timeline of events mentioned in a given text. To accomplish that, we present a novel representation of the temporal structure of a news article based on time intervals. We then present an algorithmic approach that jointly optimizes the temporal structure by coupling local classifiers that predict associations and temporal relations between pairs of temporal entities with global constraints. Moreover, we present ways to leverage knowledge provided by event coreference to further improve the system performance. Overall, our experiments show that the joint inference model significantly outperformed the local classifiers by 9.2% of relative improvement in F1. The experiments also suggest that good event coreference could make remarkable contribution to a robust event timeline construction system.
5 0.51831734 36 emnlp-2012-Domain Adaptation for Coreference Resolution: An Adaptive Ensemble Approach
Author: Jian Bo Yang ; Qi Mao ; Qiao Liang Xiang ; Ivor Wai-Hung Tsang ; Kian Ming Adam Chai ; Hai Leong Chieu
Abstract: We propose an adaptive ensemble method to adapt coreference resolution across domains. This method has three features: (1) it can optimize for any user-specified objective measure; (2) it can make document-specific prediction rather than rely on a fixed base model or a fixed set of base models; (3) it can automatically adjust the active ensemble members during prediction. With simplification, this method can be used in the traditional within-domain case, while still retaining the above features. To the best of our knowledge, this work is the first to both (i) develop a domain adaptation algorithm for the coreference resolution problem and (ii) have the above features as an ensemble method. Empirically, we show the benefits of (i) on the six domains of the ACE 2005 data set in domain adaptation setting, and of (ii) on both the MUC-6 and the ACE 2005 data sets in within-domain setting.
6 0.49694526 112 emnlp-2012-Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge
7 0.44947901 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling
8 0.43095312 38 emnlp-2012-Employing Compositional Semantics and Discourse Consistency in Chinese Event Extraction
9 0.37245655 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering
10 0.33996445 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction
11 0.33746031 19 emnlp-2012-An Entity-Topic Model for Entity Linking
12 0.33735713 135 emnlp-2012-Using Discourse Information for Paraphrase Extraction
13 0.30245185 113 emnlp-2012-Resolving This-issue Anaphora
14 0.28431278 77 emnlp-2012-Learning Constraints for Consistent Timeline Extraction
15 0.2792564 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings
16 0.26831833 41 emnlp-2012-Entity based QA Retrieval
17 0.254118 84 emnlp-2012-Linking Named Entities to Any Database
18 0.23603402 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities
19 0.23581739 3 emnlp-2012-A Coherence Model Based on Syntactic Patterns
20 0.22912009 9 emnlp-2012-A Sequence Labelling Approach to Quote Attribution
topicId topicWeight
[(2, 0.028), (11, 0.01), (12, 0.081), (16, 0.068), (19, 0.01), (23, 0.045), (25, 0.018), (34, 0.051), (60, 0.11), (63, 0.061), (64, 0.041), (65, 0.08), (70, 0.013), (73, 0.037), (74, 0.031), (76, 0.053), (80, 0.054), (86, 0.091), (95, 0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.89343953 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents
Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky
Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.
2 0.82398176 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation
Author: Liwei Chen ; Yansong Feng ; Lei Zou ; Dongyan Zhao
Abstract: In this paper, we investigate different usages of feature representations in the web person name disambiguation task which has been suffering from the mismatch of vocabulary and lack of clues in web environments. In literature, the latter receives less attention and remains more challenging. We explore the feature space in this task and argue that collecting person specific evidences from a corpus level can provide a more reasonable and robust estimation for evaluating a feature’s importance in a given web page. This can alleviate the lack of clues where discriminative features can be reasonably weighted by taking their corpus level importance into account, not just relying on the current local context. We therefore propose a topic-based model to exploit the person specific global importance and embed it into the person name similarity. The experimental results show that the corpus level topic information provides more stable evidences for discriminative features and our method outperforms the state-of-the-art systems on three WePS datasets.
3 0.79437423 30 emnlp-2012-Constructing Task-Specific Taxonomies for Document Collection Browsing
Author: Hui Yang
Abstract: Taxonomies can serve as browsing tools for document collections. However, given an arbitrary collection, pre-constructed taxonomies could not easily adapt to the specific topic/task present in the collection. This paper explores techniques to quickly derive task-specific taxonomies supporting browsing in arbitrary document collections. The supervised approach directly learns semantic distances from users to propose meaningful task-specific taxonomies. The approach aims to produce globally optimized taxonomy structures by incorporating path consistency control and user-generated task specification into the general learning framework. A comparison to state-of-the-art systems and a user study jointly demonstrate that our techniques are highly effective.
4 0.78175801 79 emnlp-2012-Learning Syntactic Categories Using Paradigmatic Representations of Word Context
Author: Mehmet Ali Yatbaz ; Enis Sert ; Deniz Yuret
Abstract: We investigate paradigmatic representations of word context in the domain of unsupervised syntactic category acquisition. Paradigmatic representations of word context are based on potential substitutes of a word in contrast to syntagmatic representations based on properties of neighboring words. We compare a bigram based baseline model with several paradigmatic models and demonstrate significant gains in accuracy. Our best model based on Euclidean co-occurrence embedding combines the paradigmatic context representation with morphological and orthographic features and achieves 80% many-to-one accuracy on a 45-tag 1M word corpus.
5 0.77704948 36 emnlp-2012-Domain Adaptation for Coreference Resolution: An Adaptive Ensemble Approach
Author: Jian Bo Yang ; Qi Mao ; Qiao Liang Xiang ; Ivor Wai-Hung Tsang ; Kian Ming Adam Chai ; Hai Leong Chieu
Abstract: We propose an adaptive ensemble method to adapt coreference resolution across domains. This method has three features: (1) it can optimize for any user-specified objective measure; (2) it can make document-specific prediction rather than rely on a fixed base model or a fixed set of base models; (3) it can automatically adjust the active ensemble members during prediction. With simplification, this method can be used in the traditional within-domain case, while still retaining the above features. To the best of our knowledge, this work is the first to both (i) develop a domain adaptation algorithm for the coreference resolution problem and (ii) have the above features as an ensemble method. Empirically, we show the benefits of (i) on the six domains of the ACE 2005 data set in domain adaptation setting, and of (ii) on both the MUC-6 and the ACE 2005 data sets in within-domain setting.
6 0.77318656 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers
7 0.77264458 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic
8 0.77080262 82 emnlp-2012-Left-to-Right Tree-to-String Decoding with Prediction
9 0.7666319 23 emnlp-2012-Besting the Quiz Master: Crowdsourcing Incremental Classification Games
10 0.76158726 124 emnlp-2012-Three Dependency-and-Boundary Models for Grammar Induction
11 0.7528218 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge
12 0.75122702 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings
13 0.74745458 102 emnlp-2012-Optimising Incremental Dialogue Decisions Using Information Density for Interactive Systems
14 0.74318886 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction
15 0.74278295 12 emnlp-2012-A Transition-Based System for Joint Part-of-Speech Tagging and Labeled Non-Projective Dependency Parsing
16 0.74054188 110 emnlp-2012-Reading The Web with Learned Syntactic-Semantic Inference Rules
17 0.74035144 64 emnlp-2012-Improved Parsing and POS Tagging Using Inter-Sentence Consistency Constraints
18 0.73787761 72 emnlp-2012-Joint Inference for Event Timeline Construction
19 0.73650295 51 emnlp-2012-Extracting Opinion Expressions with semi-Markov Conditional Random Fields
20 0.73567003 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities