8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution


Source: pdf

Author: Karthik Raghunathan ; Heeyoung Lee ; Sudarshan Rangarajan ; Nate Chambers ; Mihai Surdeanu ; Dan Jurafsky ; Christopher Manning

Abstract: Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier’s entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. [sent-2, score-0.972]

2 To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. [sent-4, score-1.234]

3 Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. [sent-8, score-0.504]

4 This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. [sent-9, score-0.408]

5 The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. [sent-10, score-0.379]

6 1 Introduction Recent work on coreference resolution has shown that a rich feature space that models lexical, syntactic, semantic, and discourse phenomena is crucial to successfully address the task (Bengston and Roth, 2008; Haghighi and Klein, 2009; Haghighi and Klein, 2010). [sent-13, score-0.526]

7 By and large most approaches decide if two mentions are coreferent using a single function over all these features and information local to the two mentions. [sent-15, score-0.487]

8 This initial clustering step will assign the correct animacy attribute (inanimate) to the corresponding geo-political entity, which will prevent the incorrect merging with the mention we (animate) in later steps. [sent-25, score-0.462]

9 The approach applies tiers of coreference models one at a time from highest to lowest precision. [sent-30, score-0.408]

10 Furthermore, each model’s decisions are richly informed by sharing attributes across the mentions clustered in earlier tiers. [sent-32, score-0.56]

11 All our components are unsupervised, in the sense that they do not require training on gold coreference links. [sent-35, score-0.412]

12 Our approach outperforms most other unsupervised coreference models and several supervised ones on several datasets. [sent-39, score-0.416]

13 We believe that our approach also serves as an ideal platform for the development of future coreference systems. [sent-41, score-0.379]

14 Related Work This work builds upon the recent observation that strong features outweigh complex models for coreference resolution, in both supervised and unsupervised learning setups (Bengston and Roth, 2008; Haghighi and Klein, 2009). [sent-42, score-0.449]

15 Most coreference resolution approaches perform the task by aggregating local decisions about pairs of mentions (Bengston and Roth, 2008; Finkel and Manning, 2008; Haghighi and Klein, 2009; Stoyanov, 2010). [sent-44, score-0.921]

16 They perform coreference resolution jointly for all mentions in a document, using first-order probabilistic models in either supervised or unsupervised settings. [sent-47, score-0.958]

17 To the best of our knowledge, we are the first to apply this theory to coreference resolution. [sent-55, score-0.379]

18 3 Description of the Task Intra-document coreference resolution clusters together textual mentions within a single document based on the underlying referent entity. [sent-56, score-0.987]

19 To facilitate comparison with most of the recent previous work, we report results using gold mention boundaries. [sent-58, score-0.305]

20 , see Haghighi and Klein (2010) for a simple mention detection model). [sent-61, score-0.306]

21 The syntactic information is used to identify the mention head words and to define the ordering of mentions in a given sentence (detailed in the next section). [sent-76, score-0.87]

22 For a fair comparison with previous work, we do not use gold named entity labels or mention types but, instead, take the labels provided by the Stanford named entity recognizer (NER) (Finkel et al. [sent-77, score-0.513]

23 2 Evaluation Metrics We use three evaluation metrics widely used in the literature: (a) pairwise F1 (Ghosh, 2003) computed over mention pairs in the same entity cluster; (b) MUC (Vilain et al. [sent-80, score-0.4]

24 4 Description of the Multi-Pass Sieve Our sieve framework is implemented as a succession of independent coreference models. [sent-84, score-0.702]

25 1 Mention Processing Given a mention mi, each model may either decline to propose a solution (in the hope that one of the subsequent models will solve it) or deterministically select a single best antecedent from a list of previous mentions m1, . [sent-87, score-0.878]
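
The control flow described here (each model either declines or picks one antecedent, and later models build on earlier clusters) can be sketched in Python. This is a minimal illustration, not the authors' released code: `run_sieve`, `exact_match`, and the pass signature are hypothetical names, and candidate ordering is simplified to plain textual order, nearest first, rather than the syntactic ordering the paper uses.

```python
from typing import Callable, List, Optional

# Hypothetical pass signature: (mention index, candidate indices, mention texts)
# -> chosen antecedent index, or None to decline.
Pass = Callable[[int, List[int], List[str]], Optional[int]]


def run_sieve(mentions: List[str], passes: List[Pass]) -> List[int]:
    """Return one cluster id per mention after applying the passes in order."""
    cluster = list(range(len(mentions)))  # every mention starts as a singleton

    for sieve_pass in passes:  # assumed ordered from highest to lowest precision
        for i in range(len(mentions)):
            candidates = list(range(i - 1, -1, -1))  # simplification: nearest first
            antecedent = sieve_pass(i, candidates, mentions)
            if antecedent is None:
                continue  # decline; a later pass may still resolve this mention
            old, new = cluster[i], cluster[antecedent]
            # Merge the mention's whole cluster into the antecedent's cluster.
            cluster = [new if c == old else c for c in cluster]
    return cluster


def exact_match(i: int, candidates: List[int], mentions: List[str]) -> Optional[int]:
    """Toy high-precision pass: link only identical mention strings."""
    for j in candidates:
        if mentions[j].lower() == mentions[i].lower():
            return j
    return None
```

Calling `run_sieve(["the company", "Pepsi", "the company", "it"], [exact_match])` links only the two identical strings and leaves the other mentions as singletons for later passes.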

26 If the sentence containing the anaphoric mention contains multiple clauses, we repeat the above heuristic separately in each S* constituent, starting with the one containing the mention. [sent-95, score-0.314]

27 Previous Sentence For all nominal mentions we sort candidates in the previous sentences using right-to-left breadth-first traversal. [sent-96, score-0.513]

28 For example, this ordering favors the correct candidate (pepsi) for the mention they: [pepsi] says it expects to double [quaker]’s snack food growth rate. [sent-101, score-0.368]

29 In a significant departure from previous work, each model in our framework gets (possibly incomplete) clustering information for each mention from the earlier coreference models in the multi-pass system. [sent-105, score-0.651]

30 In other words, each mention $m_i$ may already be assigned to a cluster $C_j$ containing a set of mentions: $C_j = \{m^j_1, \ldots, m^j_k\}$; $m_i \in C_j$. [sent-106, score-0.399]

31 Unassigned mentions are unique members of their own cluster. [sent-110, score-0.436]

32 ... affected by missing attributes (which introduce precision errors because incorrect antecedents are selected due to missing information) and incorrect attributes (which introduce recall errors because correct links are not generated due to attribute mismatch between mention and antecedent). [sent-113, score-0.861]

33 To address this issue, we perform a union of all mention attributes (e. [sent-114, score-0.396]

34 If attributes from different mentions contradict each other we maintain all variants. [sent-117, score-0.56]

35 For example, our naive number detection assigns singular to the mention a group of students and plural to five students. [sent-118, score-0.401]

36 When these mentions end up in the same cluster, the resulting number attribute becomes the set {singular, plural}. [sent-119, score-0.56]
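
The attribute union described above, where contradictory values are all kept, could look roughly like this. The sketch assumes each mention carries a plain dictionary of single-valued attributes; `merge_attributes` and the attribute names are illustrative, not taken from the paper's code.

```python
from typing import Dict, List, Set

Attributes = Dict[str, Set[str]]


def merge_attributes(mentions: List[Dict[str, str]]) -> Attributes:
    """Union each attribute over all mentions of a cluster, keeping conflicts."""
    merged: Attributes = {}
    for mention in mentions:
        for name, value in mention.items():
            # Contradictory values are not resolved; every variant is retained.
            merged.setdefault(name, set()).add(value)
    return merged
```

For the example in the text, merging `{"number": "singular"}` (a group of students) with `{"number": "plural"}` (five students) yields a cluster whose `number` attribute is the set containing both values.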

37 Mention selection Traditionally, a coreference model attempts to resolve every mention in the text, which increases the likelihood of errors. [sent-121, score-0.651]

38 Instead, in each of our models, we exploit the cluster information received from the previous stages by resolving only mentions that are currently first in textual order in their cluster. [sent-122, score-0.563]

39 These two are the only mentions that have potential antecedents and are currently marked as the first mentions in their clusters. [sent-124, score-0.92]
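
As a rough sketch of this selection rule, assuming clusters are represented as one cluster id per mention in textual order (the function name `mentions_to_resolve` is hypothetical):

```python
from typing import List


def mentions_to_resolve(cluster_ids: List[int]) -> List[int]:
    """Return indices of mentions that are first in textual order in their
    cluster and have at least one earlier mention as a potential antecedent."""
    seen = set()
    selected = []
    for i, cid in enumerate(cluster_ids):
        if cid not in seen:
            seen.add(cid)
            if i > 0:  # the document's first mention has no antecedent candidates
                selected.append(i)
    return selected
```

Mentions that already joined a cluster in an earlier pass and are not its first member are skipped, so each pass attempts fewer and better-defined mentions.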

40 First, early cluster mentions are usually better defined than subsequent ones, which are likely to have fewer modifiers or are pronouns (Fox, 1993). [sent-126, score-0.725]

41 Second, by definition, first mentions appear closer to the beginning of the document, hence there are fewer antecedent candidates to select from, and fewer opportunities to make a mistake. [sent-128, score-0.606]

42 We disable coreference for first cluster mentions that: (a) are or start with indefinite pronouns (e. [sent-130, score-1.053]

43 One exception to this rule is the model deployed in the first pass; it only links mentions if their entire extents match exactly. [sent-135, score-0.527]

44 This model is triggered for all nominal mentions regardless of discourse salience, because it is possible that indefinite mentions are repeated in a document when concepts are discussed but not instantiated, e. [sent-136, score-1.041]

45 2 The Modules of the Multi-Pass Sieve We now describe the coreference models implemented in the sieve. [sent-145, score-0.379]

46 For clarity, we summarize them in Table 1 and show the cumulative performance as they are added to the sieve in Table 2. [sent-146, score-0.323]

47 1 Pass 1- Exact Match This model links two mentions only if they contain exactly the same extent text, including modifiers and determiners, e. [sent-149, score-0.588]

48 2 Pass 2 - Precise Constructs This model links two mentions if any of the conditions below are satisfied: Appositive – the two nominal mentions are in an appositive construction, e. [sent-155, score-1.072]

49 The Type column indicates the type of coreference in each pronominal. [sent-162, score-0.379]

50 Predicate nominative – the two mentions (nominal or pronominal) are in a copulative subject-object relation, e. [sent-164, score-0.47]

51 Role appositive – the candidate antecedent is headed by a noun and appears as a modifier in an NP whose head is the current mention, e. [sent-167, score-0.403]

52 This feature is inspired by Haghighi and Klein (2009), who triggered it only if the mention is labeled as a person by the NER. [sent-170, score-0.316]

53 We constrain this heuristic more in our work: we allow this feature to match only if: (a) the mention is labeled as a person, (b) the antecedent is animate (we detail animacy detection in Pass 7), and (c) the antecedent’s gender is not neutral. [sent-171, score-0.79]

54 Relative pronoun – the mention is a relative pronoun that modifies the head of the antecedent NP, e. [sent-172, score-0.714]

55 Acronym – both mentions are tagged as NNP and one of them is an acronym of the other, e. [sent-175, score-0.555]

56 We use a simple acronym detection algorithm, which marks a mention as an acronym of another if its text equals the sequence of upper case characters in the other mention. [sent-181, score-0.544]
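
The acronym rule stated above translates almost directly into code. The following is a sketch under two assumptions that are not in the source: the NNP tag check is left to the caller, and an empty string is never treated as an acronym. The function names are illustrative.

```python
def is_acronym(short: str, long: str) -> bool:
    """True if `short` equals the upper-case characters of `long`, in order."""
    initials = "".join(ch for ch in long if ch.isupper())
    # Assumption: guard against two mentions that contain no upper-case letters.
    return bool(short) and short == initials


def acronym_match(a: str, b: str) -> bool:
    """Symmetric check: either mention may be the acronym of the other."""
    return is_acronym(a, b) or is_acronym(b, a)
```

For instance, `acronym_match("International Business Machines", "IBM")` holds, while two lower-case common-noun mentions never match.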

57 Demonym – one of the mentions is a demonym of the other, e. [sent-183, score-0.521]

58 As shown in Table 2 the pairwise precision of the sieve [sent-191, score-0.419]

59 3 Pass 3 - Strict Head Matching Linking a mention to an antecedent based on the naive matching of their head words generates a lot of spurious links because it completely ignores possibly incompatible modifiers (Elsner and Charniak, 2010). [sent-196, score-0.753]

60 To address this issue, this pass implements several features that must all be matched in order to yield a link: Cluster head match – the mention head word matches any head word in the antecedent cluster. [sent-198, score-1.005]

61 Note that this feature is actually more relaxed than naive head matching between mention and antecedent candidate because it is satisfied when the mention’s head matches the head of any entity in the candidate’s cluster. [sent-199, score-0.959]

62 Word inclusion – all the non-stop words in the mention cluster are included in the set of non-stop words in the cluster of the antecedent candidate. [sent-201, score-0.696]

63 This heuristic exploits the property of discourse that it is uncommon to introduce novel information in later mentions (Fox, 1993). [sent-202, score-0.519]

64 Typically, mentions of the same entity become shorter and less informative as the narrative progresses. [sent-203, score-0.504]

65 does look like very dramatic change made by [the Florida court] point to the same entity, but the two mentions in the text below belong to different clusters: The pilot had confirmed . [sent-210, score-0.436]

66 Compatible modifiers only – the mention’s modifiers are all included in the modifiers of the antecedent candidate. [sent-215, score-0.476]

67 This feature models the same discourse property as the previous feature, but it focuses on the two individual mentions to be linked, rather than their entire clusters. [sent-216, score-0.477]

68 Not i-within-i – the two mentions are not in an i-within-i construct, i. [sent-218, score-0.436]
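
The four Pass 3 features, all of which must hold, can be combined as in the sketch below. The `Mention` class, the toy stop-word list, and the `i_within_i` flag (which in practice would come from the parse tree) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set

STOP_WORDS = frozenset({"the", "a", "an", "of", "in", "by"})  # toy list, assumed


@dataclass
class Mention:
    words: List[str]                         # lower-cased tokens of the mention
    head: str                                # head word
    modifiers: FrozenSet[str] = frozenset()  # modifier words


def content_words(cluster: List[Mention]) -> Set[str]:
    """All non-stop words appearing anywhere in a cluster."""
    return {w for m in cluster for w in m.words if w not in STOP_WORDS}


def strict_head_match(mention: Mention, mention_cluster: List[Mention],
                      antecedent: Mention, antecedent_cluster: List[Mention],
                      i_within_i: bool = False) -> bool:
    """Link only if every strict head matching feature is satisfied."""
    cluster_head_match = mention.head in {m.head for m in antecedent_cluster}
    word_inclusion = content_words(mention_cluster) <= content_words(antecedent_cluster)
    compatible_modifiers = mention.modifiers <= antecedent.modifiers
    return (cluster_head_match and word_inclusion
            and compatible_modifiers and not i_within_i)
```

With this sketch, a later mention "the court" links back to "the Florida court", but not the other way round, because the longer mention would introduce a new modifier.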

69 5 Pass 6 - Relaxed Head Matching This pass relaxes the cluster head match heuristic by allowing the mention head to match any word in the cluster of the candidate antecedent. [sent-232, score-1.072]

70 For example, this heuristic matches the mention Sanders to a cluster containing the mentions {Sauls, the judge, Circuit Judge N. [sent-233, score-0.877]

71 6 Pass 7 - Pronouns With one exception (Pass 2), all the previous coreference models focus on nominal coreference resolution. [sent-241, score-0.835]

72 However, it would be incorrect to say that our framework ignores pronominal coreference in the first six passes. [sent-242, score-0.508]

73 In fact, the previous models prepare the stage for pronominal coreference by constructing precise clusters with shared mention attributes. [sent-243, score-0.854]

74 Like previous work, we implement pronominal coreference resolution by enforcing agreement constraints between the coreferent mentions. [sent-245, score-0.656]
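
One way to express agreement constraints over the set-valued cluster attributes is sketched below. It assumes attributes are stored as sets of possible values and that an attribute missing on either side is treated as unknown rather than as a mismatch; both choices, and the name `agree`, are illustrative rather than taken from the paper.

```python
from typing import Dict, Set

Attributes = Dict[str, Set[str]]


def agree(pronoun: Attributes, antecedent_cluster: Attributes) -> bool:
    """Attributes agree when every attribute known on both sides overlaps."""
    shared = pronoun.keys() & antecedent_cluster.keys()
    return all(pronoun[name] & antecedent_cluster[name] for name in shared)
```

Because clusters keep all conflicting values, a cluster whose number is {singular, plural} remains compatible with both "it" and "they".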

75 We exclude from this analysis two notable works that report results only on a version of the task that includes finding mentions (Haghighi and Klein, 2010; Stoyanov, 2010). [sent-267, score-0.436]

76 Our sieve model outperforms all systems on two out of the four evaluation corpora (ACE2004ROTH-DEV and ACE2004-NWIRE), on all metrics. [sent-270, score-0.323]

77 We believe this is particularly useful for large-scale NLP applications that use coreference resolution components, e. [sent-278, score-0.485]

78 These applications can generally function without coreference information so it is beneficial to provide such information only when it is highly precise. [sent-281, score-0.379]

79 1 Comparison to Previous Work The sieve model outperforms all other systems on at least two test sets, even though most of the other models are significantly richer. [sent-283, score-0.323]

80 2 Semantic Head Matching Recent unsupervised coreference work (Haghighi and Klein, 2009) included a semantic component that matched related words (e. [sent-298, score-0.416]

81 They first identified articles relevant to the entity mentions in the test set, and then bootstrapped from known syntactic patterns for apposition and predicate-nominatives in order to learn a database of related head pairs. [sent-372, score-0.634]

82 They show impressive gains by using these learned pairs in coreference decisions. [sent-373, score-0.379]

83 This type of learning using test set mentions is often described as transductive. [sent-374, score-0.436]

84 However, our results show that our sieve algorithm with minimal semantic information still performs as well as the Haghighi and Klein (2009) system with semantics. [sent-377, score-0.323]

85 3 Flexible Architecture The sieve architecture offers benefits beyond improved accuracy. [sent-379, score-0.389]

86 The sieve allows new features to be seamlessly inserted without affecting (or even understanding) the other components. [sent-381, score-0.357]

87 It can be difficult to fully understand how a system makes a single decision, but the sieve allows for flexible usage with minimal effort. [sent-384, score-0.323]

88 For example, the combined (precision plus recall) number of errors for proper or common noun mentions is three times larger than the number of errors made for pronominal mentions. [sent-392, score-0.534]

89 In order to understand the limitations of our current system, we randomly selected 60 recall errors (20 for each mention type) and investigated their causes. [sent-397, score-0.321]

90 For proper nouns, 50% of recall errors are due to mention lengthening, mentions that are longer than their earlier mentions. [sent-399, score-0.757]

91 For example, Washington-based USAir appears after USAir in the text, so our head matching components skip it because their high precision depends on disallowing new modifiers as the discourse proceeds. [sent-400, score-0.338]

92 When the mentions were reversed (as is the usual case), they match. [sent-401, score-0.436]

93 Our sieve-based approach to coreference uniquely allows for such new models to be seamlessly inserted. [sent-415, score-0.413]

94 7 Conclusion We presented a simple deterministic approach to coreference resolution that incorporates document-level information, which is typically exploited only by more complex, joint learning models. [sent-416, score-0.543]

95 Our sieve architecture applies a battery of deterministic coreference models one at a time from highest to lowest precision, where each model builds on the previous model’s cluster output. [sent-417, score-0.986]

96 An additional benefit of the sieve framework is its modularity: new features or models can be inserted in the system with limited understanding of the other features already deployed. [sent-419, score-0.323]

97 Our code is publicly released and can be used both as a stand-alone coreference system and as a platform for the development of future systems. [sent-420, score-0.379]

98 Simple coreference resolution with rich syntactic and semantic features. [sent-507, score-0.485]

99 Gender and animacy knowledge discovery from web-scale n-grams for unsupervised person mention detection. [sent-525, score-0.472]

100 Conundrums in noun phrase coreference resolution: making sense of the state-of-the-art. [sent-579, score-0.379]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('mentions', 0.436), ('coreference', 0.379), ('sieve', 0.323), ('mention', 0.272), ('haghighi', 0.189), ('antecedent', 0.17), ('bengston', 0.136), ('pass', 0.132), ('klein', 0.131), ('head', 0.13), ('cluster', 0.127), ('attributes', 0.124), ('acronym', 0.119), ('animacy', 0.119), ('resolution', 0.106), ('modifiers', 0.102), ('pronominal', 0.091), ('demonym', 0.085), ('nominal', 0.077), ('appositive', 0.073), ('pronoun', 0.071), ('entity', 0.068), ('gender', 0.068), ('clusters', 0.066), ('architecture', 0.066), ('bar', 0.066), ('plural', 0.061), ('pronouns', 0.06), ('pairwise', 0.06), ('israelis', 0.058), ('deterministic', 0.058), ('finkel', 0.058), ('ner', 0.057), ('passes', 0.057), ('broncos', 0.051), ('cautious', 0.051), ('coreferent', 0.051), ('indefinite', 0.051), ('links', 0.05), ('errors', 0.049), ('antecedents', 0.048), ('modular', 0.048), ('static', 0.048), ('precise', 0.046), ('person', 0.044), ('roth', 0.044), ('tier', 0.044), ('animate', 0.044), ('stoyanov', 0.044), ('traversal', 0.044), ('israel', 0.043), ('heuristic', 0.042), ('match', 0.041), ('culotta', 0.041), ('discourse', 0.041), ('bergsma', 0.039), ('fox', 0.039), ('poon', 0.039), ('incorrect', 0.038), ('unsupervised', 0.037), ('sports', 0.037), ('stanford', 0.037), ('precision', 0.036), ('domingos', 0.036), ('salience', 0.036), ('named', 0.036), ('ion', 0.036), ('detection', 0.034), ('ible', 0.034), ('inclus', 0.034), ('ingular', 0.034), ('kertz', 0.034), ('lengthening', 0.034), ('modi', 0.034), ('nominative', 0.034), ('pepsi', 0.034), ('precedence', 0.034), ('quaker', 0.034), ('runway', 0.034), ('sauls', 0.034), ('seamlessly', 0.034), ('usair', 0.034), ('vilain', 0.034), ('favors', 0.034), ('builds', 0.033), ('attribute', 0.033), ('gold', 0.033), ('ordering', 0.032), ('ji', 0.032), ('singular', 0.032), ('np', 0.031), ('candidate', 0.03), ('points', 0.029), ('tiers', 0.029), ('compat', 0.029), ('muc', 0.029), ('nps', 0.029), ('overwhelm', 0.029), ('enforcing', 0.029), ('matching', 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000002 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

2 0.19864999 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora

Author: Yassine Benajiba ; Imed Zitouni

Abstract: The research question treated in this paper is centered on the idea of exploiting rich resources of one language to enhance the performance of a mention detection system of another one. We successfully achieve this goal by projecting information from one language to another via a parallel corpus. We examine the potential improvement using various degrees of linguistic information in a statistical framework and we show that the proposed technique is effective even when the target language model has access to a significantly rich feature set. Experimental results show up to 2.4F improvement in performance when the system has access to information obtained by projecting mentions from a resource-rich language mention detection system via a parallel corpus.

3 0.19755429 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input

Author: Radu Florian ; John Pitrelli ; Salim Roukos ; Imed Zitouni

Abstract: Information-extraction (IE) research typically focuses on clean-text inputs. However, an IE engine serving real applications yields many false alarms due to less-well-formed input. For example, IE in a multilingual broadcast processing system has to deal with inaccurate automatic transcription and translation. The resulting presence of non-target-language text in this case, and non-language material interspersed in data from other applications, raise the research problem of making IE robust to such noisy input text. We address one such IE task: entity-mention detection. We describe augmenting a statistical mention-detection system in order to reduce false alarms from spurious passages. The diverse nature of input noise leads us to pursue a multi-faceted approach to robustness. For our English-language system, at various miss rates we eliminate 97% of false alarms on inputs from other Latin-alphabet languages. In another experiment, representing scenarios in which genre-specific training is infeasible, we process real financial-transactions text containing mixed languages and data-set codes. On these data, because we do not train on data like it, we achieve a smaller but significant improvement. These gains come with virtually no loss in accuracy on clean English text.

4 0.16477427 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

5 0.084042341 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

Author: Christos Christodoulopoulos ; Sharon Goldwater ; Mark Steedman

Abstract: Part-of-speech (POS) induction is one of the most popular tasks in research on unsupervised NLP. Many different methods have been proposed, yet comparisons are difficult to make since there is little consensus on evaluation framework, and many papers evaluate against only one or two competitor systems. Here we evaluate seven different POS induction systems spanning nearly 20 years of work, using a variety of measures. We show that some of the oldest (and simplest) systems stand up surprisingly well against more recent approaches. Since most of these systems were developed and tested using data from the WSJ corpus, we compare their generalization abilities by testing on both WSJ and the multilingual Multext-East corpus. Finally, we introduce the idea of evaluating systems based on their ability to produce cluster prototypes that are useful as input to a prototype-driven learner. In most cases, the prototype-driven learner outperforms the unsupervised system used to initialize it, yielding state-of-the-art results on WSJ and improvements on non-English corpora.

6 0.082827367 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

7 0.081937626 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

8 0.07954859 20 emnlp-2010-Automatic Detection and Classification of Social Events

9 0.06935209 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

10 0.062950373 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

11 0.056784589 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

12 0.05312404 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

13 0.049915474 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing

14 0.049166538 61 emnlp-2010-Improving Gender Classification of Blog Authors

15 0.049035657 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction

16 0.048845153 113 emnlp-2010-Unsupervised Induction of Tree Substitution Grammars for Dependency Parsing

17 0.047405589 104 emnlp-2010-The Necessity of Combining Adaptation Methods

18 0.047152575 84 emnlp-2010-NLP on Spoken Documents Without ASR

19 0.044997595 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications

20 0.044891991 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.184), (1, 0.122), (2, 0.014), (3, 0.204), (4, -0.072), (5, -0.309), (6, -0.009), (7, 0.11), (8, 0.059), (9, -0.285), (10, 0.082), (11, 0.053), (12, -0.017), (13, 0.074), (14, -0.001), (15, -0.066), (16, 0.075), (17, -0.128), (18, 0.135), (19, 0.011), (20, -0.137), (21, 0.064), (22, -0.03), (23, 0.018), (24, 0.025), (25, 0.088), (26, 0.056), (27, 0.015), (28, 0.008), (29, 0.057), (30, -0.036), (31, 0.047), (32, 0.107), (33, -0.002), (34, -0.019), (35, 0.004), (36, 0.111), (37, 0.157), (38, -0.018), (39, -0.015), (40, 0.041), (41, -0.056), (42, 0.007), (43, -0.152), (44, -0.028), (45, 0.02), (46, 0.088), (47, 0.067), (48, 0.016), (49, 0.067)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96839297 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

2 0.66808909 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input

Author: Radu Florian ; John Pitrelli ; Salim Roukos ; Imed Zitouni

Abstract: Information-extraction (IE) research typically focuses on clean-text inputs. However, an IE engine serving real applications yields many false alarms due to less-well-formed input. For example, IE in a multilingual broadcast processing system has to deal with inaccurate automatic transcription and translation. The resulting presence of non-target-language text in this case, and non-language material interspersed in data from other applications, raise the research problem of making IE robust to such noisy input text. We address one such IE task: entity-mention detection. We describe augmenting a statistical mention-detection system in order to reduce false alarms from spurious passages. The diverse nature of input noise leads us to pursue a multi-faceted approach to robustness. For our English-language system, at various miss rates we eliminate 97% of false alarms on inputs from other Latin-alphabet languages. In another experiment, representing scenarios in which genre-specific training is infeasible, we process real financial-transactions text containing mixed languages and data-set codes. On these data, because we do not train on data like it, we achieve a smaller but significant improvement. These gains come with virtually no loss in accuracy on clean English text.

3 0.65426487 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora

Author: Yassine Benajiba ; Imed Zitouni

Abstract: The research question treated in this paper is centered on the idea of exploiting rich resources of one language to enhance the performance of a mention detection system of another one. We successfully achieve this goal by projecting information from one language to another via a parallel corpus. We examine the potential improvement using various degrees of linguistic information in a statistical framework and we show that the proposed technique is effective even when the target language model has access to a significantly rich feature set. Experimental results show up to 2.4F improvement in performance when the system has access to information obtained by projecting mentions from a resource-rich language mention detection system via a parallel corpus.

4 0.52908254 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

5 0.36187074 14 emnlp-2010-A Tree Kernel-Based Unified Framework for Chinese Zero Anaphora Resolution

Author: Fang Kong ; Guodong Zhou

Abstract: This paper proposes a unified framework for zero anaphora resolution, which can be divided into three sub-tasks: zero anaphor detection, anaphoricity determination and antecedent identification. In particular, all three sub-tasks are addressed using tree kernel-based methods with appropriate syntactic parse tree structures. Experimental results on a Chinese zero anaphora corpus show that the proposed tree kernel-based methods significantly outperform the feature-based ones. This indicates the critical role of the structural information in zero anaphora resolution and the necessity of tree kernel-based methods in modeling such structural information. To the best of our knowledge, this is the first systematic work dealing with all three sub-tasks in Chinese zero anaphora resolution via a unified framework. Moreover, we release a Chinese zero anaphora corpus of 100 documents, which adds a layer of annotation to the manually-parsed sentences in the Chinese Treebank (CTB) 6.0.

6 0.32485518 111 emnlp-2010-Two Decades of Unsupervised POS Induction: How Far Have We Come?

7 0.3221494 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

8 0.29408416 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

9 0.24475171 20 emnlp-2010-Automatic Detection and Classification of Social Events

10 0.24160933 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text

11 0.23971587 61 emnlp-2010-Improving Gender Classification of Blog Authors

12 0.21933311 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

13 0.21623579 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions

14 0.21389909 118 emnlp-2010-Utilizing Extra-Sentential Context for Parsing

15 0.20111759 59 emnlp-2010-Identifying Functional Relations in Web Text

16 0.20030858 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

17 0.1957203 24 emnlp-2010-Automatically Producing Plot Unit Representations for Narrative Text

18 0.19378535 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction

19 0.18607746 84 emnlp-2010-NLP on Spoken Documents Without ASR

20 0.1812291 106 emnlp-2010-Top-Down Nearly-Context-Sensitive Parsing


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.028), (10, 0.016), (12, 0.044), (17, 0.332), (29, 0.07), (30, 0.023), (52, 0.028), (56, 0.053), (62, 0.036), (66, 0.129), (72, 0.059), (76, 0.035), (77, 0.013), (79, 0.015), (87, 0.031), (89, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.77114987 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

Author: Karthik Raghunathan ; Heeyoung Lee ; Sudarshan Rangarajan ; Nate Chambers ; Mihai Surdeanu ; Dan Jurafsky ; Christopher Manning

Abstract: Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier’s entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sieve-based approaches could be applied to other NLP tasks.
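
Illustration (not from the paper): the abstract above describes passes applied from highest to lowest precision, each building on the clusters left by the previous pass, with attributes shared across a cluster. A minimal Python sketch of that idea follows. Everything in it is an assumption made for illustration: the names Mention, Clusters, run_sieve, the three toy passes, the pronoun list and the four-mention example are hypothetical and are not the authors' rules or implementation.

```python
from dataclasses import dataclass
from typing import Optional

PRONOUNS = {"he", "she", "it", "they"}  # toy list, assumption for the sketch


@dataclass
class Mention:
    idx: int                       # position in document order
    text: str
    gender: Optional[str] = None   # None means "unknown"
    number: Optional[str] = None


class Clusters:
    """Tracks which cluster each mention is in; every mention starts alone."""

    def __init__(self, mentions):
        self.of = {m.idx: m.idx for m in mentions}        # mention -> cluster id
        self.members = {m.idx: [m] for m in mentions}     # cluster id -> mentions

    def attrs(self, m, name):
        # Attribute values pooled over every mention in m's cluster
        # (this is the "attribute sharing" step); unknowns are dropped.
        return {getattr(x, name) for x in self.members[self.of[m.idx]]} - {None}

    def merge(self, a, b):
        ca, cb = self.of[a.idx], self.of[b.idx]
        if ca == cb:
            return
        for x in self.members.pop(cb):
            self.of[x.idx] = ca
            self.members[ca].append(x)


def is_pronoun(m):
    return m.text.lower() in PRONOUNS


def compatible(x, y):
    # Unknown (empty) attribute sets are compatible with anything.
    return not x or not y or bool(x & y)


# Toy passes, listed from (assumed) highest to lowest precision.
def exact_match(m, ante, clusters):
    return not is_pronoun(m) and m.text.lower() == ante.text.lower()


def head_match(m, ante, clusters):
    if is_pronoun(m) or is_pronoun(ante):
        return False
    return m.text.lower().split()[-1] == ante.text.lower().split()[-1]


def pronoun_match(m, ante, clusters):
    if not is_pronoun(m):
        return False
    return all(compatible(clusters.attrs(m, a), clusters.attrs(ante, a))
               for a in ("gender", "number"))


def run_sieve(mentions, passes):
    clusters = Clusters(mentions)
    for sieve_pass in passes:                     # one tier at a time
        for i, m in enumerate(mentions):
            for ante in reversed(mentions[:i]):   # nearest antecedent first
                if clusters.of[m.idx] == clusters.of[ante.idx]:
                    continue                      # already linked earlier
                if sieve_pass(m, ante, clusters):
                    clusters.merge(ante, m)
                    break
    return clusters


mentions = [
    Mention(0, "Mary Smith", gender="female", number="singular"),
    Mention(1, "John", gender="male", number="singular"),
    Mention(2, "Smith"),                          # gender unknown on its own
    Mention(3, "he", gender="male", number="singular"),
]
result = run_sieve(mentions, [exact_match, head_match, pronoun_match])
```

In this toy run the head-match pass first links "Smith" to "Mary Smith", so by the time the pronoun pass runs, the pooled cluster gender is female and "he" skips "Smith" and links to "John" instead. Run on its own, the pronoun pass would have accepted "Smith" as the nearest compatible antecedent, which illustrates why the ordering of passes and the sharing of attributes matter.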

2 0.71615869 124 emnlp-2010-Word Sense Induction Disambiguation Using Hierarchical Random Graphs

Author: Ioannis Klapaftis ; Suresh Manandhar

Abstract: Graph-based methods have gained attention in many areas of Natural Language Processing (NLP) including Word Sense Disambiguation (WSD), text summarization, keyword extraction and others. Most of the work in these areas formulates the problem in a graph-based setting and applies unsupervised graph clustering to obtain a set of clusters. Recent studies suggest that graphs often exhibit a hierarchical structure that goes beyond simple flat clustering. This paper presents an unsupervised method for inferring the hierarchical grouping of the senses of a polysemous word. The inferred hierarchical structures are applied to the problem of word sense disambiguation, where we show that our method performs significantly better than traditional graph-based methods and agglomerative clustering, yielding improvements over state-of-the-art WSD systems based on sense induction.

3 0.47728163 6 emnlp-2010-A Latent Variable Model for Geographic Lexical Variation

Author: Jacob Eisenstein ; Brendan O'Connor ; Noah A. Smith ; Eric P. Xing

Abstract: The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author’s geographic location from raw text, outperforming both text regression and supervised topic models.

4 0.45845029 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

Author: Sankaranarayanan Ananthakrishnan ; Rohit Prasad ; David Stallard ; Prem Natarajan

Abstract: Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. Active sample selection aims to reduce the labor, time, and expense incurred in producing such resources, attaining a given performance benchmark with the smallest possible training corpus by choosing informative, nonredundant source sentences from an available candidate pool for manual translation. We present a novel, discriminative sample selection strategy that preferentially selects batches of candidate sentences with constructs that lead to erroneous translations on a held-out development set. The proposed strategy supports a built-in diversity mechanism that reduces redundancy in the selected batches. Simulation experiments on English-to-Pashto and Spanish-to-English translation tasks demonstrate the superiority of the proposed approach to a number of competing techniques, such as random selection, dissimilarity-based selection, as well as a recently proposed semisupervised active learning strategy.

5 0.45841819 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

Author: Xian Qian ; Qi Zhang ; Yaqian Zhou ; Xuanjing Huang ; Lide Wu

Abstract: Many sequence labeling tasks in NLP require solving a cascade of segmentation and tagging subtasks, such as Chinese POS tagging, named entity recognition, and so on. Traditional pipeline approaches usually suffer from error propagation. Joint training/decoding in the cross-product state space could cause too many parameters and high inference complexity. In this paper, we present a novel method which integrates graph structures of two subtasks into one using virtual nodes, and performs joint training and decoding in the factorized state space. Experimental evaluations on CoNLL 2000 shallow parsing data set and Fourth SIGHAN Bakeoff CTB POS tagging data set demonstrate the superiority of our method over cross-product, pipeline and candidate reranking approaches.

6 0.45767781 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

7 0.45729652 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

8 0.45292497 49 emnlp-2010-Extracting Opinion Targets in a Single and Cross-Domain Setting with Conditional Random Fields

9 0.45132118 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

10 0.44941664 120 emnlp-2010-What's with the Attitude? Identifying Sentences with Attitude in Online Discussions

11 0.44834763 119 emnlp-2010-We're Not in Kansas Anymore: Detecting Domain Changes in Streams

12 0.44796303 78 emnlp-2010-Minimum Error Rate Training by Sampling the Translation Lattice

13 0.4478364 45 emnlp-2010-Evaluating Models of Latent Document Semantics in the Presence of OCR Errors

14 0.44772008 92 emnlp-2010-Predicting the Semantic Compositionality of Prefix Verbs

15 0.44730794 104 emnlp-2010-The Necessity of Combining Adaptation Methods

16 0.44723648 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

17 0.44706073 84 emnlp-2010-NLP on Spoken Documents Without ASR

18 0.44695517 114 emnlp-2010-Unsupervised Parse Selection for HPSG

19 0.44694224 20 emnlp-2010-Automatic Detection and Classification of Social Events

20 0.44533166 43 emnlp-2010-Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping