acl acl2011 acl2011-85 knowledge-graph by maker-knowledge-mining

85 acl-2011-Coreference Resolution with World Knowledge


Source: pdf

Author: Altaf Rahman ; Vincent Ng

Abstract: While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. Hence, it is not clear whether these benefits can carry over to a stronger baseline. Moreover, since there has been no attempt to apply different sources of world knowledge in combination to coreference resolution, it is not clear whether they offer complementary benefits to a resolver. We systematically compare commonly-used and under-investigated sources of world knowledge for coreference resolution by applying them to two learning-based coreference models and evaluating them on documents annotated with two different annotation schemes.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. [sent-3, score-0.678]

2 Moreover, since there has been no attempt to apply different sources of world knowledge in combination to coreference resolution, it is not clear whether they offer complementary benefits to a resolver. [sent-5, score-0.577]

3 We systematically compare commonly-used and under-investigated sources of world knowledge for coreference resolution by applying them to two learning-based coreference models and evaluating them on documents annotated with two different annotation schemes. [sent-6, score-0.928]

4 1 Introduction. Noun phrase (NP) coreference resolution is the task of determining which NPs in a text or dialogue refer to the same real-world entity. [sent-7, score-0.367]

5 The difficulty of the task stems in part from its reliance on world knowledge (Charniak, 1972). [sent-8, score-0.188]

6 Having the (world) knowledge that Martha Stewart is a celebrity would be helpful for establishing the coreference relation between the two NPs. [sent-14, score-0.408]

7 However, since these heuristics are not perfect, complementing them with world knowledge would be an important step towards bringing coreference systems to the next level of performance. [sent-16, score-0.471]

8 Despite the usefulness of world knowledge for coreference resolution, early learning-based coreference resolvers have relied mostly on morphosyntactic features (e.g., ...). [sent-17, score-0.545]

9 With recent advances in lexical semantics research and the development of large-scale knowledge bases, researchers have begun to employ world knowledge for coreference resolution. [sent-22, score-0.575]

10 World knowledge is extracted primarily from three data sources: web-based encyclopedias (e.g., ...). [sent-23, score-0.083]

11 While each of these three sources of world knowledge has been shown to improve coreference resolution, the improvements were typically obtained by incorporating world knowledge (as features) into a baseline resolver composed of a rather weak coreference model (i.e., the mention-pair model). [sent-31, score-1.047]

12 First, can world knowledge still offer benefits when used in combination with a richer set of features? [sent-38, score-0.231]

13 Second, since automatically extracted world knowledge is typically noisy (Ponzetto and Poesio, 2009), are recently-developed coreference models more noise-tolerant than the mention-pair model, and if so, can they profit more from the noisily extracted world knowledge? [sent-39, score-0.658]

14 Finally, while different world knowledge ... [remainder lost in the page footer: Proceedings of the 49th Annual Meeting, Portland, Oregon, June 19-24, 2011]. [sent-40, score-0.126]

15 We seek answers to these questions by conducting a systematic evaluation of different world knowledge sources for learning-based coreference resolution. [sent-43, score-0.509]

16 Specifically, we (1) derive world knowledge from encyclopedic sources that are under-investigated for coreference resolution, including FrameNet (Baker et al., 1998). [sent-44, score-0.509]

17 Our evaluation corpus contains 410 documents, which are coreference-annotated using the ACE annotation scheme as well as the OntoNotes annotation scheme (Hovy et al., 2006). [sent-48, score-0.122]

18 By evaluating on two sets of coreference annotations for the same set of documents, we can determine whether the usefulness of world knowledge sources for coreference resolution is dependent on the underlying annotation scheme used to annotate the documents. [sent-50, score-0.962]

19 2 Preliminaries. In this section, we describe the corpus, the NP extraction methods, the coreference models, and the evaluation measures we will use in our evaluation. [sent-51, score-0.283]

20 1 Data Set. We evaluate on documents that are coreference-annotated using both the ACE annotation scheme and the OntoNotes annotation scheme, so that we can examine whether the usefulness of our world knowledge sources is dependent on the underlying coreference annotation scheme. [sent-53, score-0.705]

21 ACE and OntoNotes employ different guidelines to annotate coreference chains. [sent-56, score-0.325]

22 A major difference between the two annotation schemes is that ACE only concerns establishing coreference chains among NPs that belong to the ACE entity types, whereas OntoNotes does not have this restriction. [sent-57, score-0.333]

23 Hence, the OntoNotes annotation scheme should produce more coreference chains (i.e., non-singleton coreference clusters) than the ACE annotation scheme for a given set of documents. [sent-58, score-0.365] [sent-60, score-0.344]

25 Another difference between the two annotation schemes is that singleton clusters are annotated in ACE but not OntoNotes. [sent-62, score-0.117]

26 As discussed below, the presence of singleton clusters may have an impact on NP extraction and coreference evaluation. [sent-63, score-0.371]

27 2 NP Extraction. Following common practice, we employ different methods to extract NPs from the documents annotated with the two annotation schemes. [sent-65, score-0.118]

28 In other words, doing so could substantially simplify the coreference task. [sent-72, score-0.283]

29 Consequently, we follow the approach adopted by traditional learning-based resolvers and employ an NP chunker to extract NPs. [sent-73, score-0.101]

30 3 Coreference Models. We evaluate the utility of world knowledge using the mention-pair model and the cluster-ranking model. [sent-79, score-0.188]

31 These features can also be categorized based on whether they are relational or not. [sent-85, score-0.09]

32 Relational features capture the relationship between NPj and NPk, whereas nonrelational features capture the linguistic property of one of these two NPs. [sent-86, score-0.107]

33 We follow Soon et al.’s (2001) method for creating training instances: we create (1) a positive instance for each anaphoric NP, NPk, and its closest antecedent, NPj; and (2) a negative instance for NPk paired with each of the intervening NPs, NPj+1, NPj+2, ..., NPk−1. [sent-89, score-0.161]

34 The classification of a training instance is either positive or negative, depending on whether the two NPs are coreferent in the associated text. [sent-93, score-0.101]
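
To make sentences 33-34 concrete, here is a minimal Python sketch of this instance-creation scheme. The chain_of lookup (NP index to gold chain id, None for singletons) is a hypothetical helper, not something the paper specifies.

```python
from typing import Callable, List, Optional, Tuple

def create_training_instances(
    num_nps: int, chain_of: Callable[[int], Optional[str]]
) -> List[Tuple[int, int, int]]:
    """Soon-et-al.-style instances: one positive per anaphoric NP k and its
    closest antecedent j, one negative per NP between that antecedent and k."""
    instances = []
    for k in range(num_nps):
        if chain_of(k) is None:
            continue  # NP k is a singleton: it yields no instances
        antecedent = next(
            (j for j in range(k - 1, -1, -1) if chain_of(j) == chain_of(k)), None
        )
        if antecedent is None:
            continue  # first mention of its chain: no instance
        instances.append((antecedent, k, 1))  # positive instance
        instances.extend((j, k, 0) for j in range(antecedent + 1, k))  # negatives
    return instances

# Toy document with five NPs; NPs 0, 2 and 4 corefer (chain "A").
chains = {0: "A", 2: "A", 4: "A"}
print(create_training_instances(5, chains.get))
# -> [(0, 2, 1), (1, 2, 0), (2, 4, 1), (3, 4, 0)]
```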

35 Specifically, each NP, NPk, is compared in turn to each preceding NP, NPj, from right to left, and NPj is selected as its antecedent if the pair is classified as coreferent. [sent-96, score-0.143]

36 The process terminates as soon as an antecedent is found for NPk or the beginning of the text is reached. [sent-97, score-0.115]
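
A minimal sketch of this closest-first decoding strategy, assuming a hypothetical pairwise classifier is_coreferent(j, k):

```python
from typing import Callable, Dict

def closest_first_resolve(
    num_nps: int, is_coreferent: Callable[[int, int], bool]
) -> Dict[int, int]:
    """Scan right-to-left from each NP k; link k to the first preceding NP
    the classifier accepts, stopping at the start of the text."""
    antecedent = {}
    for k in range(num_nps):
        for j in range(k - 1, -1, -1):
            if is_coreferent(j, k):
                antecedent[k] = j
                break  # terminate as soon as an antecedent is found
    return antecedent

# Toy classifier that links each NP to the NP two positions to its left.
print(closest_first_resolve(5, lambda j, k: k - j == 2))
# -> {2: 0, 3: 1, 4: 2}
```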

37 First, since each candidate antecedent for an NP to be resolved (henceforth an active NP) is considered independently of the others, this model only determines how good a candidate antecedent is relative to the active NP, but not how good a candidate antecedent is relative to other candidates. [sent-99, score-0.278]

38 So, it fails to answer the critical question of which candidate antecedent is most probable. [sent-100, score-0.098]

39 Second, it has limitations in its expressiveness: the information extracted from the two NPs alone may not be sufficient for making a coreference decision. [sent-101, score-0.304]

40 Specifically, the CR model ranks the preceding clusters for an active NP so that the highest-ranked cluster is the one to which the active NP should be linked. [sent-112, score-0.22]

41 Employing a ranker addresses the first weakness, as a ranker allows all candidates to be compared simultaneously. [sent-113, score-0.124]

42 Considering preceding clusters rather than antecedents as candidates addresses the second weakness, as cluster-level features (i.e., features that are defined over any subset of NPs in a preceding cluster) can be employed. [sent-114, score-0.158] [sent-116, score-0.104]

44 Since the CR model ranks preceding clusters, a training instance i(cj, NPk) represents a preceding cluster, cj, and an anaphoric NP, NPk. [sent-118, score-0.203]

45 Each instance consists of features that are computed based solely on NPk as well as cluster-level features, which describe the relationship between cj and NPk. [sent-119, score-0.157]

46 Following Culotta et al. (2007), we create cluster-level features from the relational features in our feature set using four predicates: NONE, MOST-FALSE, MOST-TRUE, and ALL. [sent-121, score-0.178]

47 Specifically, for each relational feature X, we first convert X into an equivalent set of binary-valued features if it is multivalued. [sent-122, score-0.089]
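
The following sketch illustrates how one binary relational feature could be lifted to the cluster level with the four predicates named in sentence 46. The summary does not give exact thresholds; the simple-majority cut-off separating MOST-TRUE from MOST-FALSE is our assumption.

```python
def cluster_level_feature(pair_values):
    """Lift a binary relational feature to the cluster level.

    `pair_values` holds the feature's 0/1 value between the active NP and
    each NP of the preceding cluster."""
    n, true_count = len(pair_values), sum(pair_values)
    if true_count == 0:
        return "NONE"
    if true_count == n:
        return "ALL"
    return "MOST-TRUE" if 2 * true_count > n else "MOST-FALSE"

print(cluster_level_feature([1, 1, 0]))  # -> MOST-TRUE
print(cluster_level_feature([1, 0, 0]))  # -> MOST-FALSE
print(cluster_level_feature([0, 0]))     # -> NONE
```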

48 We train a cluster ranker to jointly learn anaphoricity determination and coreference resolution using SVMlight’s ranker-learning algorithm. [sent-124, score-0.486]

49 Specifically, for each NP, NPk, we create a training instance between NPk and each preceding cluster cj using the features described above. [sent-125, score-0.329]

50 Since we are learning a joint model, we need to provide the ranker with the option to start a new cluster by creating an additional training instance that contains the nonrelational features describing NPk. [sent-126, score-0.225]

51 The rank value of a training instance i(cj, NPk) created for NPk is the rank of cj among the competing clusters. [sent-127, score-0.162]

52 If NPk is non-anaphoric, its rank is LOW unless it is the additional training instance described above, which has rank HIGH. [sent-129, score-0.082]
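
A minimal sketch of this training-instance generation, with the rank values described above. Encoding LOW/HIGH as the integers 1/2 is our assumption; SVMlight's ranking mode only requires a consistent ordering of target values within a query.

```python
LOW, HIGH = 1, 2  # any consistent ordering works for SVM-light ranking

def ranker_training_instances(preceding_clusters, gold_cluster_id):
    """Return (candidate, rank) pairs for one NP: one instance per preceding
    cluster plus one dummy 'new cluster' instance."""
    instances = [
        (("cluster", cid), HIGH if cid == gold_cluster_id else LOW)
        for cid in preceding_clusters
    ]
    # The dummy instance ranks HIGH only if the NP starts a new cluster.
    instances.append((("new-cluster",), HIGH if gold_cluster_id is None else LOW))
    return instances

print(ranker_training_instances(["c1", "c2"], "c2"))
# -> [(('cluster', 'c1'), 1), (('cluster', 'c2'), 2), (('new-cluster',), 1)]
print(ranker_training_instances(["c1"], None))
# -> [(('cluster', 'c1'), 1), (('new-cluster',), 2)]
```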

53 After training, the cluster ranker processes the NPs in a test text in a left-to-right manner. [sent-130, score-0.119]

54 For each active NP, NPk, we create test instances for it by pairing it with each of its preceding clusters. [sent-131, score-0.163]

55 To allow for the possibility that NPk is non-anaphoric, we create an additional test instance as during training. [sent-132, score-0.088]

56 If the additional test instance is assigned the highest rank value, then we create a new cluster containing NPk. [sent-134, score-0.167]

57 Note that the partial clusters preceding NPk are formed incrementally based on the predictions of the ranker for the first k − 1 NPs. [sent-136, score-0.181]
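
The left-to-right test procedure of sentences 53-57 can be sketched as follows, assuming a hypothetical score(np, cluster_or_None) function obtained from the trained ranker (higher is better; None stands for the dummy "new cluster" candidate):

```python
def cluster_rank_decode(num_nps, score):
    """Link each NP to its highest-scoring preceding cluster, or start a
    new cluster if the dummy candidate scores highest."""
    clusters = []  # partial clusters, built incrementally left to right
    for k in range(num_nps):
        candidates = [(score(k, c), c) for c in clusters]
        candidates.append((score(k, None), None))  # the 'new cluster' option
        best = max(candidates, key=lambda pair: pair[0])[1]
        if best is None:
            clusters.append([k])  # NP k is predicted to be non-anaphoric
        else:
            best.append(k)        # link NP k to the highest-ranked cluster
    return clusters

# Toy scorer that links adjacent NPs and otherwise starts a new cluster.
toy_score = lambda k, c: 1.0 if c is not None and k - c[-1] == 1 else 0.6
print(cluster_rank_decode(4, toy_score))  # -> [[0, 1, 2, 3]]
```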

58 On the other hand, CEAF finds the best one-to-one alignment between the key clusters and the response clusters. [sent-145, score-0.098]
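
CEAF's alignment step can be illustrated with the Hungarian algorithm. The sketch below uses the similarity phi(k, r) = |k ∩ r|, i.e. the mention-based variant, and omits the precision/recall computation of the full metric.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ceaf_alignment(key, response):
    """Best one-to-one cluster alignment under phi(k, r) = |k & r|."""
    sim = np.array([[len(k & r) for r in response] for k in key])
    rows, cols = linear_sum_assignment(-sim)  # negate to maximize similarity
    return sim[rows, cols].sum(), list(zip(rows.tolist(), cols.tolist()))

key = [{"a", "b"}, {"c", "d"}]
response = [{"a"}, {"b", "c", "d"}]
print(ceaf_alignment(key, response))  # -> (3, [(0, 0), (1, 1)])
```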

59 A complication arises when B3 is used to score a response partition containing automatically extracted NPs. [sent-146, score-0.093]

60 Hence, if the response is generated using gold-standard NPs, then every NP in the response is mapped to some NP in the key and vice versa. [sent-148, score-0.088]

61 To address this problem, we set the recall and precision of a twinless NP to zero, regardless of whether the NP appears in the key or the response. [sent-154, score-0.082]

62 Additionally, in order not to over-penalize a response partition, we remove all the twinless NPs in the response that are singletons. [sent-156, score-0.145]
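
A minimal sketch of this twinless-NP policy as we read it from sentences 59-62; the exact bookkeeping in the paper may differ.

```python
def prepare_for_b3(key_clusters, response_clusters):
    """Drop twinless response singletons; remaining twinless NPs on either
    side then receive zero recall and precision in B3 scoring."""
    key_nps = {m for c in key_clusters for m in c}
    pruned = [
        c for c in response_clusters
        if not (len(c) == 1 and next(iter(c)) not in key_nps)
    ]
    resp_nps = {m for c in pruned for m in c}
    twinless = (key_nps - resp_nps) | (resp_nps - key_nps)
    return pruned, twinless

key = [{"a", "b"}, {"c"}]
response = [{"a", "b", "x"}, {"y"}]  # "x" and "y" do not appear in the key
pruned, twinless = prepare_for_b3(key, response)
print(pruned, sorted(twinless))  # e.g. [{'a', 'b', 'x'}] ['c', 'x']
```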

63 Since B3 and CEAF align NPs/clusters, the lack of singleton clusters in the OntoNotes annotations implies that the resulting scores reflect solely how well a resolver identifies coreference links and do not take into account how well it identifies singleton clusters. [sent-158, score-0.453]

64 3 Extracting World Knowledge. In this section, we describe how we extract world knowledge for coreference resolution from three different sources: large-scale knowledge bases, coreference-annotated data, and unannotated data. [sent-159, score-0.676]

65 1 World Knowledge from Knowledge Bases. We extract world knowledge from two large-scale knowledge bases, YAGO and FrameNet. [sent-161, score-0.274]

66 1 Extracting Knowledge from YAGO. We choose to employ YAGO rather than the more widely used Wikipedia due to its potentially richer knowledge, which comprises 5 million facts extracted from Wikipedia and WordNet. [sent-164, score-0.087]

67 Following prior work (..., 2011), we employ the two relation types that we believe are most useful for coreference resolution: TYPE and MEANS. [sent-168, score-0.325]

68 For instance, the two triples (Einstein, MEANS, AlbertEinstein) and (Einstein, MEANS, AlfredEinstein) denote the facts that Einstein may refer to the physicist Albert Einstein and the musicologist Alfred Einstein, respectively. [sent-172, score-0.312]

69 We incorporate the world knowledge from YAGO into our coreference models as a binary-valued feature. [sent-180, score-0.471]

70 On the other hand, if the CR model is used, the YAGO feature for an instance involving NPk and preceding cluster c will have the value 1 if and only if NPk has a TYPE or MEANS relation with any of the NPs in c. [sent-182, score-0.184]

71 Since knowledge extraction from web-based encyclopedias is typically noisy (Ponzetto and Poesio, 2009), we use YAGO to determine whether two NPs have a relation only if one NP is a named entity (NE) of type person, organization, or location according to the Stanford NE recognizer (Finkel et al., 2005). [sent-183, score-0.106]
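
The YAGO feature of sentences 69-71 can be sketched as follows. The triples set and the ne_type function are stand-ins for the YAGO knowledge base and the Stanford NE recognizer; whether the filter requires exactly one NE or at least one is not clear from the summary, so the sketch requires at least one.

```python
ALLOWED_NE = {"PERSON", "ORGANIZATION", "LOCATION"}

def yago_related(np1, np2, triples):
    """True iff the two NP strings stand in a TYPE or MEANS triple."""
    return any(
        (a, rel, b) in triples
        for rel in ("TYPE", "MEANS")
        for a, b in ((np1, np2), (np2, np1))
    )

def yago_feature(np_j, np_k, triples, ne_type):
    # Noise filter: require an NP to be a person/organization/location NE.
    if ne_type(np_j) not in ALLOWED_NE and ne_type(np_k) not in ALLOWED_NE:
        return 0
    return int(yago_related(np_j, np_k, triples))

def yago_feature_cr(np_k, cluster, triples, ne_type):
    """Cluster-ranking variant: fires iff NP_k relates to any NP in c."""
    return int(any(yago_feature(m, np_k, triples, ne_type) for m in cluster))

triples = {("Einstein", "MEANS", "AlbertEinstein")}
print(yago_feature("Einstein", "AlbertEinstein", triples, lambda s: "PERSON"))  # 1
```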

72 As a schematic representation of a situation, a frame contains the lexical predicates that can invoke it as well as the frame elements (i.e., semantic roles). [sent-189, score-0.201]

73 This frame has COMMUNICATOR and EVALUEE as its core frame elements and ADDRESSEE as a non-core frame element, and can be invoked by more than 40 predicates, such as acclaim, accuse, commend, decry, denounce, praise, and slam. [sent-193, score-0.21]

74 To better understand why FrameNet contains potentially useful knowledge for coreference resolution, consider the following text segment: Peter Anthony decries program trading as “limiting the game to a few,” but he is not sure whether he wants to denounce it because ... [sent-194, score-0.448]

75 To establish the coreference relation between it and program trading, it may be helpful to know that decry and denounce appear in the same frame and the two NPs have the same semantic role. [sent-197, score-0.474]

76 This example suggests that features encoding both the semantic roles of the two NPs under consideration and whether the associated predicates are “related” to each other in FrameNet (i.e., whether they appear in the same frame) could be useful for identifying coreference relations. [sent-198, score-0.185] [sent-200, score-0.039]

78 First, since we do not employ verb sense disambiguation, we consider two predicates related as long as there is at least one semantic frame in which they both appear. [sent-202, score-0.194]

79 We use ASSERT (Pradhan et al., 2004), a semantic role labeler that provides PropBank-style semantic roles such as ARG0 (the PROTO-AGENT, which is typically the subject of a transitive verb) and ARG1 (the PROTO-PATIENT, which is typically its direct object). [sent-204, score-0.119]

80 Now, assuming that NPj and NPk are the arguments of two stemmed predicates, predj and predk, we create 15 features using the knowledge extracted from FrameNet and ASSERT as follows. [sent-205, score-0.243]

81 First, we encode the knowledge extracted from FrameNet as one of three possible values: (1) predj and predk are in the same frame; (2) they are both predicates in FrameNet but never appear in the same frame; and (3) one or both predicates do not appear in FrameNet. [sent-206, score-0.291]

82 Finally, we create 15 binary-valued features by pairing the 3 possible values extracted from FrameNet and the 5 possible values provided by ASSERT. [sent-208, score-0.136]
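
A sketch of the 3 × 5 cross product described in sentences 81-82. The three FrameNet values follow sentence 81; the five ASSERT role-pair values are not spelled out in this summary, so the inventory below is purely illustrative.

```python
FRAMENET_VALUES = ("SAME_FRAME", "BOTH_IN_FN_DIFFERENT_FRAMES", "NOT_BOTH_IN_FN")
ROLE_VALUES = ("ARG0-ARG0", "ARG0-ARG1", "ARG1-ARG0", "ARG1-ARG1", "OTHER")

def framenet_assert_features(fn_value, role_value):
    """15 indicator features (3 x 5 cross product); exactly one fires."""
    return {
        f"FN={fv}|ROLES={rv}": int(fv == fn_value and rv == role_value)
        for fv in FRAMENET_VALUES
        for rv in ROLE_VALUES
    }

feats = framenet_assert_features("SAME_FRAME", "ARG1-ARG1")
print(len(feats), sum(feats.values()))  # -> 15 1
```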

83 If this assumption fails, we will not create any features based on FrameNet for these two NPs. [sent-215, score-0.089]

84 To our knowledge, FrameNet has not been exploited for coreference resolution. [sent-216, score-0.283]

85 2 World Knowledge from Annotated Data. Since world knowledge is needed for coreference resolution, a human annotator must have employed world knowledge when coreference-annotating a document. [sent-219, score-0.659]

86 We aim to design features that can “recover” such world knowledge from annotated data. [sent-220, score-0.227]

87 1 Features Based on Noun Pairs. A natural question is: what kind of world knowledge can we extract from annotated data? [sent-223, score-0.212]

88 president if we see these two NPs appearing in the same coreference chain. [sent-226, score-0.283]

89 Note that any features computed based on WordNet distance or distributional similarity are likely to incorrectly suggest that lion and tiger are coreferent, since the two nouns are similar distributionally and according to WordNet. [sent-229, score-0.086]

90 To improve generalization, we instead create different kinds of noun-pair-based features given an annotated text. [sent-232, score-0.089]

91 Next, we create noun-pair-based features for the MP model, which will be used to augment the Baseline feature set. [sent-238, score-0.113]

92 Either an UNSEEN-SAME feature or an UNSEEN-DIFF feature is created, depending on whether the two NPs are the same string before being replaced with the UNSEEN token. [sent-241, score-0.099]

93 If exactly one of NPj and NPk is tagged as a NE by the Stanford NE recognizer, we create a semi-lexical feature that is identical to the lexical feature described above, except that the NE is replaced with its NE label. [sent-246, score-0.124]

94 Specifically, since each instance now corresponds to an NP, NPk, and a preceding cluster, c, we can generate a noun-pair-based feature by applying the above method to NPk and each of the NPs in c, and its value is the number of times it is applicable to NPk and c. [sent-255, score-0.127]
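
The noun-pair features of sentences 91-94 might look as follows. The ordering of the three cases (semi-lexical, lexical, UNSEEN) and the exact feature strings are our assumptions; seen_pairs stands in for the noun pairs observed in the training data.

```python
from collections import Counter

def noun_pair_feature(np_j, np_k, seen_pairs, ne_label):
    """One noun-pair-based feature for a pair of NP strings; ne_label
    returns an NE tag such as "PERSON", or None for a non-NE."""
    label_j, label_k = ne_label(np_j), ne_label(np_k)
    if (label_j is None) != (label_k is None):
        # Exactly one NP is a named entity: back off to its NE label.
        left = label_j if label_j is not None else np_j
        right = label_k if label_k is not None else np_k
        return f"PAIR={left}|{right}"
    if (np_j, np_k) in seen_pairs:
        return f"PAIR={np_j}|{np_k}"  # fully lexical feature
    # Unseen pair: both nouns are replaced with the UNSEEN token.
    return "UNSEEN-SAME" if np_j == np_k else "UNSEEN-DIFF"

def noun_pair_features_cr(np_k, cluster, seen_pairs, ne_label):
    """Cluster-ranking variant: feature values are counts over the cluster."""
    return Counter(noun_pair_feature(m, np_k, seen_pairs, ne_label) for m in cluster)

seen = {("barack obama", "president")}
print(noun_pair_feature("barack obama", "president", seen, lambda s: None))
# -> PAIR=barack obama|president
```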

95 2 Features Based on Verb Pairs. As discussed above, features encoding the semantic roles of two NPs and the relatedness of the associated verbs could be useful for coreference resolution. [sent-258, score-0.403]

96 Rather than encoding verb relatedness, we may replace verb relatedness with the verbs themselves in these features, and have the learner learn directly from coreference-annotated data whether two NPs serving as the objects of decry and denounce are likely to be coreferent or not, for instance. [sent-259, score-0.184]

97 Specifically, assuming that NPj and NPk are the arguments of two stemmed predicates, predj and predk, in the training data, we create five features as follows. [sent-260, score-0.16]

98 Second, we create five binary-valued features by pairing each of these five values with the two stemmed predicates. [sent-262, score-0.143]
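
A sketch of the verb-pair features of sentences 97-98, reusing the same illustrative role-value inventory as above (redefined here so the sketch stays self-contained):

```python
ROLE_VALUES = ("ARG0-ARG0", "ARG0-ARG1", "ARG1-ARG0", "ARG1-ARG1", "OTHER")

def verb_pair_features(pred_j, pred_k, role_value):
    """Five binary features: each role value conjoined with the verb pair."""
    return {
        f"{pred_j}|{pred_k}|ROLES={rv}": int(rv == role_value)
        for rv in ROLE_VALUES
    }

print(verb_pair_features("decry", "denounce", "ARG1-ARG1"))
```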

99 If this assumption fails, we will not create any features based on verb pairs for these two NPs. [sent-268, score-0.089]

100 3 World Knowledge from Unannotated Data. Previous work has shown that syntactic appositions, which can be extracted using heuristics from unannotated documents or parse trees, are a useful source of world knowledge for coreference resolution (e.g., ...). [sent-270, score-0.634]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('npk', 0.659), ('nps', 0.35), ('npj', 0.315), ('coreference', 0.283), ('yago', 0.139), ('np', 0.138), ('world', 0.126), ('framenet', 0.091), ('rahman', 0.086), ('resolution', 0.084), ('cj', 0.08), ('antecedent', 0.078), ('ne', 0.075), ('mp', 0.074), ('inst', 0.072), ('ace', 0.071), ('frame', 0.07), ('ontonotes', 0.067), ('cr', 0.066), ('preceding', 0.065), ('celebrity', 0.063), ('ranker', 0.062), ('knowledge', 0.062), ('predicates', 0.061), ('denounce', 0.057), ('einstein', 0.057), ('twinless', 0.057), ('cluster', 0.057), ('clusters', 0.054), ('ceaf', 0.051), ('create', 0.05), ('xb', 0.05), ('resolver', 0.048), ('response', 0.044), ('decry', 0.043), ('predj', 0.043), ('predk', 0.043), ('employ', 0.042), ('ponzetto', 0.042), ('features', 0.039), ('roles', 0.039), ('ng', 0.039), ('instance', 0.038), ('coreferent', 0.038), ('sources', 0.038), ('soon', 0.037), ('anaphoric', 0.035), ('resolvers', 0.035), ('stewart', 0.035), ('unannotated', 0.035), ('singleton', 0.034), ('specifically', 0.032), ('scheme', 0.032), ('albe', 0.029), ('coreferenceannotated', 0.029), ('kpj', 0.029), ('nonrelational', 0.029), ('rpj', 0.029), ('annotation', 0.029), ('partition', 0.028), ('stemmed', 0.028), ('wordnet', 0.027), ('wiki', 0.027), ('bases', 0.027), ('replaced', 0.026), ('relational', 0.026), ('pairing', 0.026), ('uryupina', 0.025), ('tiger', 0.025), ('communicator', 0.025), ('evaluee', 0.025), ('whether', 0.025), ('unseen', 0.025), ('feature', 0.024), ('facts', 0.024), ('extract', 0.024), ('wikipedia', 0.023), ('stoyanov', 0.023), ('bagga', 0.023), ('assert', 0.023), ('recalls', 0.023), ('documents', 0.023), ('benefits', 0.023), ('active', 0.022), ('true', 0.022), ('lion', 0.022), ('rank', 0.022), ('chains', 0.021), ('extracted', 0.021), ('semantic', 0.021), ('trading', 0.021), ('relatedness', 0.021), ('fails', 0.02), ('offer', 0.02), ('rel', 0.02), ('strube', 0.02), ('martha', 0.02), ('poesio', 0.019), ('luo', 0.019), ('typically', 0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000001 85 acl-2011-Coreference Resolution with World Knowledge

Author: Altaf Rahman ; Vincent Ng

Abstract: While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. Hence, it is not clear whether these benefits can carry over to a stronger baseline. Moreover, since there has been no attempt to apply different sources of world knowledge in combination to coreference resolution, it is not clear whether they offer complementary benefits to a resolver. We systematically compare commonly-used and under-investigated sources of world knowledge for coreference resolution by applying them to two learning-based coreference models and evaluating them on documents annotated with two different annotation schemes.

2 0.17378096 23 acl-2011-A Pronoun Anaphora Resolution System based on Factorial Hidden Markov Models

Author: Dingcheng Li ; Tim Miller ; William Schuler

Abstract: This paper presents a supervised pronoun anaphora resolution system based on factorial hidden Markov models (FHMMs). The basic idea is that the hidden states of FHMMs are an explicit short-term memory with an antecedent buffer containing recently described referents. Thus an observed pronoun can find its antecedent from the hidden buffer, or in terms of a generative model, the entries in the hidden buffer generate the corresponding pronouns. A system implementing this model is evaluated on the ACE corpus with promising performance.

3 0.15847063 63 acl-2011-Bootstrapping coreference resolution using word associations

Author: Hamidreza Kobdani ; Hinrich Schuetze ; Michael Schiehlen ; Hans Kamp

Abstract: In this paper, we present an unsupervised framework that bootstraps a complete coreference resolution (CoRe) system from word associations mined from a large unlabeled corpus. We show that word associations are useful for CoRe – e.g., the strong association between Obama and President is an indicator of likely coreference. Association information has so far not been used in CoRe because it is sparse and difficult to learn from small labeled corpora. Since unlabeled text is readily available, our unsupervised approach addresses the sparseness problem. In a self-training framework, we train a decision tree on a corpus that is automatically labeled using word associations. We show that this unsupervised system has better CoRe performance than other learning approaches that do not use manually labeled data.

4 0.1496311 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters

Author: Ryan Gabbard ; Marjorie Freedman ; Ralph Weischedel

Abstract: As an alternative to requiring substantial supervised relation training data, many have explored bootstrapping relation extraction from a few seed examples. Most techniques assume that the examples are based on easily spotted anchors, e.g., names or dates. Sentences in a corpus which contain the anchors are then used to induce alternative ways of expressing the relation. We explore whether coreference can improve the learning process. That is, if the algorithm considered examples such as his sister, would accuracy be improved? With coreference, we see on average a 2-fold increase in F-Score. Despite using potentially errorful machine coreference, we see significant increase in recall on all relations. Precision increases in four cases and decreases in six.

5 0.14645457 9 acl-2011-A Cross-Lingual ILP Solution to Zero Anaphora Resolution

Author: Ryu Iida ; Massimo Poesio

Abstract: We present an ILP-based model of zero anaphora detection and resolution that builds on the joint determination of anaphoricity and coreference model proposed by Denis and Baldridge (2007), but revises it and extends it into a three-way ILP problem also incorporating subject detection. We show that this new model outperforms several baselines and competing models, as well as a direct translation of the Denis / Baldridge model, for both Italian and Japanese zero anaphora. We incorporate our model in complete anaphoric resolvers for both Italian and Japanese, showing that our approach leads to improved performance also when not used in isolation, provided that separate classifiers are used for zeros and for explicitly realized anaphors.

6 0.140444 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

7 0.12249757 129 acl-2011-Extending the Entity Grid with Entity-Specific Features

8 0.099883772 114 acl-2011-End-to-End Relation Extraction Using Distant Supervision from External Semantic Repositories

9 0.081421405 274 acl-2011-Semi-Supervised Frame-Semantic Parsing for Unknown Predicates

10 0.076624155 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

11 0.069299184 315 acl-2011-Types of Common-Sense Knowledge Needed for Recognizing Textual Entailment

12 0.068331234 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

13 0.067162149 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

14 0.065504149 8 acl-2011-A Corpus of Scope-disambiguated English Text

15 0.058883801 191 acl-2011-Knowledge Base Population: Successful Approaches and Challenges

16 0.058209255 275 acl-2011-Semi-Supervised Modeling for Prenominal Modifier Ordering

17 0.056445502 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

18 0.056266841 293 acl-2011-Template-Based Information Extraction without the Templates

19 0.051992811 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction

20 0.047696695 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.123), (1, 0.034), (2, -0.121), (3, -0.012), (4, 0.067), (5, 0.041), (6, 0.031), (7, -0.052), (8, -0.188), (9, -0.003), (10, 0.053), (11, -0.047), (12, -0.08), (13, -0.052), (14, -0.008), (15, 0.004), (16, -0.051), (17, 0.013), (18, 0.013), (19, 0.048), (20, -0.01), (21, 0.05), (22, -0.017), (23, 0.058), (24, -0.064), (25, -0.024), (26, -0.063), (27, -0.065), (28, -0.066), (29, -0.129), (30, 0.12), (31, -0.157), (32, -0.022), (33, -0.053), (34, -0.149), (35, -0.025), (36, 0.0), (37, -0.023), (38, 0.06), (39, 0.007), (40, 0.121), (41, 0.034), (42, -0.064), (43, 0.034), (44, 0.103), (45, 0.031), (46, -0.024), (47, -0.03), (48, 0.038), (49, -0.125)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.92923617 85 acl-2011-Coreference Resolution with World Knowledge

Author: Altaf Rahman ; Vincent Ng

Abstract: While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. Hence, it is not clear whether these benefits can carry over to a stronger baseline. Moreover, since there has been no attempt to apply different sources of world knowledge in combination to coreference resolution, it is not clear whether they offer complementary benefits to a resolver. We systematically compare commonly-used and under-investigated sources of world knowledge for coreference resolution by applying them to two learning-based coreference models and evaluating them on documents annotated with two different annotation schemes.

2 0.85729361 9 acl-2011-A Cross-Lingual ILP Solution to Zero Anaphora Resolution

Author: Ryu Iida ; Massimo Poesio

Abstract: We present an ILP-based model of zero anaphora detection and resolution that builds on the joint determination of anaphoricity and coreference model proposed by Denis and Baldridge (2007), but revises it and extends it into a three-way ILP problem also incorporating subject detection. We show that this new model outperforms several baselines and competing models, as well as a direct translation of the Denis / Baldridge model, for both Italian and Japanese zero anaphora. We incorporate our model in complete anaphoric resolvers for both Italian and Japanese, showing that our approach leads to improved performance also when not used in isolation, provided that separate classifiers are used for zeros and for explicitly realized anaphors.

3 0.80017304 63 acl-2011-Bootstrapping coreference resolution using word associations

Author: Hamidreza Kobdani ; Hinrich Schuetze ; Michael Schiehlen ; Hans Kamp

Abstract: In this paper, we present an unsupervised framework that bootstraps a complete coreference resolution (CoRe) system from word associations mined from a large unlabeled corpus. We show that word associations are useful for CoRe – e.g., the strong association between Obama and President is an indicator of likely coreference. Association information has so far not been used in CoRe because it is sparse and difficult to learn from small labeled corpora. Since unlabeled text is readily available, our unsupervised approach addresses the sparseness problem. In a self-training framework, we train a decision tree on a corpus that is automatically labeled using word associations. We show that this unsupervised system has better CoRe performance than other learning approaches that do not use manually labeled data.

4 0.74660438 23 acl-2011-A Pronoun Anaphora Resolution System based on Factorial Hidden Markov Models

Author: Dingcheng Li ; Tim Miller ; William Schuler

Abstract: This paper presents a supervised pronoun anaphora resolution system based on factorial hidden Markov models (FHMMs). The basic idea is that the hidden states of FHMMs are an explicit short-term memory with an antecedent buffer containing recently described referents. Thus an observed pronoun can find its antecedent from the hidden buffer, or in terms of a generative model, the entries in the hidden buffer generate the corresponding pronouns. A system implementing this model is evaluated on the ACE corpus with promising performance.

5 0.60808063 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

Author: Sameer Singh ; Amarnag Subramanya ; Fernando Pereira ; Andrew McCallum

Abstract: Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1.5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with error reduction of 38%) on this large dataset, demonstrating the scalability of our approach.

6 0.46019956 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters

7 0.37554082 129 acl-2011-Extending the Entity Grid with Entity-Specific Features

8 0.32828128 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

9 0.32346022 334 acl-2011-Which Noun Phrases Denote Which Concepts?

10 0.32251623 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

11 0.32027251 229 acl-2011-NULEX: An Open-License Broad Coverage Lexicon

12 0.31446242 8 acl-2011-A Corpus of Scope-disambiguated English Text

13 0.29752639 319 acl-2011-Unsupervised Decomposition of a Document into Authorial Components

14 0.29703075 293 acl-2011-Template-Based Information Extraction without the Templates

15 0.29606345 297 acl-2011-That's What She Said: Double Entendre Identification

16 0.29135031 1 acl-2011-(11-06-spirl)

17 0.28622147 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

18 0.27779916 315 acl-2011-Types of Common-Sense Knowledge Needed for Recognizing Textual Entailment

19 0.27716443 191 acl-2011-Knowledge Base Population: Successful Approaches and Challenges

20 0.27178556 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.016), (9, 0.021), (13, 0.024), (16, 0.025), (17, 0.057), (26, 0.01), (31, 0.012), (37, 0.115), (39, 0.042), (41, 0.045), (47, 0.214), (53, 0.02), (55, 0.077), (59, 0.049), (72, 0.037), (91, 0.038), (96, 0.091), (97, 0.017)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.75240099 85 acl-2011-Coreference Resolution with World Knowledge

Author: Altaf Rahman ; Vincent Ng

Abstract: While world knowledge has been shown to improve learning-based coreference resolvers, the improvements were typically obtained by incorporating world knowledge into a fairly weak baseline resolver. Hence, it is not clear whether these benefits can carry over to a stronger baseline. Moreover, since there has been no attempt to apply different sources of world knowledge in combination to coreference resolution, it is not clear whether they offer complementary benefits to a resolver. We systematically compare commonly-used and under-investigated sources of world knowledge for coreference resolution by applying them to two learning-based coreference models and evaluating them on documents annotated with two different annotation schemes.

2 0.71681333 300 acl-2011-The Surprising Variance in Shortest-Derivation Parsing

Author: Mohit Bansal ; Dan Klein

Abstract: We investigate full-scale shortest-derivation parsing (SDP), wherein the parser selects an analysis built from the fewest number of training fragments. Shortest derivation parsing exhibits an unusual range of behaviors. At one extreme, in the fully unpruned case, it is neither fast nor accurate. At the other extreme, when pruned with a coarse unlexicalized PCFG, the shortest derivation criterion becomes both fast and surprisingly effective, rivaling more complex weighted-fragment approaches. Our analysis includes an investigation of tie-breaking and associated dynamic programs. At its best, our parser achieves an accuracy of 87% F1 on the English WSJ task with minimal annotation, and 90% F1 with richer annotation.

3 0.63549501 237 acl-2011-Ordering Prenominal Modifiers with a Reranking Approach

Author: Jenny Liu ; Aria Haghighi

Abstract: In this work, we present a novel approach to the generation task of ordering prenominal modifiers. We take a maximum entropy reranking approach to the problem which admits arbitrary features on a permutation of modifiers, exploiting hundreds ofthousands of features in total. We compare our error rates to the state-of-the-art and to a strong Google ngram count baseline. We attain a maximum error reduction of 69.8% and average error reduction across all test sets of 59. 1% compared to the state-of-the-art and a maximum error reduction of 68.4% and average error reduction across all test sets of 41.8% compared to our Google n-gram count baseline.

4 0.61268258 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

Author: Yee Seng Chan ; Dan Roth

Abstract: In this paper, we observe that there exists a second dimension to the relation extraction (RE) problem that is orthogonal to the relation type dimension. We show that most of these second dimensional structures are relatively constrained and not difficult to identify. We propose a novel algorithmic approach to RE that starts by first identifying these structures and then, within these, identifying the semantic type of the relation. In the real RE problem where relation arguments need to be identified, exploiting these structures also allows reducing pipelined propagated errors. We show that this RE framework provides significant improvement in RE performance.

5 0.61161417 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering

Author: Ang Sun ; Ralph Grishman ; Satoshi Sekine

Abstract: We present a simple semi-supervised relation extraction system with large-scale word clustering. We focus on systematically exploring the effectiveness of different cluster-based features. We also propose several statistical methods for selecting clusters at an appropriate level of granularity. When training on different sizes of data, our semi-supervised approach consistently outperformed a state-of-the-art supervised baseline system. 1

6 0.61071956 164 acl-2011-Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features

7 0.61029661 186 acl-2011-Joint Training of Dependency Parsing Filters through Latent Support Vector Machines

8 0.60782665 289 acl-2011-Subjectivity and Sentiment Analysis of Modern Standard Arabic

9 0.6065653 88 acl-2011-Creating a manually error-tagged and shallow-parsed learner corpus

10 0.60562676 92 acl-2011-Data point selection for cross-language adaptation of dependency parsers

11 0.60520029 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning

12 0.60517389 275 acl-2011-Semi-Supervised Modeling for Prenominal Modifier Ordering

13 0.60496092 331 acl-2011-Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation

14 0.60391444 32 acl-2011-Algorithm Selection and Model Adaptation for ESL Correction Tasks

15 0.60360569 144 acl-2011-Global Learning of Typed Entailment Rules

16 0.60184312 9 acl-2011-A Cross-Lingual ILP Solution to Zero Anaphora Resolution

17 0.60167277 78 acl-2011-Confidence-Weighted Learning of Factored Discriminative Language Models

18 0.60145408 44 acl-2011-An exponential translation model for target language morphology

19 0.60137647 103 acl-2011-Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation

20 0.60107261 170 acl-2011-In-domain Relation Discovery with Meta-constraints via Posterior Regularization