acl acl2011 acl2011-129 knowledge-graph by maker-knowledge-mining

129 acl-2011-Extending the Entity Grid with Entity-Specific Features


Source: pdf

Author: Micha Elsner ; Eugene Charniak

Abstract: We extend the popular entity grid representation for local coherence modeling. The grid abstracts away information about the entities it models; we add discourse prominence, named entity type and coreference features to distinguish between important and unimportant entities. We improve the best result for WSJ document discrimination by 6%.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 Abstract We extend the popular entity grid representation for local coherence modeling. [sent-3, score-1.052]

2 The grid abstracts away information about the entities it models; we add discourse prominence, named entity type and coreference features to distinguish between important and unimportant entities. [sent-4, score-1.378]

3 We improve the best result for WSJ document discrimination by 6%. [sent-5, score-0.161]

4 1 Introduction A well-written document is coherent (Halliday and Hasan, 1976) – it structures information so that each new piece of information is interpretable given the preceding context. [sent-6, score-0.059]

5 Models that distinguish coherent from incoherent documents are widely used in generation, summarization and text evaluation. [sent-7, score-0.092]

6 Among the most popular models of coherence is the entity grid (Barzilay and Lapata, 2008), a statistical model based on Centering Theory (Grosz et al., 1995). [sent-8, score-1.045]

7 The grid models the way texts focus on important entities, assigning them repeatedly to prominent syntactic roles. [sent-10, score-0.508]

8 While the grid has been successful in a variety of applications, it is still a surprisingly unsophisticated model, and there have been few direct improvements to its simple feature set. [sent-11, score-0.508]

9 We present an extension to the entity grid which distinguishes between different types of entity, resulting in significant improvements. [sent-12, score-0.858]

10 At its core, the grid model works by predicting whether an entity will appear in the next sentence. (A public implementation is available via https://bitbucket.) [sent-14, score-0.949]

11 ...(and what syntactic role it will have) given its history of occurrences in the previous sentences. [sent-18, score-0.087]

12 For instance, it estimates the probability that “Clinton” will be the subject of sentence 2, given that it was the subject of sentence 1. [sent-19, score-0.176]

13 The standard grid model uses no information about the entity itself – the probability is the same whether the entity under discussion is "Hillary Clinton" or "wheat". [sent-20, score-1.238]

14 Distinguishing important from unimportant entity types is important in coreference (Haghighi and Klein, 2010) and summarization (Nenkova et al., 2005). [sent-22, score-0.725]

15 Our model applies the same insight to the entity grid, by adding information from syntax, a named-entity tagger and statistics from an external coreference corpus. [sent-23, score-0.650]

16 2 Related work Since its initial appearance (Lapata and Barzilay, 2005; Barzilay and Lapata, 2005), the entity grid has been used to perform a wide variety of tasks. [sent-24, score-0.858]

17 ...first proposed application, sentence ordering for multidocument summarization, it has proven useful for story generation (McIntyre and Lapata, 2010), readability prediction (Pitler et al. [sent-26, score-0.226]

18 It also remains a critical component in state-of-the-art sentence ordering models (Soricut and Marcu, 2006; Elsner and Charniak, 2008), which typically combine it with other independently-trained models. [sent-29, score-0.068]

19 There have been few attempts to improve the entity grid directly by altering its feature representation. [sent-30, score-0.858]

21 Figure 1: A short text (using NP-only mention detection), and its corresponding entity grid. [sent-41, score-0.418]

22 Cheung and Penn (2010) adapt the grid to German, where focused constituents are indicated by sentence position rather than syntactic role. [sent-44, score-0.542]

23 The best entity grid for English text, however, is still the original. [sent-45, score-0.858]

24 3 Entity grids The entity grid represents a document as a matrix (Figure 1) with a row for each sentence and a column for each entity. [sent-46, score-0.951]

25 The entry for (sentence i, entity j), which we write r_{i,j}, represents the syntactic role that entity takes on in that sentence: subject (S), object (O), or some other role (X). [sent-47, score-0.819]
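
As a rough illustration of the data structure just described (a sketch under assumed simplifications, not the authors' released code), the grid can be stored as one role sequence per entity, using the precedence S > O > X when an entity is mentioned more than once in a sentence:

```python
# Minimal sketch of the entity-grid representation: one column (role sequence)
# per entity, with cell values S (subject), O (object), X (other), or - (absent).
from collections import defaultdict

ROLES = ("S", "O", "X", "-")   # ordered from most to least prominent

def build_grid(doc_mentions, n_sentences):
    """doc_mentions: iterable of (sentence_index, entity_id, role) triples."""
    grid = defaultdict(lambda: ["-"] * n_sentences)   # entity_id -> role column
    for sent_i, entity, role in doc_mentions:
        # If an entity is mentioned twice in a sentence, keep the more prominent role.
        if ROLES.index(role) < ROLES.index(grid[entity][sent_i]):
            grid[entity][sent_i] = role
    return dict(grid)

# Toy input loosely in the spirit of Figure 1:
mentions = [(0, "flight", "S"), (0, "conditions", "X"), (1, "flight", "X"), (2, "flight", "S")]
print(build_grid(mentions, 3))   # {'flight': ['S', 'X', 'S'], 'conditions': ['X', '-', '-']}
```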

26 In addition, there is a special marker (-) for entities which do not appear at all in a given sentence. [sent-48, score-0.086]

27 ...first decide which textual units are to be considered "entities", and how the different mentions of an entity are to be linked. [sent-50, score-0.5]

28 We follow the -COREFERENCE setting from Barzilay and Lapata (2005) and perform heuristic coreference resolution by linking mentions which share a head noun. [sent-51, score-0.443]
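
The same-head heuristic can be sketched in a few lines; note that the head-word extraction below (last token of the NP) is a simplification, whereas the paper derives heads from parse trees:

```python
# Hedged sketch of same-head mention linking: mentions are grouped into one
# "entity" whenever their (approximate) head nouns match.
def link_mentions_by_head(mentions):
    """mentions: list of NP strings; returns a dict head -> list of linked mentions."""
    clusters = {}
    for np in mentions:
        head = np.lower().split()[-1]   # crude proxy for the parse-based head noun
        clusters.setdefault(head, []).append(np)
    return clusters

print(link_mentions_by_head(["the flight", "a later flight", "flight conditions"]))
# {'flight': ['the flight', 'a later flight'], 'conditions': ['flight conditions']}
```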

29 Although some versions of the grid use an automatic coreference resolver, this often fails to improve results; in Barzilay and Lapata (2005), coreference improves results in only one of their target domains, and actually hurts for readability prediction. [sent-52, score-1.091]

30 Their results, moreover, rely on running coreference on the document in its original order; in a summarization task, the correct order is not known, which will cause even more resolver errors. [sent-53, score-0.453]

31 To build a model based on the grid, we treat the columns (entities) as independent, and look at local transitions between sentences. [sent-54, score-0.095]

32 Roles are determined heuristically using trees produced by the parser of Charniak and Johnson (2005). [sent-55, score-0.03]

33 We model the transitions using the generative approach given in Lapata and Barzilay (2005), in which the model estimates the probability of an entity's role in the next sentence, r_{i,j}, given its history in the previous two sentences, r_{i-2,j} and r_{i-1,j}. [sent-56, score-0.175]
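
A minimal sketch of this generative transition model, assuming simple count-based estimation with add-one smoothing (the smoothing scheme and padding are assumptions for illustration, not details taken from the paper):

```python
# Estimate p(r_i | r_{i-2}, r_{i-1}) by counting role transitions down each
# entity column of the grid, smoothing over the four possible roles.
from collections import Counter
from itertools import product

ROLES = ("S", "O", "X", "-")

def transition_probs(columns):
    counts = Counter()
    for col in columns:                  # each column is one entity's role sequence
        padded = ["-", "-"] + list(col)  # pad so the first sentences have a history
        for i in range(2, len(padded)):
            counts[(padded[i - 2], padded[i - 1], padded[i])] += 1
    probs = {}
    for h1, h2 in product(ROLES, repeat=2):
        total = sum(counts[(h1, h2, r)] for r in ROLES) + len(ROLES)
        for r in ROLES:
            probs[(h1, h2, r)] = (counts[(h1, h2, r)] + 1) / total
    return probs

p = transition_probs([["S", "X", "S"], ["X", "-", "-"]])
print(p[("-", "S", "X")])   # 0.4 for this toy input
```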

34 ...an entity-specific feature, salience, determined by counting the total number of times the entity is mentioned in the document. [sent-60, score-0.35]

35 ..."flight" after the last sentence of the example would be F_{3,flight} = ⟨X, S, sal = 2⟩. [sent-63, score-0.034]

36 Using two sentences of context and capping salience at 4, there are only 64 possible vectors, so we can learn an independent multinomial distribution for each F. [sent-64, score-0.079]
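
The figure of 64 follows directly from the counts stated above: four possible roles in each of the two context sentences, times four capped salience values.

```python
# 4 roles (S, O, X, -) per context sentence, 2 context sentences, salience capped at 4.
n_roles, n_context_sentences, salience_cap = 4, 2, 4
print(n_roles ** n_context_sentences * salience_cap)   # 64
```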

37 However, the number of vectors grows exponentially as we add features. [sent-65, score-0.031]

38 4 Experimental design We test our model on two experimental tasks, both testing its ability to distinguish between correct and incorrect orderings for WSJ articles. [sent-66, score-0.059]

39 In document discrimination (Barzilay and Lapata, 2005), we compare a document to a random permutation of its sentences, scoring the system correct if it prefers the original ordering. [sent-67, score-0.287]
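
A hedged sketch of this discrimination protocol (the number of permutations per document and the handling of ties below are placeholders; per the authors, ties are why both accuracy and balanced F-score are reported):

```python
# Score each document against random permutations of its sentences and count
# the model correct when the original ordering receives the higher score.
import random

def discrimination_accuracy(documents, score, n_permutations=20, seed=0):
    """documents: list of sentence lists; score: callable mapping a sentence list to a coherence score."""
    rng = random.Random(seed)
    correct = trials = 0
    for doc in documents:
        for _ in range(n_permutations):
            perm = doc[:]
            rng.shuffle(perm)
            if perm == doc:          # skip degenerate shuffles that reproduce the original
                continue
            trials += 1
            correct += score(doc) > score(perm)
    return correct / trials if trials else 0.0
```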

40 In this task, we remove each sentence from the article and test whether the model prefers to re-insert it at its original location. [sent-71, score-0.099]

41 ...significant differences in the medians of two distributions. [sent-76, score-0.092]

42 Mention detection: Our main contribution is to extend the entity grid by adding a large number of entity-specific features. [sent-77, score-0.858]

43 Before doing so, however, we add non-head nouns to the grid. [sent-79, score-0.073]

44 Doing so gives our feature-based model ... Barzilay and Lapata (2005) give a discriminative model, which relies on the same feature set as discussed here. [sent-80, score-0.03]

45 Since the original and permutation might tie, we report both accuracy and balanced F-score. [sent-82, score-0.032]

46 ...significance of differences in means, we would need to use a parametric test. [sent-84, score-0.053]

47 ...significantly different from the previous row of the table with p=. [sent-91, score-0.046]

48 We alter our mention detector to add all nouns in the document to the grid, even those which do not head NPs. [sent-95, score-0.251]

49 ...modifiers in phrases like "Bush spokesman", which do not head NPs in the Penn Treebank. [sent-97, score-0.139]

50 Finding these is also necessary to maximize coreference recall (Elsner and Charniak, 2010). [sent-98, score-0.27]

51 The results of this change are shown in Table 1; discrimination performance increases about 4%, from 76% to 80%. [sent-100, score-0.102]

52 Entity-specific features: As we mentioned earlier, the standard grid model does not distinguish between different types of entity. [sent-102, score-0.567]

53 Given the same history and salience, the same probabilities are assigned to occurrences of “Hillary Clinton”, “the airlines”, or “May 25th”, even though we know a priori that a document is more likely to be about Hillary Clinton than it is to be about May 25th. [sent-103, score-0.106]

54 This problem is exacerbated by our same-head coreference heuristic, which sometimes creates spurious entities by lumping together mentions headed by nouns like “miles” or “dollars”. [sent-104, score-0.532]

55 In this section, we add features that separate important entities from less important or spurious ones. [sent-105, score-0.167]

56 News articles are likely to be about people and organizations, so we expect these named entity tags, and proper NPs in general, to be more important to the discourse. [sent-115, score-0.382]

57 ...modifiers throughout the document are also likely to be important, since this implies that the writer wishes to point out more information about them. [sent-117, score-0.147]

58 Finally, singular nouns are less likely to be generic. [sent-118, score-0.082]

59 We also add some features to pick out entities that are likely to be spurious or unimportant. [sent-119, score-0.167]

60 These features depend on in-domain coreference data, but they do not require us to run a coreference resolver on the target document itself. [sent-120, score-0.66]

61 This avoids the problem that coreference resolvers do not work well for disordered or automatically produced text such as multidocument summary sentences, and also avoids the computational cost associated with coreference resolution. [sent-121, score-0.653]

62 Linkable Was the head word of the entity ever marked as coreferring in MUC6? [sent-122, score-0.401]

63 Unlinkable Did the head word of the entity occur 5 times in MUC6 and never corefer? [sent-123, score-0.401]

64 Has pronouns Were there 5 or more pronouns coreferent with the head word of the entity in the NANC corpus? [sent-124, score-0.604]

65 (Pronouns in NANC are automatically resolved using an unsupervised model (Charniak and Elsner, 2009). [sent-125, score-0.03]

66 ) No pronouns Did the head word of the entity occur over 50 times in NANC, and have fewer than 5 coreferent pronouns? [sent-126, score-0.533]
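
Taken together, sentences 62-66 describe four corpus-derived indicator features. The sketch below shows one way they might be computed from precomputed MUC6 and NANC statistics; the function and argument names are assumptions for illustration, not the paper's code:

```python
# Look up the entity's head word in precomputed corpus statistics:
# MUC6 coreference annotations and automatically resolved NANC pronouns.
def corpus_features(head, muc6_coref_heads, muc6_counts, nanc_counts, nanc_pronoun_counts):
    return {
        "linkable": head in muc6_coref_heads,                                       # ever coreferring in MUC6
        "unlinkable": muc6_counts.get(head, 0) >= 5 and head not in muc6_coref_heads,
        "has_pronouns": nanc_pronoun_counts.get(head, 0) >= 5,                      # 5+ coreferent pronouns in NANC
        "no_pronouns": nanc_counts.get(head, 0) > 50 and nanc_pronoun_counts.get(head, 0) < 5,
    }

print(corpus_features("clinton", {"clinton"}, {"clinton": 40}, {"clinton": 900}, {"clinton": 35}))
# {'linkable': True, 'unlinkable': False, 'has_pronouns': True, 'no_pronouns': False}
```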

67 To learn probabilities based on these features, we model the conditional probability p(r_{i,j} | F) using multilabel logistic regression. [sent-127, score-0.03]

68 Our model has a parameter for each combination of syntactic role r, entity-specific feature... [sent-128, score-0.07]
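
One way to realize the conditional model p(r_{i,j} | F) described in sentences 67-68 is a softmax over per-role scores, with one weight for each (role, feature) pair; the weights below are placeholders rather than learned values:

```python
# Multiclass logistic regression over syntactic roles, parameterized by
# (role, feature) weights; active_features lists the history and entity features in F.
import math

ROLES = ("S", "O", "X", "-")

def p_role_given_features(weights, active_features):
    """weights: dict mapping (role, feature) -> float; returns dict role -> probability."""
    scores = {r: sum(weights.get((r, f), 0.0) for f in active_features) for r in ROLES}
    z = sum(math.exp(s) for s in scores.values())
    return {r: math.exp(s) / z for r, s in scores.items()}

w = {("S", "person"): 1.2, ("S", "hist=-X"): 0.3, ("-", "hist=-X"): 0.8}
print(p_role_given_features(w, ["person", "hist=-X", "sal=3"]))
```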

69 In Table 2, we examine the changes in our estimated probability in one particular context: an entity with salience 3 which appeared in a non-emphatic role in the previous sentence. [sent-133, score-0.469]

70 The standard entity grid estimates that such an entity will be the subject of the next sentence with a probability of about ...; we train the regressor using OWLQN (Andrew and Gao, 2007), modified... [sent-134, score-1.311]

71 ...the next sentence, given the history - X, salience 3, and various entity-specific features. [sent-136, score-0.126]

72 For most classes of entity, we can see that this is an overestimate; for an entity described by a common noun (such as “the airline”), the probability assigned by the extended grid model is . [sent-140, score-0.921]

73 However, given that the entity refers to a person, and some of its mentions are modified... [sent-145, score-0.098]

74 ...suggesting the article gives a title or description ("Obama's Secretary of State, Hillary Clinton"), the chance that it will be the subject of the next sentence more than triples. [sent-146, score-0.073]

75 7 Experiments Table 3 gives results for the extended grid model on the test set. [sent-147, score-0.571]

76 ...significantly better than the standard grid on discrimination (84% versus 80%) and has a higher mean score on insertion (24% versus 21%). [sent-149, score-0.785]

77 The best WSJ results in previous work are those of Elsner and Charniak (2008), who combine the entity grid with models based on pronoun coreference and discourse-new NP detection. [sent-150, score-1.167]

78 This comparison is unfair, however, because the improvements from adding non-head nouns improve our baseline grid sufficiently... [sent-152, score-0.55]

79 State-of-the-art results on a different corpus and task were achieved by Soricut and Marcu (2006) using a log-linear mixture of an entity grid, IBM translation models, and a word-correspondence model based on Lapata (2003). [sent-154, score-0.38]

80 For insertion using the model on its own, the median changes less than the mean, and the change in median score is not significant... [sent-155, score-0.133]

81 Combination models incorporate pronoun coreference, discourse-new NP detection, and IBM model 1. [sent-160, score-0.069]

82 † indicates an extended model score better than its baseline counterpart at p=. [sent-161, score-0.063]

83 To perform a fair comparison of our extended grid with these model-combining approaches, we train our own combined model incorporating an entity grid, pronouns, discourse-newness and the IBM model. [sent-163, score-0.921]

84 We combine models using a log-linear mixture as in Soricut and Marcu (2006), training the weights to maximize discrimination accuracy. [sent-164, score-0.102]
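
The log-linear mixture described in sentence 84 amounts to a weighted sum of component log-scores; the sketch below uses invented scores and weights purely for illustration, and the weight-fitting step (tuned for discrimination accuracy) is not shown:

```python
# Combine component coherence models (entity grid, pronoun model, discourse-new
# model, IBM model 1) by weighting their log-scores and summing.
def combined_score(log_scores, weights):
    """log_scores and weights: parallel sequences, one entry per component model."""
    return sum(w * s for w, s in zip(weights, log_scores))

weights = [1.0, 0.6, 0.4, 0.2]                      # placeholder mixture weights
original = combined_score([-210.5, -80.2, -45.1, -300.7], weights)
permuted = combined_score([-228.3, -85.9, -47.0, -310.2], weights)
print(original > permuted)   # True: the original ordering is preferred
```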

85 The second section of Table 3 shows these model combination results. [sent-165, score-0.03]

86 Notably, our extended entity grid on its own is essentially just as good as the combined model, which represents our implementation of the previous state of the art. [sent-166, score-0.891]

87 Though the improvement is not perfectly additive, a good deal of it is retained, demonstrating that our additions to the entity grid are mostly orthogonal to previously described models. [sent-171, score-0.858]

88 These results are the best reported for sentence ordering of English news articles. [sent-172, score-0.068]

89 8 Conclusion We improve a widely used model of local discourse coherence. [sent-173, score-0.097]

90 Our extensions to the feature set involve distinguishing simple properties of entities, such as their named entity type, which are also useful in coreference and summarization tasks. [sent-174, score-0.715]

91 Although our method uses coreference information, it does not require coreference resolution to be run on the target documents. [sent-175, score-0.578]

92 Given the popularity of entity grid models for practical applications, we hope our model's improvements will transfer to summarization, generation and readability prediction. [sent-176, score-0.901]

93 Using entity-based features to model coherence in student essays. [sent-193, score-0.187]

94 Extending the entity-grid coherence model to semantically related entities. [sent-229, score-0.187]

95 Centering: A framework for modeling the local coherence of discourse. [sent-237, score-0.194]

96 Automatically learning cognitive status for multi-document summarization of newswire. [sent-269, score-0.063]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('grid', 0.508), ('entity', 0.35), ('coreference', 0.27), ('signi', 0.267), ('elsner', 0.189), ('lapata', 0.176), ('coherence', 0.157), ('barzilay', 0.135), ('charniak', 0.132), ('clinton', 0.11), ('hillary', 0.107), ('discrimination', 0.102), ('modi', 0.098), ('cant', 0.092), ('ers', 0.088), ('micha', 0.088), ('entities', 0.086), ('soricut', 0.084), ('mentions', 0.084), ('ight', 0.081), ('nanc', 0.08), ('salience', 0.079), ('mirella', 0.074), ('pronouns', 0.071), ('ri', 0.068), ('mention', 0.068), ('rst', 0.066), ('regina', 0.064), ('summarization', 0.063), ('eugene', 0.061), ('cheung', 0.061), ('mcintyre', 0.061), ('resolver', 0.061), ('coreferent', 0.061), ('wsj', 0.06), ('document', 0.059), ('cance', 0.053), ('head', 0.051), ('sweden', 0.05), ('spurious', 0.05), ('filippova', 0.049), ('multidocument', 0.049), ('history', 0.047), ('morton', 0.046), ('cantly', 0.046), ('johnson', 0.046), ('versus', 0.044), ('centering', 0.044), ('halliday', 0.044), ('nps', 0.043), ('readability', 0.043), ('unimportant', 0.042), ('nouns', 0.042), ('uppsala', 0.041), ('insertion', 0.041), ('grosz', 0.04), ('role', 0.04), ('singular', 0.04), ('opennlp', 0.039), ('burstein', 0.039), ('pronoun', 0.039), ('subject', 0.039), ('resolution', 0.038), ('pitler', 0.038), ('local', 0.037), ('structuring', 0.036), ('prefers', 0.035), ('sentence', 0.034), ('ordering', 0.034), ('association', 0.033), ('ani', 0.033), ('extended', 0.033), ('avoids', 0.032), ('named', 0.032), ('permutation', 0.032), ('nenkova', 0.032), ('add', 0.031), ('ibm', 0.031), ('median', 0.031), ('marcu', 0.031), ('estimates', 0.03), ('model', 0.03), ('discourse', 0.03), ('distinguish', 0.029), ('transitions', 0.028), ('unfair', 0.027), ('entitybased', 0.027), ('bene', 0.027), ('bitbucket', 0.027), ('cult', 0.027), ('plainly', 0.027), ('sner', 0.027), ('airline', 0.027), ('ciently', 0.027), ('corefer', 0.027), ('gann', 0.027), ('jf', 0.027), ('joern', 0.027), ('neil', 0.027), ('nonhead', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 129 acl-2011-Extending the Entity Grid with Entity-Specific Features

Author: Micha Elsner ; Eugene Charniak

Abstract: We extend the popular entity grid representation for local coherence modeling. The grid abstracts away information about the entities it models; we add discourse prominence, named entity type and coreference features to distinguish between important and unimportant entities. We improve the best result for WSJ document discrimination by 6%.

2 0.41847941 101 acl-2011-Disentangling Chat with Local Coherence Models

Author: Micha Elsner ; Eugene Charniak

Abstract: We evaluate several popular models of local discourse coherence for domain and task generality by applying them to chat disentanglement. Using experiments on synthetic multiparty conversations, we show that most models transfer well from text to dialogue. Coherence models improve results overall when good parses and topic models are available, and on a constrained task for real chat data.

3 0.25416225 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

Author: Sameer Singh ; Amarnag Subramanya ; Fernando Pereira ; Andrew McCallum

Abstract: Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1.5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with error reduction of 38%) on this large dataset, demonstrating the scalability of our approach.

4 0.24985683 23 acl-2011-A Pronoun Anaphora Resolution System based on Factorial Hidden Markov Models

Author: Dingcheng Li ; Tim Miller ; William Schuler

Abstract: and Wellner, This paper presents a supervised pronoun anaphora resolution system based on factorial hidden Markov models (FHMMs). The basic idea is that the hidden states of FHMMs are an explicit short-term memory with an antecedent buffer containing recently described referents. Thus an observed pronoun can find its antecedent from the hidden buffer, or in terms of a generative model, the entries in the hidden buffer generate the corresponding pronouns. A system implementing this model is evaluated on the ACE corpus with promising performance.

5 0.24726306 12 acl-2011-A Generative Entity-Mention Model for Linking Entities with Knowledge Base

Author: Xianpei Han ; Le Sun

Abstract: Linking entities with knowledge base (entity linking) is a key issue in bridging the textual data with the structural knowledge base. Due to the name variation problem and the name ambiguity problem, the entity linking decisions are critically depending on the heterogenous knowledge of entities. In this paper, we propose a generative probabilistic model, called entitymention model, which can leverage heterogenous entity knowledge (including popularity knowledge, name knowledge and context knowledge) for the entity linking task. In our model, each name mention to be linked is modeled as a sample generated through a three-step generative story, and the entity knowledge is encoded in the distribution of entities in document P(e), the distribution of possible names of a specific entity P(s|e), and the distribution of possible contexts of a specific entity P(c|e). To find the referent entity of a name mention, our method combines the evidences from all the three distributions P(e), P(s|e) and P(c|e). Experimental results show that our method can significantly outperform the traditional methods. 1

6 0.20394824 280 acl-2011-Sentence Ordering Driven by Local and Global Coherence for Summary Generation

7 0.20067254 53 acl-2011-Automatically Evaluating Text Coherence Using Discourse Relations

8 0.19226858 314 acl-2011-Typed Graph Models for Learning Latent Attributes from Names

9 0.14823177 285 acl-2011-Simple supervised document geolocation with geodesic grids

10 0.14387567 191 acl-2011-Knowledge Base Population: Successful Approaches and Challenges

11 0.13485441 9 acl-2011-A Cross-Lingual ILP Solution to Zero Anaphora Resolution

12 0.13438725 63 acl-2011-Bootstrapping coreference resolution using word associations

13 0.13269146 86 acl-2011-Coreference for Learning to Extract Relations: Yes Virginia, Coreference Matters

14 0.12401186 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

15 0.12249757 85 acl-2011-Coreference Resolution with World Knowledge

16 0.10449297 328 acl-2011-Using Cross-Entity Inference to Improve Event Extraction

17 0.10027544 117 acl-2011-Entity Set Expansion using Topic information

18 0.093179055 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

19 0.069128461 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction

20 0.066477709 277 acl-2011-Semi-supervised Relation Extraction with Large-scale Word Clustering


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.19), (1, 0.07), (2, -0.212), (3, 0.041), (4, 0.091), (5, 0.047), (6, -0.017), (7, -0.025), (8, -0.316), (9, 0.066), (10, 0.031), (11, 0.047), (12, -0.2), (13, -0.2), (14, -0.003), (15, 0.164), (16, -0.055), (17, 0.293), (18, -0.019), (19, 0.016), (20, 0.04), (21, 0.042), (22, -0.029), (23, 0.135), (24, -0.035), (25, 0.046), (26, -0.162), (27, -0.046), (28, -0.093), (29, -0.019), (30, -0.006), (31, -0.027), (32, -0.008), (33, 0.015), (34, 0.076), (35, -0.109), (36, 0.038), (37, 0.081), (38, -0.07), (39, -0.012), (40, -0.073), (41, -0.092), (42, -0.027), (43, -0.028), (44, -0.031), (45, 0.045), (46, 0.128), (47, 0.071), (48, -0.028), (49, 0.128)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.95922387 129 acl-2011-Extending the Entity Grid with Entity-Specific Features

Author: Micha Elsner ; Eugene Charniak

Abstract: We extend the popular entity grid representation for local coherence modeling. The grid abstracts away information about the entities it models; we add discourse prominence, named entity type and coreference features to distinguish between important and unimportant entities. We improve the best result for WSJ document discrimination by 6%.

2 0.84633064 101 acl-2011-Disentangling Chat with Local Coherence Models

Author: Micha Elsner ; Eugene Charniak

Abstract: We evaluate several popular models of local discourse coherence for domain and task generality by applying them to chat disentanglement. Using experiments on synthetic multiparty conversations, we show that most models transfer well from text to dialogue. Coherence models improve results overall when good parses and topic models are available, and on a constrained task for real chat data.

3 0.64090413 12 acl-2011-A Generative Entity-Mention Model for Linking Entities with Knowledge Base

Author: Xianpei Han ; Le Sun

Abstract: Linking entities with knowledge base (entity linking) is a key issue in bridging the textual data with the structural knowledge base. Due to the name variation problem and the name ambiguity problem, the entity linking decisions are critically depending on the heterogenous knowledge of entities. In this paper, we propose a generative probabilistic model, called entitymention model, which can leverage heterogenous entity knowledge (including popularity knowledge, name knowledge and context knowledge) for the entity linking task. In our model, each name mention to be linked is modeled as a sample generated through a three-step generative story, and the entity knowledge is encoded in the distribution of entities in document P(e), the distribution of possible names of a specific entity P(s|e), and the distribution of possible contexts of a specific entity P(c|e). To find the referent entity of a name mention, our method combines the evidences from all the three distributions P(e), P(s|e) and P(c|e). Experimental results show that our method can significantly outperform the traditional methods. 1

4 0.63767689 23 acl-2011-A Pronoun Anaphora Resolution System based on Factorial Hidden Markov Models

Author: Dingcheng Li ; Tim Miller ; William Schuler

Abstract: and Wellner, This paper presents a supervised pronoun anaphora resolution system based on factorial hidden Markov models (FHMMs). The basic idea is that the hidden states of FHMMs are an explicit short-term memory with an antecedent buffer containing recently described referents. Thus an observed pronoun can find its antecedent from the hidden buffer, or in terms of a generative model, the entries in the hidden buffer generate the corresponding pronouns. A system implementing this model is evaluated on the ACE corpus with promising performance.

5 0.62374759 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

Author: Sameer Singh ; Amarnag Subramanya ; Fernando Pereira ; Andrew McCallum

Abstract: Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1.5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with error reduction of 38%) on this large dataset, demonstrating the scalability of our approach.

6 0.58623344 53 acl-2011-Automatically Evaluating Text Coherence Using Discourse Relations

7 0.56004232 280 acl-2011-Sentence Ordering Driven by Local and Global Coherence for Summary Generation

8 0.48695624 314 acl-2011-Typed Graph Models for Learning Latent Attributes from Names

9 0.48006454 9 acl-2011-A Cross-Lingual ILP Solution to Zero Anaphora Resolution

10 0.47757217 191 acl-2011-Knowledge Base Population: Successful Approaches and Challenges

11 0.43819284 63 acl-2011-Bootstrapping coreference resolution using word associations

12 0.41596454 85 acl-2011-Coreference Resolution with World Knowledge

13 0.38582671 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

14 0.35808158 285 acl-2011-Simple supervised document geolocation with geodesic grids

15 0.33773836 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

16 0.32500321 320 acl-2011-Unsupervised Discovery of Domain-Specific Knowledge from Text

17 0.277587 99 acl-2011-Discrete vs. Continuous Rating Scales for Language Evaluation in NLP

18 0.26974714 301 acl-2011-The impact of language models and loss functions on repair disfluency detection

19 0.26595503 40 acl-2011-An Error Analysis of Relation Extraction in Social Media Documents

20 0.25979891 181 acl-2011-Jigs and Lures: Associating Web Queries with Structured Entities


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(5, 0.031), (17, 0.034), (26, 0.02), (30, 0.293), (31, 0.016), (37, 0.104), (39, 0.036), (41, 0.094), (55, 0.037), (59, 0.035), (72, 0.031), (91, 0.036), (96, 0.143), (97, 0.01)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.81018364 227 acl-2011-Multimodal Menu-based Dialogue with Speech Cursor in DICO II+

Author: Staffan Larsson ; Alexander Berman ; Jessica Villing

Abstract: Alexander Berman Jessica Villing Talkamatic AB University of Gothenburg Sweden Sweden alex@ t alkamat i . se c jessi ca@ l ing .gu . s e 2 In-vehicle dialogue systems This paper describes Dico II+, an in-vehicle dialogue system demonstrating a novel combination of flexible multimodal menu-based dialogueand a “speech cursor” which enables menu navigation as well as browsing long list using haptic input and spoken output.

same-paper 2 0.73709941 129 acl-2011-Extending the Entity Grid with Entity-Specific Features

Author: Micha Elsner ; Eugene Charniak

Abstract: We extend the popular entity grid representation for local coherence modeling. The grid abstracts away information about the entities it models; we add discourse prominence, named entity type and coreference features to distinguish between important and unimportant entities. We improve the best result for WSJ document discrimination by 6%.

3 0.60351455 101 acl-2011-Disentangling Chat with Local Coherence Models

Author: Micha Elsner ; Eugene Charniak

Abstract: We evaluate several popular models of local discourse coherence for domain and task generality by applying them to chat disentanglement. Using experiments on synthetic multiparty conversations, we show that most models transfer well from text to dialogue. Coherence models improve results overall when good parses and topic models are available, and on a constrained task for real chat data.

4 0.5862112 65 acl-2011-Can Document Selection Help Semi-supervised Learning? A Case Study On Event Extraction

Author: Shasha Liao ; Ralph Grishman

Abstract: Annotating training data for event extraction is tedious and labor-intensive. Most current event extraction tasks rely on hundreds of annotated documents, but this is often not enough. In this paper, we present a novel self-training strategy, which uses Information Retrieval (IR) to collect a cluster of related documents as the resource for bootstrapping. Also, based on the particular characteristics of this corpus, global inference is applied to provide more confident and informative data selection. We compare this approach to self-training on a normal newswire corpus and show that IR can provide a better corpus for bootstrapping and that global inference can further improve instance selection. We obtain gains of 1.7% in trigger labeling and 2.3% in role labeling through IR and an additional 1.1% in trigger labeling and 1.3% in role labeling by applying global inference. 1

5 0.5844273 196 acl-2011-Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models

Author: Sameer Singh ; Amarnag Subramanya ; Fernando Pereira ; Andrew McCallum

Abstract: Cross-document coreference, the task of grouping all the mentions of each entity in a document collection, arises in information extraction and automated knowledge base construction. For large collections, it is clearly impractical to consider all possible groupings of mentions into distinct entities. To solve the problem we propose two ideas: (a) a distributed inference technique that uses parallelism to enable large scale processing, and (b) a hierarchical model of coreference that represents uncertainty over multiple granularities of entities to facilitate more effective approximate inference. To evaluate these ideas, we constructed a labeled corpus of 1.5 million disambiguated mentions in Web pages by selecting link anchors referring to Wikipedia entities. We show that the combination of the hierarchical model with distributed inference quickly obtains high accuracy (with error reduction of 38%) on this large dataset, demonstrating the scalability of our approach.

6 0.58237612 58 acl-2011-Beam-Width Prediction for Efficient Context-Free Parsing

7 0.5807941 324 acl-2011-Unsupervised Semantic Role Induction via Split-Merge Clustering

8 0.58038604 126 acl-2011-Exploiting Syntactico-Semantic Structures for Relation Extraction

9 0.57869065 128 acl-2011-Exploring Entity Relations for Named Entity Disambiguation

10 0.57818264 311 acl-2011-Translationese and Its Dialects

11 0.57659703 246 acl-2011-Piggyback: Using Search Engines for Robust Cross-Domain Named Entity Recognition

12 0.57516092 3 acl-2011-A Bayesian Model for Unsupervised Semantic Parsing

13 0.57477343 119 acl-2011-Evaluating the Impact of Coder Errors on Active Learning

14 0.57346785 73 acl-2011-Collective Classification of Congressional Floor-Debate Transcripts

15 0.57330525 244 acl-2011-Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts

16 0.57316005 295 acl-2011-Temporal Restricted Boltzmann Machines for Dependency Parsing

17 0.57313502 40 acl-2011-An Error Analysis of Relation Extraction in Social Media Documents

18 0.57302952 172 acl-2011-Insertion, Deletion, or Substitution? Normalizing Text Messages without Pre-categorization nor Supervision

19 0.57291794 36 acl-2011-An Efficient Indexer for Large N-Gram Corpora

20 0.57267547 202 acl-2011-Learning Hierarchical Translation Structure with Linguistic Annotations