emnlp emnlp2012 emnlp2012-76 knowledge-graph by maker-knowledge-mining

76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge


Source: pdf

Author: Lev Ratinov ; Dan Roth

Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on the non-transcript portion of the ACE 2004 dataset.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 The correct output groups the mentions {m1, m2, m5} to one entity while leaving m3 and m4 apart. ∗We thank Nicholas Rizzolo and Kai-Wei Chang for their invaluable help with modifying the baseline co-reference system. [sent-11, score-0.314]

2 A human reader can infer that since Kursk sank, it must be a vessel and vessels which suffer catastrophic torpedo detonations can sink. [sent-28, score-0.21]

3 The key contributions of this work are: (1) Using Wikipedia to assign a set of knowledge attributes to mentions in a context-sensitive way. [sent-31, score-0.508]

4 In Fig. 1, our system assigns to the mention “Kursk” the nationalities: Russian, Soviet and the attributes ship, incident, submarine, shipwreck (as opposed to city or battle). [sent-33, score-0.42]

5 1), assign these attributes to the document mentions (Sec. [sent-39, score-0.445]

6 Our approach differs from (Raghunathan et al., 2010) in several respects: (a) our sieves are machine-learning classifiers, (b) the same pair of mentions can fall into multiple sieves, (c) later sieves can override the decisions made by earlier sieves, allowing the system to recover from errors as additional evidence becomes available. [sent-50, score-0.981]

7 As sieves of classifiers are applied, our system attempts to model entities and share the attributes between the mentions belonging to the same entity. [sent-55, score-0.838]

8 However, in this work we allow the sieves to make conflicting decisions on the same pair of mentions. [sent-58, score-0.467]

9 Hence, obtaining entities and their attributes by straightforward transitive closure of co-reference predictions is impossible. [sent-59, score-0.364]

10 2 Baseline System: In this work, we use the state-of-the-art system of (Bengtson and Roth, 2008), which relies on a pairwise scoring function pc to assign an ordered pair of mentions a probability that they are coreferential. [sent-73, score-0.418]

11 It uses a rich set of features including: string edit distance, gender match, whether the mentions appear in the same sentence, whether the heads are synonyms in WordNet etc. [sent-74, score-0.437]

12 For the end system, we keep these parameters intact; our only modifications are adding knowledge-rich features and adding intermediate classification sieves to the training and the inference, which we discuss in the following sections. [sent-76, score-0.419]

13 For each mention m in document d, let Bm be the set of mentions appearing before m in d. [sent-78, score-0.463]

14 At this stage, we ask the reader to ignore the knowledge attributes at the bottom of the figure. [sent-83, score-0.264]

15 Let us assume that the pairwise classifier labeled the mentions (m2, m5) co-referent because they have identical surface form; the mentions (m1, m4) were labeled co-referent because their heads are synonyms in WordNet. [sent-84, score-0.726]

16 A set of knowledge attributes for selected mentions is shown as well. [sent-89, score-0.508]

17 We describe how to inject this knowledge into mentions in Sec. [sent-95, score-0.406]

18 Fig. 3 illustrates the knowledge attributes our system injects into two sample mentions at this stage. [sent-99, score-0.508]

19 3 we describe a compatibility metric our system learns over the injected knowledge. [sent-102, score-0.214]

20 The nationality is assigned by matching the tokens in the original (unprocessed) categories of the Wikipedia page to a list of countries. [sent-133, score-0.216]

21 For each token, we track the list of titles it appears in, and if the union of the nationalities assigned to those titles contains fewer than 7 nationalities, we mark the token compatible with these nationalities. [sent-135, score-0.219]

22 Injecting Knowledge Attributes: Once we have extracted the knowledge attributes of Wikipedia pages, we need to inject them into the mentions. [sent-139, score-0.363]

23 Therefore they used YAGO only for mention pairs where one mention was an NE of type PER/LOC/ORG and the other was a common noun. [sent-141, score-0.499]

24 1 and will motivate us to add features conservatively when building the attribute compatibility metric in Sec. [sent-147, score-0.208]

25 Additionally, while (Rahman and Ng, 2011) uses the union of all possible meanings a mention may have in Wikipedia, we deploy GLOW (Ratinov et al., 2011), a context-sensitive disambiguation-to-Wikipedia system. [sent-150, score-0.219]

26 Using context-sensitive disambiguation to Wikipedia as well as a high-precision set of knowledge attributes allows us to inject knowledge into more mention pairs than (Rahman and Ng, 2011). [sent-152, score-0.761]

27 Our exact heuristic for injecting knowledge attributes into mentions is as follows: Named Entities with Wikipedia Disambiguation: If the mention head is an NE matched to a Wikipedia page p by GLOW, we import all the knowledge attributes from p. [sent-153, score-1.119]

28 Head and Extent Keywords: If the mention head is not mapped to Wikipedia by GLOW and the head contains keywords which appear in the list of 2088 fine-grained entity types, then the rightmost such keyword is added to the list of mention knowledge attributes. [sent-158, score-0.688]

29 This allows us to inject knowledge into mentions unmapped to Wikipedia, such as “{current Cycle World publisher [Larry Little]}”, which is assigned the attribute publisher but not world or cycle. [sent-164, score-0.524]

30 Learning Attributes Compatibility: In the previous section we have assigned knowledge attributes to the mentions. [sent-167, score-0.306]

31 The only non-trivial feature is measuring compatibility between sets of fine-grained entity types, which we describe below. [sent-171, score-0.235]

32 Let us assume that mention m1 was assigned the set of fine-grained entity types S1 and the mention m2 was assigned the set of fine-grained entity types S2. [sent-172, score-0.662]

33 The reason is that the pair of mentions “(Microsoft, Google) ” are not co-referent despite the fact that they both have the company attribute. [sent-183, score-0.334]

34 Therefore, if our system sees two named entities which share the same fine-grained type but have a large string edit distance, it will label the pair as non-coref. [sent-187, score-0.258]

35 (Bengtson and Roth, 2008; Rahman and Ng, 2011) train a single model for predicting coreference of all mention pairs. [sent-190, score-0.317]

36 (Raghunathan et al., 2010) characterize mention pairs by discourse structure and linguistic properties and apply rules in a prescribed order (high-precision rules first). [sent-194, score-0.28]

37 There is a subtle difference between mention pairs (m1, m2) and (m2, m3). [sent-205, score-0.28]

38 It turns out that the string edit distance feature between two named entities has different “semantics” depending on whether the two mentions appear in the same sentence. [sent-208, score-0.465]

39 Table 1: F1 performance on co-referent mention pairs by sieve type when trained with all data versus sieve-specific data only. [sent-216, score-0.546]

40 3, our goal is to link vessel to Kursk and assign it the Russian/Soviet nationality prior to applying the pairwise co-reference classifier on (vessel, Norwegian ship). [sent-219, score-0.386]

41 Therefore, our goal is to apply the pairwise classifier on pairs in a prescribed order and to propagate the knowledge across mentions. [sent-220, score-0.256]

42 Hence, we divide the mention pairs as follows: Nested: are pairs such as “{{ [city]m1 } of [Jerusalem]m2 } ” where the extent of one of the mentions contains the extent of the other. [sent-223, score-0.657]

43 For some mentions, the extent is the entire clause, so we also added a requirement that mention heads are at most 7 tokens apart. [sent-224, score-0.308]

44 There are 5,804 training samples and 992 testing samples, out of which 208 are co-referent. [sent-226, score-0.207]

45 There are 13,041 training samples and 1,746 testing samples, out of which 86 are co-referent. [sent-229, score-0.207]

46 Adjacent: are pairs of mentions which appear closest to each other on the dependency tree. [sent-230, score-0.305]

47 There are 5,872 training samples and 895 testing samples, out of which 219 are co-referent. [sent-232, score-0.207]

48 SameSentenceOneNer: are pairs which appear in the same sentence and exactly one of the mentions is a named entity, and the other is not a pronoun. [sent-233, score-0.381]

49 There are 15,715 training samples and 2,635 testing samples, out of which 207 are co-referent. [sent-236, score-0.207]

50 NerMentionsDiffSent: are pairs of mentions in different sentences, both of which are named entities. [sent-237, score-0.381]

51 There are 189,807 training samples and 24,342 testing samples, out of which 1,628 are co-referent. [sent-238, score-0.207]

52 NonProSameSentence: are pairs in the same sentence, where both mentions are non-pronouns. [sent-239, score-0.305]

53 This sieve includes all the pairs in the SameSentenceOneNer sieve. [sent-240, score-0.327]

54 ClosestNonProDiffSent: are pairs of mentions in different sentences with no other mentions between the two. [sent-244, score-0.549]

55 TopSieve: The set of mention pairs classified by the baseline system. [sent-248, score-0.28]

56 In Table 1 we compare the performance at each sieve in two scenarios. [sent-251, score-0.266]

57 We were surprised to see that the F1 on the nested mentions, when trained on the 5,804 sieve-specific samples, improves to 79. [sent-256, score-0.315]

58 1 1 when trained on the 525,398 top sieve samples. [sent-258, score-0.266]

59 For example, 208 out of the 992 testing samples at the nested sieve are positive, while there are only 86 positive samples out of 1,746 testing samples in the SameSenBothNer sieve. [sent-261, score-0.95]

60 Second, the data for intermediate sieves is not always a subset of the top sieve. [sent-263, score-0.419]

61 The reason is that the top sieve extracts a positive instance only for the closest co-referent mentions, while sieves such as AllSentencePairs extract samples for all co-referent pairs which appear in the same sentence. [sent-264, score-0.808]

62 Third, while our division into sieves may resemble witchcraft, it is motivated by the intuition that mentions appearing close to one another are easier instances of co-ref, as well as by the linguistic insights of (Raghunathan et al., 2010). [sent-265, score-0.571]

63 In our case, C1 is the nested mention pairs classifier, C2 is the SameSenBothNer classifier, and C9 is the top sieve classifier. [sent-273, score-0.662]

64 We design entity-based features so that the subsequent sieves “see” the decisions of the previous sieves and use entity-based features based on the intermediate clustering. [sent-274, score-0.792]

65 Unlike (Raghunathan et al., 2010), we allow the subsequent sieves to change the decisions made by the lower sieves (since additional information becomes available). [sent-276, score-0.7]

66 Intermediate Clustering Features (IC): Let Ri(m) be the set of all mentions which, when paired with the mention m, form valid sample pairs for sieve i. [sent-278, score-0.572]

67 We report pairwise performance on mention pairs because it is the more natural metric for the intermediate sieves. [sent-282, score-0.204]

68 We report only performance on co-referent pairs, because for many sieves, such as the top sieve, 99% of the mention pairs are non-coreferent, hence the baseline of labeling all samples as non-coreferent would result in 99% accuracy. [sent-283, score-0.472]

69 Let Ri+(m) be the set of all mentions which were labeled as co-referent to the mention m by the classifier Ci (including m, which is co-referent to itself). [sent-286, score-0.504]

70 We denote the union of mentions co-refed to m during inference up to sieve i as Ei+(m) = ∪_{j=1}^{i−1} Rj+(m). [sent-288, score-0.51]

71 We note that both ICRi and ICEi can have the values +1 and -1 active at the same time if intermediate sieve classifiers generated conflicting predictions. [sent-292, score-0.415]

72 However, a classifier at sieve i will use as features both ICR1, . . . , ICRi−1 [sent-293, score-0.307]

73 and ICE1, . . . , ICEi−1, thus it will know the lowest sieve at which the conflicting evidence occurs. [sent-299, score-0.323]

74 The classifier at sieve i also uses set identity, set containment, set overlap and other set comparison features between E+/−(mj) and E+/−(mk). [sent-300, score-0.307]

75 We also generate subtypes of set comparison features when restricting the elements to NE-mentions and non-pronominal mentions (e. [sent-302, score-0.244]

76 Surface Form Compatibility (SFC): The intermediate clustering features do not allow us to generalize predictions from pairs of mentions to pairs of surface strings. [sent-306, score-0.511]

77 For example, if we have three mentions {[vessel]m1, [Kursk]m2, [Kursk]m5}, then the prediction on the pair (m1, m2) will not be... Table 2: Utility of knowledge and prediction features (F1 on co-referent mention pairs) by inference sieves. [sent-307, score-0.409]

78 The surface form compatibility features mirror the intermediate clustering features, but relax mention IDs and replace them by surface forms. [sent-311, score-0.582]

79 Therefore, both for (Putin, president) and for (Clinton, president), the surface form compatibility will be +1 and -1 simultaneously. [sent-315, score-0.218]

80 In this work, we are using NLP tools such as a POS tagger, a named entity recognizer, a shallow parser, and a disambiguation-to-Wikipedia system to inject expressive features into a co-reference system. [sent-322, score-0.3]

81 We kept only those documents which contained named entities (according to manual ACE annotation) and at least 1/3 of the named entities started with a capital letter. [sent-327, score-0.284]

82 Following (Culotta et al., 2007) and much other work, to make experiments more comparable across systems, we assume that perfect mention boundaries and mention type labels are given. [sent-331, score-0.438]

83 In Table 2 we report the pairwise F1 scores on co-referent mention pairs broken down by sieve and using different components. [sent-337, score-0.637]

84 This allows us to see, for example, that adding only the knowledge attributes improved the performance at the NonProSameSentence sieve from 63. [sent-338, score-0.53]

85 We have ordered the sieves according to our initial intuition of “easy first”. [sent-341, score-0.327]

86 We were surprised to see that co-ref resolution for named entities in the same sentence was harder than cross-sentence (73. [sent-342, score-0.252]

87 We were also surprised to see that resolving all mention pairs within a sentence when including pronouns was easier than resolving pairs where both mentions were non-pronouns (67. [sent-353, score-0.63]

88 We note that conceptually, the nested (B)+Predictions sieve should be identical to the baseline. [sent-357, score-0.382]

89 However, in practice, the surface form compatibility (SFC) features are generated for the nested sieve as well. [sent-358, score-0.6]

90 Given two mentions m1 and m2, the SFC features capture how many surface forms E+ (m1) and E+ (m2) share. [sent-359, score-0.297]

91 The baseline system assigns each mention to a separate cluster. [sent-387, score-0.219]

92 Our end system first co-refs (m1, m2) at the AllSameSentence sieve due to the knowledge features, and then co-refs (m1, m3) at the top sieve due to surface form compatibility features indicating that province was observed to refer to Jiangxi in the document. [sent-391, score-0.902]

93 Entity-based features currently do not propagate knowledge attributes directly, but through aggregating pairwise predictions at knowledge-infused intermediate sieves. [sent-394, score-0.447]

94 We rely on gold mention boundaries and exhaustive gold co-reference annotation. [sent-395, score-0.219]

95 This prevented us from applying our approach to the Ontonotes dataset where singleton clusters and co-referent nested mentions are removed. [sent-396, score-0.36]

96 Therefore the gold annotation for training several sieves of our scheme is missing (e. [sent-397, score-0.327]

97 However, our experience with a multi-sieve approach with classifiers suggests that a single model would not perform well for both lower sieves with little entity-based information and higher sieves with a lot of entity-based features. [sent-406, score-0.654]

98 (Rahman and Ng, 2011) used the union of all possible interpretations a mention may have in YAGO, which means that Michael Jordan could be co-refed both to a scientist and a basketball player in the same document. [sent-410, score-0.219]

99 We extract context-sensitive, high-precision knowledge attributes from Wikipedia pages and apply (among other features) a WordNet similarity metric on pairs of knowledge attributes to determine attribute compatibility. [sent-414, score-0.632]

100 Exploiting Semantic Role Labeling, WordNet and Wikipedia for Coreference Resolution. [sent-545, score-0.359]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('sieves', 0.327), ('sieve', 0.266), ('mentions', 0.244), ('kursk', 0.229), ('mention', 0.219), ('wikipedia', 0.211), ('attributes', 0.201), ('raghunathan', 0.178), ('vessel', 0.172), ('compatibility', 0.165), ('samples', 0.154), ('bengtson', 0.131), ('rahman', 0.122), ('nested', 0.116), ('ace', 0.1), ('inject', 0.099), ('coreference', 0.098), ('mubarak', 0.096), ('intermediate', 0.092), ('pairwise', 0.091), ('province', 0.089), ('ceaf', 0.089), ('nationality', 0.082), ('muc', 0.077), ('glow', 0.076), ('jiangxi', 0.076), ('sfc', 0.076), ('named', 0.076), ('ship', 0.074), ('strube', 0.074), ('president', 0.071), ('entity', 0.07), ('roth', 0.069), ('yago', 0.066), ('clinton', 0.066), ('putin', 0.066), ('entities', 0.066), ('resolution', 0.065), ('titles', 0.064), ('knowledge', 0.063), ('gender', 0.061), ('culotta', 0.061), ('pairs', 0.061), ('faculty', 0.059), ('conflicting', 0.057), ('nastase', 0.055), ('disambiguation', 0.055), ('testing', 0.053), ('surface', 0.053), ('heads', 0.053), ('company', 0.053), ('transitive', 0.052), ('russian', 0.052), ('page', 0.051), ('wordnet', 0.05), ('injected', 0.049), ('nationalities', 0.049), ('ponzetto', 0.049), ('ng', 0.048), ('haghighi', 0.048), ('pc', 0.046), ('decisions', 0.046), ('prediction', 0.045), ('closure', 0.045), ('surprised', 0.045), ('norwegian', 0.045), ('encyclopedic', 0.044), ('attribute', 0.043), ('ratinov', 0.042), ('assigned', 0.042), ('classifier', 0.041), ('categories', 0.041), ('edit', 0.041), ('head', 0.04), ('wick', 0.039), ('allsentencepairs', 0.038), ('battle', 0.038), ('celebrity', 0.038), ('hosni', 0.038), ('icei', 0.038), ('icri', 0.038), ('lowrecall', 0.038), ('noncoreferent', 0.038), ('nonprosamesentence', 0.038), ('publisher', 0.038), ('recipients', 0.038), ('samesenbothner', 0.038), ('samesentenceonener', 0.038), ('sneh', 0.038), ('submarine', 0.038), ('torpedo', 0.038), ('vilalta', 0.038), ('wikirelate', 0.038), ('category', 0.038), ('string', 0.038), ('stages', 0.038), ('heuristic', 0.037), ('pair', 0.037), ('keywords', 0.037), ('extent', 0.036)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.000001 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge

Author: Lev Ratinov ; Dan Roth

Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on the non-transcript portion of the ACE 2004 dataset.

2 0.33188197 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents

Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky

Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.

3 0.21264155 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic

Author: Yang Song ; Jing Jiang ; Wayne Xin Zhao ; Sujian Li ; Houfeng Wang

Abstract: Pairwise coreference resolution models must merge pairwise coreference decisions to generate final outputs. Traditional merging methods adopt different strategies such as the best-first method and enforcing the transitivity constraint, but most of these methods are used independently of the pairwise learning methods as an isolated inference procedure at the end. We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic. Experimental results show that our joint learning system outperforms independent learning systems. Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. Compared with the best system from CoNLL-2011, which employs a rule-based method, our system shows competitive performance.

4 0.19692859 19 emnlp-2012-An Entity-Topic Model for Entity Linking

Author: Xianpei Han ; Le Sun

Abstract: Entity Linking (EL) has received considerable attention in recent years. Given many name mentions in a document, the goal of EL is to predict their referent entities in a knowledge base. Traditionally, there have been two distinct directions of EL research: one focusing on the effects of mention’s context compatibility, assuming that “the referent entity of a mention is reflected by its context”; the other dealing with the effects of document’s topic coherence, assuming that “a mention’s referent entity should be coherent with the document’s main topics”. In this paper, we propose a generative model called entity-topic model, to effectively join the above two complementary directions together. By jointly modeling and exploiting the context compatibility, the topic coherence and the correlation between them, our model can accurately link all mentions in a document using both the local information (including the words and the mentions in a document) and the global knowledge (including the topic knowledge, the entity context knowledge and the entity name knowledge). Experimental results demonstrate the effectiveness of the proposed model.

5 0.1698243 84 emnlp-2012-Linking Named Entities to Any Database

Author: Avirup Sil ; Ernest Cronin ; Penghai Nie ; Yinfei Yang ; Ana-Maria Popescu ; Alexander Yates

Abstract: Existing techniques for disambiguating named entities in text mostly focus on Wikipedia as a target catalog of entities. Yet for many types of entities, such as restaurants and cult movies, relational databases exist that contain far more extensive information than Wikipedia. This paper introduces a new task, called Open-Database Named-Entity Disambiguation (Open-DB NED), in which a system must be able to resolve named entities to symbols in an arbitrary database, without requiring labeled data for each new database. We introduce two techniques for Open-DB NED, one based on distant supervision and the other based on domain adaptation. In experiments on two domains, one with poor coverage by Wikipedia and the other with near-perfect coverage, our Open-DB NED strategies outperform a state-of-the-art Wikipedia NED system by over 25% in accuracy.

6 0.14301577 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction

7 0.14161128 72 emnlp-2012-Joint Inference for Event Timeline Construction

8 0.13027957 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities

9 0.12774698 36 emnlp-2012-Domain Adaptation for Coreference Resolution: An Adaptive Ensemble Approach

10 0.125523 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling

11 0.11261031 112 emnlp-2012-Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge

12 0.10588618 41 emnlp-2012-Entity based QA Retrieval

13 0.10518527 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation

14 0.090056829 15 emnlp-2012-Active Learning for Imbalanced Sentiment Classification

15 0.078774489 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings

16 0.075478382 96 emnlp-2012-Name Phylogeny: A Generative Model of String Variation

17 0.072274745 110 emnlp-2012-Reading The Web with Learned Syntactic-Semantic Inference Rules

18 0.067750007 97 emnlp-2012-Natural Language Questions for the Web of Data

19 0.067478523 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers

20 0.065885648 134 emnlp-2012-User Demographics and Language in an Implicit Social Network


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.254), (1, 0.339), (2, -0.013), (3, -0.309), (4, -0.047), (5, -0.104), (6, -0.079), (7, -0.086), (8, 0.087), (9, -0.048), (10, 0.046), (11, -0.012), (12, 0.049), (13, -0.033), (14, 0.03), (15, 0.041), (16, 0.195), (17, 0.006), (18, -0.066), (19, 0.083), (20, 0.01), (21, -0.06), (22, -0.1), (23, 0.046), (24, -0.147), (25, 0.113), (26, 0.015), (27, 0.014), (28, 0.035), (29, 0.058), (30, -0.134), (31, 0.041), (32, -0.031), (33, 0.049), (34, 0.076), (35, -0.027), (36, -0.027), (37, -0.004), (38, 0.027), (39, 0.051), (40, -0.072), (41, 0.009), (42, -0.012), (43, 0.032), (44, -0.035), (45, 0.004), (46, 0.009), (47, -0.1), (48, 0.011), (49, -0.009)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.9646222 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge

Author: Lev Ratinov ; Dan Roth

Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on the non-transcript portion of the ACE 2004 dataset.

2 0.78557873 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents

Author: Heeyoung Lee ; Marta Recasens ; Angel Chang ; Mihai Surdeanu ; Dan Jurafsky

Abstract: We introduce a novel coreference resolution system that models entities and events jointly. Our iterative method cautiously constructs clusters of entity and event mentions using linear regression to model cluster merge operations. As clusters are built, information flows between entity and event clusters through features that model semantic role dependencies. Our system handles nominal and verbal events as well as entities, and our joint formulation allows information from event coreference to help entity coreference, and vice versa. In a cross-document domain with comparable documents, joint coreference resolution performs significantly better (over 3 CoNLL F1 points) than two strong baselines that resolve entities and events separately.

3 0.76920837 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic

Author: Yang Song ; Jing Jiang ; Wayne Xin Zhao ; Sujian Li ; Houfeng Wang

Abstract: Pairwise coreference resolution models must merge pairwise coreference decisions to generate final outputs. Traditional merging methods adopt different strategies such as the best-first method and enforcing the transitivity constraint, but most of these methods are used independently of the pairwise learning methods as an isolated inference procedure at the end. We propose a joint learning model which combines pairwise classification and mention clustering with Markov logic. Experimental results show that our joint learning system outperforms independent learning systems. Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. Compared with the best system from CoNLL-2011, which employs a rule-based method, our system shows competitive performance.

4 0.57265085 19 emnlp-2012-An Entity-Topic Model for Entity Linking

Author: Xianpei Han ; Le Sun

Abstract: Entity Linking (EL) has received considerable attention in recent years. Given many name mentions in a document, the goal of EL is to predict their referent entities in a knowledge base. Traditionally, there have been two distinct directions of EL research: one focusing on the effects of mention’s context compatibility, assuming that “the referent entity of a mention is reflected by its context”; the other dealing with the effects of document’s topic coherence, assuming that “a mention’s referent entity should be coherent with the document’s main topics”. In this paper, we propose a generative model called entity-topic model, to effectively join the above two complementary directions together. By jointly modeling and exploiting the context compatibility, the topic coherence and the correlation between them, our model can accurately link all mentions in a document using both the local information (including the words and the mentions in a document) and the global knowledge (including the topic knowledge, the entity context knowledge and the entity name knowledge). Experimental results demonstrate the effectiveness of the proposed model.

5 0.55987018 84 emnlp-2012-Linking Named Entities to Any Database

Author: Avirup Sil ; Ernest Cronin ; Penghai Nie ; Yinfei Yang ; Ana-Maria Popescu ; Alexander Yates

Abstract: Existing techniques for disambiguating named entities in text mostly focus on Wikipedia as a target catalog of entities. Yet for many types of entities, such as restaurants and cult movies, relational databases exist that contain far more extensive information than Wikipedia. This paper introduces a new task, called Open-Database Named-Entity Disambiguation (Open-DB NED), in which a system must be able to resolve named entities to symbols in an arbitrary database, without requiring labeled data for each new database. We introduce two techniques for Open-DB NED, one based on distant supervision and the other based on domain adaptation. In experiments on two domains, one with poor coverage by Wikipedia and the other with near-perfect coverage, our Open-DB NED strategies outperform a state-of-the-art Wikipedia NED system by over 25% in accuracy.

6 0.50502759 91 emnlp-2012-Monte Carlo MCMC: Efficient Inference by Approximate Sampling

7 0.50290465 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities

8 0.46956903 41 emnlp-2012-Entity based QA Retrieval

9 0.43400243 36 emnlp-2012-Domain Adaptation for Coreference Resolution: An Adaptive Ensemble Approach

10 0.41355953 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction

11 0.40441692 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation

12 0.39270008 72 emnlp-2012-Joint Inference for Event Timeline Construction

13 0.38906401 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings

14 0.35676321 96 emnlp-2012-Name Phylogeny: A Generative Model of String Variation

15 0.30987713 112 emnlp-2012-Resolving Complex Cases of Definite Pronouns: The Winograd Schema Challenge

16 0.29793382 15 emnlp-2012-Active Learning for Imbalanced Sentiment Classification

17 0.29386672 9 emnlp-2012-A Sequence Labelling Approach to Quote Attribution

18 0.28615582 134 emnlp-2012-User Demographics and Language in an Implicit Social Network

19 0.23547006 16 emnlp-2012-Aligning Predicates across Monolingual Comparable Texts using Graph-based Clustering

20 0.22517891 32 emnlp-2012-Detecting Subgroups in Online Discussions by Modeling Positive and Negative Relations among Participants


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(2, 0.021), (16, 0.025), (25, 0.025), (34, 0.052), (60, 0.12), (63, 0.04), (64, 0.013), (65, 0.468), (70, 0.011), (74, 0.033), (76, 0.041), (80, 0.026), (86, 0.035), (95, 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

1 0.99182481 40 emnlp-2012-Ensemble Semantics for Large-scale Unsupervised Relation Extraction

Author: Bonan Min ; Shuming Shi ; Ralph Grishman ; Chin-Yew Lin

Abstract: Discovering significant types of relations from the web is challenging because of its open nature. Unsupervised algorithms are developed to extract relations from a corpus without knowing the relations in advance, but most of them rely on tagging arguments of predefined types. Recently, a new algorithm was proposed to jointly extract relations and their argument semantic classes, taking a set of relation instances extracted by an open IE algorithm as input. However, it cannot handle polysemy of relation phrases and fails to group many similar (“synonymous”) relation instances because of the sparseness of features. In this paper, we present a novel unsupervised algorithm that provides a more general treatment of the polysemy and synonymy problems. The algorithm incorporates various knowledge sources which we will show to be very effective for unsupervised extraction. Moreover, it explicitly disambiguates polysemous relation phrases and groups synonymous ones. While maintaining approximately the same precision, the algorithm achieves significant improvement on recall compared to the previous method. It is also very efficient. Experiments on a real-world dataset show that it can handle 14.7 million relation instances and extract a very large set of relations from the web.

1 Introduction: Relation extraction aims at discovering semantic relations between entities. It is an important task that has many applications in answering factoid questions, building knowledge bases and improving search engine relevance. The web has become a massive potential source of such relations. However, its open nature brings an open-ended set of relation types. To extract these relations, a system should not assume a fixed set of relation types, nor rely on a fixed set of relation argument types. The past decade has seen some promising solutions, unsupervised relation extraction (URE) algorithms that extract relations from a corpus without knowing the relations in advance. However, most algorithms (Hasegawa et al., 2004; Shinyama and Sekine, 2006; Chen et al., 2005) rely on tagging predefined types of entities as relation arguments, and thus are not well-suited for the open domain. Recently, Kok and Domingos (2008) proposed the Semantic Network Extractor (SNE), which generates argument semantic classes and sets of synonymous relation phrases at the same time, thus avoiding the requirement of tagging relation arguments of predefined types. However, SNE has 2 limitations: 1) Following previous URE algorithms, it only uses features from the set of input relation instances for clustering. Empirically we found that it fails to group many relevant relation instances. These features, such as the surface forms of arguments and lexical sequences in between, are very sparse in practice. In contrast, there exist several well-known corpus-level semantic resources that can be automatically derived from a source corpus and are shown to be useful for generating the key elements of a relation: its 2 argument semantic classes and a set of synonymous phrases. For example, semantic classes can be derived from a source corpus with contextual distributional similarity and web table co-occurrences. The “synonymy” problem for clustering relation instances could potentially be better solved by adding these resources. 2) SNE assumes that each entity or relation phrase belongs to exactly one cluster, thus is not able to effectively handle polysemy of relation phrases. An example of a polysemous phrase is be the currency of as in 2 triples

∗Work done during an internship at Microsoft Research Asia.

2 0.91829354 53 emnlp-2012-First Order vs. Higher Order Modification in Distributional Semantics

Author: Gemma Boleda ; Eva Maria Vecchi ; Miquel Cornudella ; Louise McNally

Abstract: Adjectival modification, particularly by expressions that have been treated as higher-order modifiers in the formal semantics tradition, raises interesting challenges for semantic composition in distributional semantic models. We contrast three types of adjectival modifiers – intersectively used color terms (as in white towel, clearly first-order), subsectively used color terms (white wine, which have been modeled as both first- and higher-order), and intensional adjectives (former bassist, clearly higher-order) – and test the ability of different composition strategies to model their behavior. In addition to opening up a new empirical domain for research on distributional semantics, our observations concerning the attested vectors for the different types of adjectives, the nouns they modify, and the resulting noun phrases yield insights into modification that have been little evident in the formal semantics literature to date.

same-paper 3 0.87982917 76 emnlp-2012-Learning-based Multi-Sieve Co-reference Resolution with Knowledge

Author: Lev Ratinov ; Dan Roth

Abstract: We explore the interplay of knowledge and structure in co-reference resolution. To inject knowledge, we use a state-of-the-art system which cross-links (or “grounds”) expressions in free text to Wikipedia. We explore ways of using the resulting grounding to boost the performance of a state-of-the-art co-reference resolution system. To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. Our end system outperforms the state-of-the-art baseline by 2 B3 F1 points on the non-transcript portion of the ACE 2004 dataset.

4 0.5692355 136 emnlp-2012-Weakly Supervised Training of Semantic Parsers

Author: Jayant Krishnamurthy ; Tom Mitchell

Abstract: We present a method for training a semantic parser using only a knowledge base and an unlabeled text corpus, without any individually annotated sentences. Our key observation is that multiple forms of weak supervision can be combined to train an accurate semantic parser: semantic supervision from a knowledge base, and syntactic supervision from dependency-parsed sentences. We apply our approach to train a semantic parser that uses 77 relations from Freebase in its knowledge representation. This semantic parser extracts instances of binary relations with state-of-the-art accuracy, while simultaneously recovering much richer semantic structures, such as conjunctions of multiple relations with partially shared arguments. We demonstrate recovery of this richer structure by extracting logical forms from natural language queries against Freebase. On this task, the trained semantic parser achieves 80% precision and 56% recall, despite never having seen an annotated logical form.

5 0.54533875 30 emnlp-2012-Constructing Task-Specific Taxonomies for Document Collection Browsing

Author: Hui Yang

Abstract: Taxonomies can serve as browsing tools for document collections. However, given an arbitrary collection, pre-constructed taxonomies could not easily adapt to the specific topic/task present in the collection. This paper explores techniques to quickly derive task-specific taxonomies supporting browsing in arbitrary document collections. The supervised approach directly learns semantic distances from users to propose meaningful task-specific taxonomies. The approach aims to produce globally optimized taxonomy structures by incorporating path consistency control and user-generated task specification into the general learning framework. A comparison to state-of-the-art systems and a user study jointly demonstrate that our techniques are highly effective.

6 0.52524269 98 emnlp-2012-No Noun Phrase Left Behind: Detecting and Typing Unlinkable Entities

7 0.50346708 71 emnlp-2012-Joint Entity and Event Coreference Resolution across Documents

8 0.49656346 73 emnlp-2012-Joint Learning for Coreference Resolution with Markov Logic

9 0.49199346 110 emnlp-2012-Reading The Web with Learned Syntactic-Semantic Inference Rules

10 0.49161929 26 emnlp-2012-Building a Lightweight Semantic Model for Unsupervised Information Extraction on Short Listings

11 0.48353571 93 emnlp-2012-Multi-instance Multi-label Learning for Relation Extraction

12 0.4727951 103 emnlp-2012-PATTY: A Taxonomy of Relational Patterns with Semantic Types

13 0.46537754 72 emnlp-2012-Joint Inference for Event Timeline Construction

14 0.46216783 62 emnlp-2012-Identifying Constant and Unique Relations by using Time-Series Text

15 0.45742866 97 emnlp-2012-Natural Language Questions for the Web of Data

16 0.44612396 47 emnlp-2012-Explore Person Specific Evidence in Web Person Name Disambiguation

17 0.44220087 6 emnlp-2012-A New Minimally-Supervised Framework for Domain Word Sense Disambiguation

18 0.43975878 85 emnlp-2012-Local and Global Context for Supervised and Unsupervised Metonymy Resolution

19 0.43676952 101 emnlp-2012-Opinion Target Extraction Using Word-Based Translation Model

20 0.43197778 24 emnlp-2012-Biased Representation Learning for Domain Adaptation