emnlp emnlp2011 emnlp2011-116 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Johannes Hoffart ; Mohamed Amir Yosef ; Ilaria Bordino ; Hagen Furstenau ; Manfred Pinkal ; Marc Spaniol ; Bilyana Taneva ; Stefan Thater ; Gerhard Weikum
Abstract: Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.
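The combination of the three measures named in the abstract can be illustrated as a weighted sum over one joint assignment. This is a minimal sketch, not the authors' implementation: the function name `joint_score`, the dictionary layout, the pairwise-sum form of the coherence term, the entity identifiers, and all numeric values are assumptions made for illustration. Only the constraint that the three weights sum to 1 is taken from the paper's text.

```python
from itertools import combinations


def joint_score(assignment, prior, sim, coh, alpha, beta, gamma):
    """Score one joint mention -> entity assignment (illustrative only).

    assignment: dict mention -> chosen entity
    prior:      dict (mention, entity) -> popularity prior
    sim:        dict (mention, entity) -> context similarity
    coh:        dict frozenset({e1, e2}) -> pairwise coherence
    """
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    pairs = list(assignment.items())
    prior_part = sum(prior.get(p, 0.0) for p in pairs)
    sim_part = sum(sim.get(p, 0.0) for p in pairs)
    # Coherence is summed over all pairs of distinct chosen entities
    # (an assumption of this sketch).
    entities = sorted(set(assignment.values()))
    coh_part = sum(coh.get(frozenset(pair), 0.0)
                   for pair in combinations(entities, 2))
    return alpha * prior_part + beta * sim_part + gamma * coh_part


# Hypothetical numbers for the "Kashmir" / "Page" example.
prior = {("Kashmir", "Kashmir_(region)"): 0.9, ("Kashmir", "Kashmir_(song)"): 0.1,
         ("Page", "Jimmy_Page"): 0.3, ("Page", "Other_Page"): 0.7}
sim = {("Kashmir", "Kashmir_(region)"): 0.2, ("Kashmir", "Kashmir_(song)"): 0.6,
       ("Page", "Jimmy_Page"): 0.7, ("Page", "Other_Page"): 0.1}
coh = {frozenset({"Kashmir_(song)", "Jimmy_Page"}): 0.9}

song_reading = {"Kashmir": "Kashmir_(song)", "Page": "Jimmy_Page"}
region_reading = {"Kashmir": "Kashmir_(region)", "Page": "Other_Page"}

song_score = joint_score(song_reading, prior, sim, coh, 0.2, 0.4, 0.4)
region_score = joint_score(region_reading, prior, sim, coh, 0.2, 0.4, 0.4)
```

With these made-up values the music-related reading scores higher even though the priors favor the other reading, which is the effect the coherence term is meant to produce.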
Reference: text
sentIndex sentText sentNum sentScore
1 Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. [sent-5, score-1.205]
2 This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. [sent-6, score-0.456]
3 It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. [sent-7, score-1.982]
4 The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. [sent-8, score-0.726]
5 1 Motivation Web pages, news articles, blog postings, and other Internet data contain mentions of named entities such as people, places, organizations, etc. [sent-11, score-0.721]
6 Establishing these mappings between the mentions and the actual entities is the problem of named-entity disambiguation (NED). [sent-18, score-0.79]
7 com), or YAGO (Suchanek07), which have harvested Wikipedia redirects and disambiguation pages - then the simplest heuristic for name resolution is to choose the most prominent entity for a given name. [sent-25, score-0.372]
8 This could be the entity with the longest Wikipedia article or the largest number of incoming links in Wikipedia; or the place with the most inhabitants (for cities) or largest area, etc. [sent-26, score-0.331]
9 Alternatively, one could choose the entity that uses the mention most frequently as a hyperlink anchor text. [sent-27, score-0.594]
10 Key to improving the above approaches is to consider the context of the mention to be mapped, and compare it - by some similarity measure - to contextual information about the potential target entities. [sent-30, score-0.435]
11 The candidate entity with the highest similarity is chosen. [sent-33, score-0.431]
12 The key to further improvements is to jointly consider multiple mentions in an input and aim for a collective assignment onto entities (Kulkarni09). [sent-41, score-0.764]
13 This approach should consider the coherence of the resulting entities, in the sense of semantic relatedness, and it should combine such measures with the context similarity scores of each mention-entity pair. [sent-42, score-0.529]
14 In our example, one should treat “Page”, “Plant” and “Gibson” also as named-entity mentions and aim to disambiguate them together with “Kashmir”. [sent-43, score-0.399]
15 Collective disambiguation works very well when a text contains mentions of a sufficiently large number of entities within a thematically homogeneous context. [sent-44, score-0.803]
16 If the text is very short or is about multiple, unrelated or weakly related topics, collective mapping tends to produce errors by directing some mentions towards entities that fit into a single coherent topic but do not capture the given text. [sent-45, score-0.807]
17 For example, a text about a football game between “Manchester” and “Barcelona” that takes place in “Madrid” may end up mapping either all three of these mentions onto football clubs (i. [sent-46, score-0.501]
18 2 Contribution Our approach leverages recently developed knowledge bases like YAGO as an entity catalog and a rich source of entity types and semantic relationships among entities. [sent-51, score-0.563]
19 These are factored into new measures for the similarity and coherence parts of collectively disambiguating all mentions in an input text. [sent-52, score-0.957]
20 We cast the joint mapping into the following graph problem: mentions from the input text and candidate entities define the node set, and we consider weighted edges between mentions and entities, capturing context similarities, and weighted edges among entities, capturing coherence. [sent-54, score-1.372]
21 • For each mention, we compute popularity priors and context similarities for all entity candidates as input for our tests. [sent-60, score-0.463]
22 • We use a threshold test on the prior to decide whether popularity should be used (for mentions with a very high prior) or disregarded (for mentions with several reasonable candidates). [sent-61, score-1.091]
23 • When both the entity priors and the context similarities are reasonably similar in distribution for all the entity candidates, we keep the best candidate and remove all others, fixing this mention before running the coherence graph algorithm. [sent-62, score-1.301]
24 We then run the coherence graph algorithm on all the mentions and their remaining entity candidates. [sent-63, score-1.101]
25 This way, we restrict the coherence graph algorithm to the critical mentions, in situations where the goal of coherence may be misleading or would entail high risk of degradation. [sent-64, score-0.794]
26 (Bunescu06) defined a similarity measure that compared the context of a mention to the Wikipedia categories of an entity candidate. [sent-70, score-0.686]
27 (Milne08) additionally introduced a supervised classifier for mapping mentions to entities, with learned feature weights rather than using the similarity function directly. [sent-72, score-0.59]
28 (Milne08) introduced a notion of semantic relatedness between a mention’s candidate entities and the unambiguous mentions in the textual context. [sent-73, score-0.772]
29 While these features point towards semantic coherence, the approaches are still limited to mapping each mention separately. [sent-76, score-0.329]
30 The first work with an explicit collective-learning model for joint mapping of all mentions has been (Kulkarni09). [sent-79, score-0.457]
31 This method starts with a supervised learner for a similarity prior, and models the pairwise coherence of entity candidates for two different mentions as a probabilistic factor graph with all pairs as factors. [sent-80, score-1.262]
32 Coreference resolution is the task of mapping mentions like pronouns or short phrases to a preceding, more explicit, mention. [sent-84, score-0.457]
33 Recently, interest has arisen in cross-document coreference resolution (Mayfield09), which comes closer to NED, but does not aim at mapping names onto entities in a knowledge base. [sent-85, score-0.465]
34 ) and aim to map them to their proper entries in a knowledge base, thus giving a disambiguated meaning to entity mentions in the text. [sent-98, score-0.65]
35 Entity Candidates: For possible entities (with unique canonical names) that a mention could denote, we harness existing knowledge bases like DBpedia or YAGO. [sent-101, score-0.634]
36 Popularity Prior for Entities: Prominence or popularity of entities can be seen as a probabilistic prior for mapping a name to an entity. [sent-109, score-0.62]
37 Context Similarity of Mentions and Entities: The key for mapping mentions onto entities are the contexts on both sides of the mapping. [sent-111, score-0.77]
38 On the entity side of the mapping, we associate each entity with characteristic keyphrases or salient words, precomputed from Wikipedia articles and similar sources. [sent-116, score-0.718]
39 Now we can define and compute similarity measures between a mention and an entity candidate, e. [sent-119, score-0.708]
40 Coherence among Entities: On the entity side, each entity has a context in the underlying knowledge base(s): other entities that are connected via semantic relationships (e. [sent-123, score-0.829]
41 This way, we can quantify the coherence between two entities by the number of incoming links that their Wikipedia articles share. [sent-129, score-0.683]
42 When we consider candidate entities for different mentions, we can now define and compute a notion of coherence among the corresponding entities, e. [sent-130, score-0.688]
43 Overall Objective Function: To aim for the best disambiguation mappings, our framework combines prior, similarity, and coherence measures into a combined objective function: for each mention mi, i = 1. [sent-134, score-0.76]
44 where α + β + γ = 1, cnd(mi) is the set of possible meanings of mi, cxt( ) denotes the context of mentions and entities, respectively, and coh( ) is the coherence function for a set of entities. [sent-144, score-0.742]
45 For robustness, our solution selectively enables or disables the three components, based on tests on the mentions of the input text; see Section 5. [sent-146, score-0.427]
46 2 Mention-Entity Similarity Keyphrase-based Similarity: On the mention side, we use all tokens in the document (except stopwords and the mention itself) as context. [sent-154, score-0.542]
47 On the entity side, the knowledge base knows authoritative sources for each entity, for example, the corresponding Wikipedia article or an organizational or individual homepage. [sent-156, score-0.34]
48 These are the inputs for an offline data-mining step to determine characteristic keyphrases for each entity and their statistical weights. [sent-157, score-0.424]
49 As keyphrase candidates for an entity we consider its corresponding Wikipedia article’s link anchor texts, including category names, citation titles, and external references. [sent-159, score-0.438]
50 reflecting if w is contained in the keyphrase set of e or any of the keyphrase sets of an entity linking to e, IN(e), with N denoting the total number of entities. [sent-168, score-0.501]
51 For example, the phrase “Grammy Award winner” associated with entity Jimmy Page may occur only in the form “Grammy winner” near some mention “Page”. [sent-171, score-0.522]
52 Therefore, our algorithm for the similarity of mention m with regard to entity e computes partial matches of keyphrases in the text. [sent-172, score-0.8]
53 Using a large text corpus for training, we collect statistics about what kinds of entities tend to occur as subjects of “play”, and then rank the candidate entities according to their compatibility with the verb. [sent-183, score-0.585]
54 Syntax-based similarity between cxt(e) and the context cxt(m) of the mention is then defined as the sum of the scalar-product similarity between these two vectors for each substitute. [sent-189, score-0.537]
55 The former is more compatible with the given context than the latter, leading to higher similarity for the entity Jimmy Page. [sent-192, score-0.384]
56 3 Entity-Entity Coherence As all entities of interest are registered in a knowledge base (like YAGO), we can utilize the semantic type system, which is usually a DAG of classes. [sent-194, score-0.344]
57 The knowledge bases also provide same-as cross-referencing to Wikipedia, and we quantify the coherence between two entities by the number of incoming links that their Wikipedia articles share. [sent-196, score-0.744]
58 2 Graph Algorithm Given a mention-entity graph, our goal is to compute a dense subgraph that would ideally contain all mention nodes and exactly one mention-entity edge for each mention, thus disambiguating all mentions. [sent-198, score-0.484]
59 The first is how to specify a notion of density that is best suited for capturing the coherence of the resulting entity nodes. [sent-200, score-0.658]
60 1 Mention-Entity Graph From the popularity, similarity, and coherence measures discussed in Section 4, we construct a weighted, undirected graph with mentions and candidate entities as nodes. [sent-207, score-1.219]
61 As shown in the example of Figure 1, the graph has two kinds of edges: • A mention-entity edge is weighted with a similarity measure or a combination of popularity and similarity measure. [sent-208, score-0.691]
62 Our experiments will focus on anchor-based popularity, keyphrase-based and/or syntactic similarity, and link-based coherence (mw coh). [sent-211, score-0.343]
63 The mention-entity graph is dense on the entities side and often has hundreds or thousands of nodes, as the YAGO knowledge base offers many candidate entities for common mentions (e. [sent-212, score-1.282]
64 The algorithm starts from the full mention-entity graph and iteratively removes the entity node with the smallest weighted degree. [sent-223, score-0.405]
65 To guarantee that we arrive at a coherent mention-entity mapping for all mentions, we enforce each mention node to remain connected to at least one entity. [sent-225, score-0.358]
66 For this reason, we apply a pre-processing phase to prune the entities that are only remotely related to the mention nodes. [sent-227, score-0.54]
67 For each entity node, we compute the distance from the set of all mention nodes in terms of the sum of the corresponding squared shortest-path distances. (Figure 1: Mention-Entity Graph Example.) [sent-228, score-0.522]
68 We then restrict the input graph to the entity nodes that are closest to the mentions. [sent-229, score-0.359]
69 To overcome these problems, we introduce two robustness tests for individual mentions and, depending on the tests’ outcomes, use only a subset of our framework’s features and techniques. [sent-242, score-0.518]
70 Prior test: Our first test ensures that the popularity prior does not unduly dominate the outcome if the true entities are dominated by false alternatives. [sent-243, score-0.562]
71 We check, for each mention, whether the popularity prior for the most likely candidate entity is above some threshold ρ, e. [sent-244, score-0.591]
72 Coherence test: As a test for whether the coherence part of our framework makes sense or not, we compare the popularity prior and the similarity-only measure, on a per-mention basis. [sent-250, score-0.636]
73 , 1), the disagreement between popularity and similarity-only indicates that there is a situation that coherence may be able to fix. [sent-256, score-0.556]
74 If, on the other hand, there is hardly any disagreement, using coherence as an additional aspect would be risky for thematically heterogeneous texts and should better be disabled. [sent-257, score-0.417]
75 In that case, we choose an entity for the mention at hand, using the combination of prior and similarity. [sent-258, score-0.633]
76 Only the winning entity is included in the mention-entity graph, all other candidates are omitted for the graph algorithm. [sent-259, score-0.387]
77 We use the Stanford NER tagger (Finkel05) to identify mentions in input texts, the YAGO2 knowledge base (Hoffart11) as a repository of entities, and the English Wikipedia edition (as of 2010-08-17) as a source of mining keyphrases and various forms of weights. [sent-263, score-0.581]
78 To avoid unfair comparisons, we created our own dataset based on CoNLL 2003 articles mentions (total) mentions with no entity words per article (avg. [sent-269, score-1.144]
79 ) entities per mention (avg) initial annotator disagreement (%) 1,393 34,956 7,136 216 25 17 21 73 21. [sent-273, score-0.571]
80 • initial number of entities in graph: 5 · #mentions • threshold for coherence test: λ = 0. [sent-290, score-0.343]
81 Both measures can aggregate over all mentions (across all texts) or over all input texts (each with several mentions). [sent-319, score-0.484]
82 As we use a knowledge base with millions of entities, we decided to neglect the situation that a mention may refer to an unknown entity not registered in the knowledge base. [sent-321, score-0.597]
83 We consider only mention-entity pairs where the ground-truth gives a known entity, and thus ignore roughly 20% of the mentions without known entity in the ground-truth. [sent-322, score-0.65]
84 The table includes variants of our framework, with different choices for the similarity and coherence computations. [sent-332, score-0.476]
85 The shorthand notation for the combinations in the table is as follows: prior: popularity prior; r-prior: popularity prior with robustness test; sim-k: keyphrase based similarity measure; sim-s: syntax-based similarity; coh: graph coherence; r-coh: graph coherence with robustness test. [sent-333, score-1.474]
86 The shorthand names for competitors are: Cuc: (Cucerzan07) similarity measure; Kul s: (Kulkarni09) similarity measure only; Kul sp: Kul s combined with popularity prior; Kul CI: Kul sp combined with coherence. [sent-334, score-0.577]
87 All coherence methods use the Milne-Witten inlink overlap measure mw coh. [sent-335, score-0.45]
88 0, which corresponds to the overall correctness of the methods for all mentions that are assigned to an entity in the ground-truth data. [sent-337, score-0.65]
89 The high MAP for the prior method is because we rank by mention-entity edge weight; for prior this is simply the prior probability. [sent-368, score-0.391]
90 As the prior is most probably correct for mentions with a very high prior for their most popular entity (by definition), the initial ranking of the prior is very good, but drops more sharply. [sent-369, score-0.983]
91 We believe that the main difficulty in named entity disambiguation lies exactly in the “long tail” of not-so-prominent entities. [sent-370, score-0.397]
92 We found that our measure already captures a notion of popularity because popular entities have more keyphrases and can thus accumulate a higher total score. [sent-376, score-0.656]
93 The popularity should only be used when one entity has a very high probability, and introducing the robustness test for the prior achieved this, improving on both our similarity and Kul sp. [sent-377, score-0.517]
94 Unconditionally adding the notion of coherence among entities improves the micro-average precision, but not the macro-average. [sent-378, score-0.641]
95 Investigating potential problems, we found that the coherence can be led astray when parts of the document form a coherent cluster of entities, and other entities are then forced to be coherent to this cluster. [sent-379, score-0.67]
96 To overcome this issue, we introduced the coherence robustness test, and the results with r-coh show that it makes sense to fix an entity for a mention when the prior and similarity are in reasonable agreement. [sent-380, score-1.2]
97 Adding this coherence test leads to a significant (p-value < 0. [sent-381, score-0.343]
98 Our experiments showed that when adding this coherence test, around 32 of the mentions are solved using local similarity only and are assigned an entity before running the graph algorithm. [sent-383, score-1.234]
99 shtml Xianpei Han, Jun Zhao: Named entity disambiguation by leveraging wikipedia semantic knowledge. [sent-422, score-0.481]
100 de / yago-naga / yago / Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti: Collective annotation of Wikipedia entities in web text. [sent-428, score-0.395]
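The graph algorithm summarized in the extracted sentences above (start from the full mention-entity graph, repeatedly remove the entity node with the smallest weighted degree, and never remove a mention's last remaining candidate) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the function name `greedy_disambiguate`, the dictionary-based graph encoding, the final tie-breaking by mention-entity weight, and all example weights are hypothetical, and the distance-based pre-pruning step is omitted.

```python
def greedy_disambiguate(me_weights, ee_weights):
    """Greedy entity-node removal on a mention-entity graph (sketch).

    me_weights: dict (mention, entity) -> mention-entity edge weight
    ee_weights: dict frozenset({e1, e2}) -> entity-entity edge weight
    Returns a dict mention -> chosen entity.
    """
    candidates = {}
    for (mention, entity) in me_weights:
        candidates.setdefault(mention, set()).add(entity)
    alive = {e for cands in candidates.values() for e in cands}

    def weighted_degree(entity):
        # Mention nodes are never removed, so all mention edges count.
        total = sum(w for (_, e), w in me_weights.items() if e == entity)
        # Entity-entity edges count only while both endpoints survive.
        total += sum(w for pair, w in ee_weights.items()
                     if entity in pair and pair <= alive)
        return total

    while True:
        # An entity is protected if it is some mention's last candidate.
        protected = {next(iter(c & alive))
                     for c in candidates.values() if len(c & alive) == 1}
        removable = alive - protected
        if not removable:
            break
        victim = min(sorted(removable), key=weighted_degree)
        alive.remove(victim)

    # If a mention still has several candidates, keep the one with the
    # strongest mention-entity edge (an assumption of this sketch).
    return {m: max(sorted(c & alive), key=lambda e: me_weights[(m, e)])
            for m, c in candidates.items()}


# Hypothetical weights for the "Kashmir" / "Page" example.
me_weights = {("Kashmir", "Kashmir_(region)"): 0.5,
              ("Kashmir", "Kashmir_(song)"): 0.4,
              ("Page", "Jimmy_Page"): 0.5,
              ("Page", "Other_Page"): 0.45}
ee_weights = {frozenset({"Kashmir_(song)", "Jimmy_Page"}): 0.9}

mapping = greedy_disambiguate(me_weights, ee_weights)
```

In this toy input the local edge weight alone would map "Kashmir" to the region, but the coherence edge raises the weighted degree of the song and of Jimmy Page, so the weakly connected candidates are removed first.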
wordName wordTfidf (topN-words)
[('mentions', 0.399), ('coherence', 0.343), ('mention', 0.271), ('entities', 0.269), ('entity', 0.251), ('kul', 0.209), ('popularity', 0.182), ('aida', 0.145), ('keyphrases', 0.145), ('wikipedia', 0.137), ('similarity', 0.133), ('yago', 0.126), ('keyphrase', 0.125), ('prior', 0.111), ('graph', 0.108), ('kashmir', 0.097), ('disambiguation', 0.093), ('robustness', 0.091), ('gerhard', 0.083), ('coh', 0.08), ('cxt', 0.08), ('weikum', 0.08), ('subgraph', 0.079), ('anchor', 0.072), ('dbpedia', 0.069), ('page', 0.066), ('grammy', 0.064), ('jimmy', 0.064), ('sofie', 0.064), ('names', 0.063), ('ned', 0.063), ('bases', 0.061), ('mapping', 0.058), ('edge', 0.058), ('winner', 0.055), ('measures', 0.053), ('named', 0.053), ('collective', 0.052), ('article', 0.052), ('chords', 0.048), ('inlink', 0.048), ('mauro', 0.048), ('mentionentity', 0.048), ('sozio', 0.048), ('withheld', 0.048), ('www', 0.047), ('dense', 0.047), ('candidate', 0.047), ('gibson', 0.047), ('weighted', 0.046), ('onto', 0.044), ('articles', 0.043), ('thematically', 0.042), ('cnd', 0.042), ('rock', 0.042), ('mi', 0.041), ('fabian', 0.038), ('registered', 0.038), ('base', 0.037), ('conll', 0.037), ('precision', 0.037), ('suchanek', 0.035), ('competitors', 0.035), ('ner', 0.035), ('density', 0.035), ('link', 0.034), ('canonical', 0.033), ('ci', 0.032), ('bilyana', 0.032), ('eji', 0.032), ('guitar', 0.032), ('guitarist', 0.032), ('himalaya', 0.032), ('kashmi', 0.032), ('lastnames', 0.032), ('planck', 0.032), ('ramakrishnan', 0.032), ('simscore', 0.032), ('unconditionally', 0.032), ('webgraph', 0.032), ('texts', 0.032), ('coreference', 0.031), ('measure', 0.031), ('disagreement', 0.031), ('priors', 0.03), ('mappings', 0.029), ('apple', 0.029), ('disambiguating', 0.029), ('alternatively', 0.029), ('coherent', 0.029), ('notion', 0.029), ('tests', 0.028), ('characteristic', 0.028), ('mw', 0.028), ('incoming', 0.028), ('relatedness', 0.028), ('candidates', 0.028), ('asset', 0.028), ('discount', 0.028), ('harvested', 
0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999952 116 emnlp-2011-Robust Disambiguation of Named Entities in Text
2 0.23708875 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni
Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
3 0.17157678 128 emnlp-2011-Structured Relation Discovery using Generative Models
Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum
Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.
4 0.15415676 114 emnlp-2011-Relation Extraction with Relation Topics
Author: Chang Wang ; James Fan ; Aditya Kalyanpur ; David Gondek
Abstract: This paper describes a novel approach to the semantic relation detection problem. Instead of relying only on the training instances for a new relation, we leverage the knowledge learned from previously trained relation detectors. Specifically, we detect a new semantic relation by projecting the new relation’s training instances onto a lower dimension topic space constructed from existing relation detectors through a three step process. First, we construct a large relation repository of more than 7,000 relations from Wikipedia. Second, we construct a set of non-redundant relation topics defined at multiple scales from the relation repository to characterize the existing relations. Similar to the topics defined over words, each relation topic is an interpretable multinomial distribution over the existing relations. Third, we integrate the relation topics in a kernel function, and use it together with SVM to construct detectors for new relations. The experimental results on Wikipedia and ACE data have confirmed that background-knowledge-based topics generated from the Wikipedia relation repository can significantly improve the performance over the state-of-the-art relation detection approaches.
5 0.13406117 101 emnlp-2011-Optimizing Semantic Coherence in Topic Models
Author: David Mimno ; Hanna Wallach ; Edmund Talley ; Miriam Leenders ; Andrew McCallum
Abstract: Latent variable models have the potential to add value to large document collections by discovering interpretable, low-dimensional subspaces. In order for people to use such models, however, they must trust them. Unfortunately, typical dimensionality reduction methods for text, such as latent Dirichlet allocation, often produce low-dimensional subspaces (topics) that are obviously flawed to human domain experts. The contributions of this paper are threefold: (1) An analysis of the ways in which topics can be flawed; (2) an automated evaluation metric for identifying such topics that does not rely on human annotators or reference collections outside the training data; (3) a novel statistical topic model based on this metric that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).
6 0.10735326 90 emnlp-2011-Linking Entities to a Knowledge Base with Query Expansion
7 0.091809496 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues
8 0.075390555 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts
9 0.074177772 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
10 0.069622472 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
11 0.066081032 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week
12 0.061110195 112 emnlp-2011-Refining the Notions of Depth and Density in WordNet-based Semantic Similarity Measures
13 0.060837489 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction
14 0.060000468 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking
15 0.059373144 72 emnlp-2011-Improved Transliteration Mining Using Graph Reinforcement
16 0.056877859 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use
17 0.054419201 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
18 0.053718209 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge
19 0.053595398 107 emnlp-2011-Probabilistic models of similarity in syntactic context
20 0.050128393 7 emnlp-2011-A Joint Model for Extended Semantic Role Labeling
topicId topicWeight
[(0, 0.195), (1, -0.174), (2, -0.118), (3, -0.056), (4, -0.143), (5, -0.085), (6, 0.096), (7, -0.032), (8, -0.027), (9, 0.08), (10, 0.081), (11, 0.084), (12, -0.076), (13, 0.073), (14, 0.055), (15, -0.076), (16, 0.11), (17, 0.109), (18, 0.144), (19, -0.192), (20, -0.237), (21, -0.038), (22, -0.084), (23, -0.219), (24, -0.04), (25, 0.051), (26, 0.008), (27, 0.153), (28, 0.055), (29, -0.103), (30, 0.155), (31, 0.035), (32, 0.103), (33, -0.0), (34, -0.041), (35, -0.008), (36, -0.118), (37, -0.084), (38, 0.017), (39, -0.028), (40, 0.135), (41, -0.099), (42, -0.158), (43, 0.06), (44, 0.019), (45, 0.068), (46, -0.007), (47, 0.019), (48, -0.007), (49, 0.05)]
simIndex simValue paperId paperTitle
same-paper 1 0.97889924 116 emnlp-2011-Robust Disambiguation of Named Entities in Text
2 0.65927458 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
Author: Alan Ritter ; Sam Clark ; Mausam ; Oren Etzioni
Abstract: People tweet more than 100 Million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
3 0.54152626 128 emnlp-2011-Structured Relation Discovery using Generative Models
Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum
Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine the effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.
4 0.52019405 84 emnlp-2011-Learning the Information Status of Noun Phrases in Spoken Dialogues
Author: Altaf Rahman ; Vincent Ng
Abstract: An entity in a dialogue may be old, new, or mediated/inferrable with respect to the hearer’s beliefs. Knowing the information status of the entities participating in a dialogue can therefore facilitate its interpretation. We address the under-investigated problem of automatically determining the information status of discourse entities. Specifically, we extend Nissim’s (2006) machine learning approach to information-status determination with lexical and structured features, and exploit learned knowledge of the information status of each discourse entity for coreference resolution. Experimental results on a set of Switchboard dialogues reveal that (1) incorporating our proposed features into Nissim’s feature set enables our system to achieve state-of-the-art performance on information-status classification, and (2) the resulting information can be used to improve the performance of learning-based coreference resolvers.
5 0.47445005 114 emnlp-2011-Relation Extraction with Relation Topics
Author: Chang Wang ; James Fan ; Aditya Kalyanpur ; David Gondek
Abstract: This paper describes a novel approach to the semantic relation detection problem. Instead of relying only on the training instances for a new relation, we leverage the knowledge learned from previously trained relation detectors. Specifically, we detect a new semantic relation by projecting the new relation’s training instances onto a lower dimension topic space constructed from existing relation detectors through a three-step process. First, we construct a large relation repository of more than 7,000 relations from Wikipedia. Second, we construct a set of non-redundant relation topics defined at multiple scales from the relation repository to characterize the existing relations. Similar to the topics defined over words, each relation topic is an interpretable multinomial distribution over the existing relations. Third, we integrate the relation topics in a kernel function, and use it together with SVM to construct detectors for new relations. The experimental results on Wikipedia and ACE data have confirmed that background-knowledge-based topics generated from the Wikipedia relation repository can significantly improve the performance over the state-of-the-art relation detection approaches.
6 0.40743154 64 emnlp-2011-Harnessing different knowledge sources to measure semantic relatedness under a uniform model
7 0.3770206 90 emnlp-2011-Linking Entities to a Knowledge Base with Query Expansion
8 0.35436532 23 emnlp-2011-Bootstrapped Named Entity Recognition for Product Attribute Extraction
9 0.34060064 112 emnlp-2011-Refining the Notions of Depth and Density in WordNet-based Semantic Similarity Measures
10 0.33806953 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts
11 0.29794547 72 emnlp-2011-Improved Transliteration Mining Using Graph Reinforcement
12 0.29110384 143 emnlp-2011-Unsupervised Information Extraction with Distributional Prior Knowledge
13 0.27262551 29 emnlp-2011-Collaborative Ranking: A Case Study on Entity Linking
14 0.25621146 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
15 0.2410367 85 emnlp-2011-Learning to Simplify Sentences with Quasi-Synchronous Grammar and Integer Programming
16 0.23426238 63 emnlp-2011-Harnessing WordNet Senses for Supervised Sentiment Classification
17 0.23408289 101 emnlp-2011-Optimizing Semantic Coherence in Topic Models
18 0.22716616 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base
19 0.22171153 145 emnlp-2011-Unsupervised Semantic Role Induction with Graph Partitioning
20 0.21722677 18 emnlp-2011-Analyzing Methods for Improving Precision of Pivot Based Bilingual Dictionaries
topicId topicWeight
[(15, 0.011), (23, 0.102), (36, 0.022), (37, 0.023), (43, 0.283), (45, 0.109), (53, 0.017), (54, 0.015), (57, 0.021), (62, 0.025), (64, 0.014), (66, 0.038), (69, 0.025), (79, 0.038), (82, 0.032), (87, 0.014), (96, 0.083), (98, 0.034)]
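The topic vector above is sparse: it lists only (topicId, topicWeight) pairs with non-negligible weight. As a rough illustration of how two such vectors could be compared, the sketch below computes a cosine similarity over sparse topic vectors. This is an assumption made for illustration only: the listing does not state which measure produced its simValue column, so the sketch is not expected to reproduce the listed scores, and the helper names (`cosine_similarity`, `dominant_topic`) are hypothetical.

```python
import math

# Sparse topic vector copied from the listing: (topicId, topicWeight) pairs.
paper_topics = dict([
    (15, 0.011), (23, 0.102), (36, 0.022), (37, 0.023), (43, 0.283),
    (45, 0.109), (53, 0.017), (54, 0.015), (57, 0.021), (62, 0.025),
    (64, 0.014), (66, 0.038), (69, 0.025), (79, 0.038), (82, 0.032),
    (87, 0.014), (96, 0.083), (98, 0.034),
])


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {topicId: weight}.

    Assumption: cosine is used here only as a common choice; the listing
    does not say how its simValue scores were actually computed.
    """
    # Only topics present in both vectors contribute to the dot product.
    dot = sum(weight * b[topic] for topic, weight in a.items() if topic in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dominant_topic(vector):
    """Return the topicId carrying the largest weight."""
    return max(vector, key=vector.get)
```

For example, `dominant_topic(paper_topics)` picks out topic 43, the entry with the largest weight (0.283) in the vector above, and two vectors that share no topic ids get a similarity of 0.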
simIndex simValue paperId paperTitle
same-paper 1 0.7927978 116 emnlp-2011-Robust Disambiguation of Named Entities in Text
Author: Johannes Hoffart ; Mohamed Amir Yosef ; Ilaria Bordino ; Hagen Furstenau ; Manfred Pinkal ; Marc Spaniol ; Bilyana Taneva ; Stefan Thater ; Gerhard Weikum
Abstract: Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base such as DBpedia or YAGO. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. It unifies prior approaches into a comprehensive framework that combines three measures: the prior probability of an entity being mentioned, the similarity between the contexts of a mention and a candidate entity, as well as the coherence among candidate entities for all mentions together. The method builds a weighted graph of mentions and candidate entities, and computes a dense subgraph that approximates the best joint mention-entity mapping. Experiments show that the new method significantly outperforms prior methods in terms of accuracy, with robust behavior across a variety of inputs.
2 0.53013623 101 emnlp-2011-Optimizing Semantic Coherence in Topic Models
Author: David Mimno ; Hanna Wallach ; Edmund Talley ; Miriam Leenders ; Andrew McCallum
Abstract: Latent variable models have the potential to add value to large document collections by discovering interpretable, low-dimensional subspaces. In order for people to use such models, however, they must trust them. Unfortunately, typical dimensionality reduction methods for text, such as latent Dirichlet allocation, often produce low-dimensional subspaces (topics) that are obviously flawed to human domain experts. The contributions of this paper are threefold: (1) An analysis of the ways in which topics can be flawed; (2) an automated evaluation metric for identifying such topics that does not rely on human annotators or reference collections outside the training data; (3) a novel statistical topic model based on this metric that significantly improves topic quality in a large-scale document collection from the National Institutes of Health (NIH).
3 0.5188396 37 emnlp-2011-Cross-Cutting Models of Lexical Semantics
Author: Joseph Reisinger ; Raymond Mooney
Abstract: Context-dependent word similarity can be measured over multiple cross-cutting dimensions. For example, lung and breath are similar thematically, while authoritative and superficial occur in similar syntactic contexts, but share little semantic similarity. Both of these notions of similarity play a role in determining word meaning, and hence lexical semantic models must take them both into account. Towards this end, we develop a novel model, Multi-View Mixture (MVM), that represents words as multiple overlapping clusterings. MVM finds multiple data partitions based on different subsets of features, subject to the marginal constraint that feature subsets are distributed according to Latent Dirichlet Allocation. Intuitively, this constraint favors feature partitions that have coherent topical semantics. Furthermore, MVM uses soft feature assignment, hence the contribution of each data point to each clustering view is variable, isolating the impact of data only to views where they assign the most features. Through a series of experiments, we demonstrate the utility of MVM as an inductive bias for capturing relations between words that are intuitive to humans, outperforming related models such as Latent Dirichlet Allocation.
4 0.51787579 103 emnlp-2011-Parser Evaluation over Local and Non-Local Deep Dependencies in a Large Corpus
Author: Emily M. Bender ; Dan Flickinger ; Stephan Oepen ; Yi Zhang
Abstract: In order to obtain a fine-grained evaluation of parser accuracy over naturally occurring text, we study 100 examples each of ten reasonably frequent linguistic phenomena, randomly selected from a parsed version of the English Wikipedia. We construct a corresponding set of gold-standard target dependencies for these 1000 sentences, operationalize mappings to these targets from seven state-of-the-art parsers, and evaluate the parsers against this data to measure their level of success in identifying these dependencies.
5 0.51261646 33 emnlp-2011-Cooooooooooooooollllllllllllll!!!!!!!!!!!!!! Using Word Lengthening to Detect Sentiment in Microblogs
Author: Samuel Brody ; Nicholas Diakopoulos
Abstract: We present an automatic method which leverages word lengthening to adapt a sentiment lexicon specifically for Twitter and similar social messaging networks. The contributions of the paper are as follows. First, we call attention to lengthening as a widespread phenomenon in microblogs and social messaging, and demonstrate the importance of handling it correctly. We then show that lengthening is strongly associated with subjectivity and sentiment. Finally, we present an automatic method which leverages this association to detect domain-specific sentiment- and emotion-bearing words. We evaluate our method by comparison to human judgments, and analyze its strengths and weaknesses. Our results are of interest to anyone analyzing sentiment in microblogs and social networks, whether for research or commercial purposes.
6 0.51181245 128 emnlp-2011-Structured Relation Discovery using Generative Models
7 0.50932872 107 emnlp-2011-Probabilistic models of similarity in syntactic context
8 0.50820535 81 emnlp-2011-Learning General Connotation of Words using Graph-based Algorithms
9 0.50712597 8 emnlp-2011-A Model of Discourse Predictions in Human Sentence Processing
10 0.50305408 56 emnlp-2011-Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases
11 0.50276297 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances
12 0.5019207 108 emnlp-2011-Quasi-Synchronous Phrase Dependency Grammars for Machine Translation
13 0.50067097 53 emnlp-2011-Experimental Support for a Categorical Compositional Distributional Model of Meaning
14 0.49937919 119 emnlp-2011-Semantic Topic Models: Combining Word Distributional Statistics and Dictionary Definitions
15 0.4986527 98 emnlp-2011-Named Entity Recognition in Tweets: An Experimental Study
16 0.49587926 1 emnlp-2011-A Bayesian Mixture Model for PoS Induction Using Multiple Features
17 0.49181238 9 emnlp-2011-A Non-negative Matrix Factorization Based Approach for Active Dual Supervision from Document and Word Labels
18 0.49106002 132 emnlp-2011-Syntax-Based Grammaticality Improvement using CCG and Guided Search
19 0.490742 123 emnlp-2011-Soft Dependency Constraints for Reordering in Hierarchical Phrase-Based Translation
20 0.49038559 78 emnlp-2011-Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus