emnlp emnlp2010 emnlp2010-28 knowledge-graph by maker-knowledge-mining

28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data


Source: pdf

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. [sent-3, score-0.527]

2 In particular, we tackle relation extraction and entity identification jointly. [sent-4, score-0.575]

3 We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). [sent-5, score-0.795]

4 For inference we run an efficient Gibbs sampler that leads to linear time joint inference. [sent-6, score-0.028]

5 We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. [sent-7, score-0.031]

6 For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. [sent-8, score-0.163]

7 For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline. [sent-9, score-0.163]

8 1 Introduction Relation Extraction is the task of predicting semantic relations over entities expressed in unstructured or semi-structured text. [sent-10, score-0.18]

9 This includes, for example, the extraction of employer-employee relations mentioned in newswire, or protein-protein interactions expressed in biomedical papers. [sent-11, score-0.203]

10 It also includes the prediction of entity types such as country, citytown or person, if we consider entity types as unary relations. [sent-12, score-0.454]

11 A particularly attractive approach to relation extraction is based on distant supervision. [sent-13, score-0.581]

12 In place of annotated text, only an existing knowledge base (KB) is needed to train a relation extractor (Mintz et al. [sent-15, score-0.47]

13 The facts in the KB are heuristically aligned to an unlabelled training corpus, and the resulting alignment is the basis for learning the extractor. [sent-18, score-0.356]
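Below is a minimal, illustrative sketch of this kind of heuristic alignment, assuming the KB is given as (relation, entity1, entity2) triples and the corpus as sentences with pre-identified entity mentions; the function and field names are hypothetical and not the authors' implementation.

```python
# A minimal sketch of the distant-supervision alignment heuristic described
# above. The KB is assumed to be a list of (relation, entity1, entity2) triples
# and each sentence a dict with pre-extracted entity identifiers. All names
# here are illustrative placeholders, not the paper's actual code.
from collections import defaultdict

def align_kb_to_corpus(kb_facts, sentences):
    """Label every sentence containing both arguments of a KB fact as a
    (noisy) training example for that fact's relation."""
    fact_index = {(e1, e2): rel for rel, e1, e2 in kb_facts}
    training_examples = defaultdict(list)
    for sent in sentences:                       # sent = {"tokens": [...], "entities": [...]}
        ents = sent["entities"]
        for i, e1 in enumerate(ents):
            for e2 in ents[i + 1:]:
                rel = fact_index.get((e1, e2)) or fact_index.get((e2, e1))
                if rel is not None:
                    # Distant-supervision assumption: this sentence is taken
                    # to express rel(e1, e2), which may be wrong.
                    training_examples[rel].append((sent, e1, e2))
    return training_examples
```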

14 Naturally, the predictions of a distantly supervised relation extractor will be less accurate than those of a supervised one. [sent-19, score-0.366]

15 While facts of existing knowledge bases are inexpensive to come by, the heuristic alignment to text will often lead to noisy patterns in learning. [sent-20, score-0.395]

16 When applied to unseen text, these patterns will produce noisy facts. [sent-21, score-0.032]

17 Indeed, we find that extraction precision still leaves much room for improvement. [sent-22, score-0.172]

18 This room is not as large as in previous work (Mintz et al. [sent-23, score-0.041]

19 However, when we use the knowledge base Freebase (Bollacker et al. [sent-25, score-0.067]

20 For example, the precision of the top-ranked 50 nationality relation instances is only 28%. [sent-27, score-0.703]

21 On inspection, it turns out that many of the errors can be easily identified: they amount to violations of basic compatibility constraints between facts. [sent-28, score-0.08]

22 In particular, we observe unsatisfied selectional preferences of relations towards particular entity types as types of their arguments. [sent-29, score-0.464]

23 An example is the fact that the first argument of nationality is always a person while the second is a country. [sent-30, score-0.449]
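As a rough illustration of such a compatibility check, the sketch below tests whether a predicted relation instance satisfies the selectional preferences of its relation; the type table and function names are assumptions made up for the example.

```python
# An illustrative selectional-preference check: a relation instance is
# compatible only if its predicted argument types match the expected types.
# The EXPECTED_ARG_TYPES table is a hypothetical placeholder.
EXPECTED_ARG_TYPES = {
    "nationality": ("person", "country"),
    "founded": ("person", "company"),
}

def satisfies_selectional_preferences(relation, arg_types):
    """Return True if the predicted argument types match the relation's
    expected types; relations without an entry are left unconstrained."""
    expected = EXPECTED_ARG_TYPES.get(relation)
    if expected is None:
        return True
    return tuple(arg_types) == expected

# e.g. satisfies_selectional_preferences("nationality", ["person", "country"])  -> True
#      satisfies_selectional_preferences("nationality", ["company", "country"]) -> False
```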

24 A simple way to address this is a pipeline: first predict entity types, and then condition on these when predicting relations. [sent-31, score-0.163]

25 However, this neglects the fact that relations could also be used to help entity type prediction. [sent-32, score-0.262]

26 While there is some existing work on enforcing such constraints in a joint fashion (Roth and Yih, 2007; Kate and Mooney, 2010; Riedel et al. [sent-35, score-0.065]

27 The difference is the number of facts they take into account at the same time. [sent-37, score-0.208]

28 They focus on single sentence extractions, and only consider very few interacting facts. [sent-38, score-0.035]

29 Moreover, the fewer facts they consider at the same time, the lower the chance that some of these will be incompatible, and that modelling compatibility will make a difference. [sent-41, score-0.305]

30 In this work we present a novel approach that performs relation extraction across documents, enforces selectional preferences, and needs no labelled data. [sent-42, score-0.603]

31 It is based on an undirected graphical model in which variables correspond to facts, and factors between them measure compatibility. [sent-43, score-0.028]
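The following is only a generic sketch of what Gibbs-style resampling over per-tuple relation variables could look like, combining a local extraction factor with compatibility factors against the current entity-type assignments; the scoring functions and sampler details are assumptions, not the paper's actual factor graph.

```python
# Generic sketch of resampling one relation variable in a factor graph where
# factors measure compatibility between facts. The local_score and
# compatibility_score callables are hypothetical stand-ins for the model's
# factors; this is an illustration of the idea, not the authors' sampler.
import math
import random

def resample_relation(tuple_id, candidate_relations, current_types,
                      local_score, compatibility_score):
    """Draw a new relation label for one candidate tuple from its conditional
    distribution given the current entity-type assignments."""
    weights = []
    for rel in candidate_relations:
        score = local_score(tuple_id, rel)                            # factor over the mentions
        score += compatibility_score(rel, current_types[tuple_id])    # type-compatibility factors
        weights.append(math.exp(score))
    total = sum(weights)
    r = random.uniform(0.0, total)
    for rel, w in zip(candidate_relations, weights):
        r -= w
        if r <= 0.0:
            return rel
    return candidate_relations[-1]
```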

32 For example, 200,000 documents take less than three hours for training and testing. [sent-47, score-0.057]

33 (2009), use Freebase as the source of distant supervision, and employ Wikipedia as the source of unlabelled text—we will call this an in-domain setting. [sent-50, score-0.255]

34 This scenario is somewhat artificial in that Freebase itself is partially derived from Wikipedia, and in practice we cannot expect text and training knowledge base to be so close. [sent-51, score-0.067]

35 When we compare to an isolated baseline that makes no use of entity types, our joint model improves average precision by 4%. [sent-54, score-0.326]

36 In the out-of-domain setting, our collective model substantially outperforms both other approaches. [sent-56, score-0.051]

37 Compared to the isolated baseline, we achieve a 15% increase in average precision. (Footnote 2: The pyramid algorithm of Kate and Mooney (2010) may scale well, but it is not clear how to apply their scheme to cross-document extraction.) [sent-57, score-0.143]

38 With respect to the pipeline approach, the increase is 13%. [sent-59, score-0.069]

39 In the following we will first give some background information on relation extraction with distant supervision. [sent-60, score-0.611]

40 Then we will present our graphical model as well as the inference and learning techniques we apply. [sent-61, score-0.028]

41 We will also give a brief introduction to relation extraction, in particular in the context of distant supervision. [sent-64, score-0.477]

42 Example entities would be the company founder BILL GATES, the company MICROSOFT, and the country USA. [sent-67, score-0.344]

43 It denotes the membership of the tuple in the relation. [sent-74, score-0.113]

44 For example, founded (BILL GATES, MICROSOFT) is a relation instance denoting that BILL GATES and MICROSOFT are related in the founded relation. [sent-76, score-0.446]

45 In the following we will always consider some set of candidate tuples C that may or may not be related. [sent-77, score-0.137]

46 We define Cn ⊂ C to be the set of all n-ary tuples in C. [sent-78, score-0.066]

47 Note that while our definition considers general n-ary relations, in practice we will restrict ourselves to unary and binary relations C1 and C2. [sent-79, score-0.153]
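A minimal sketch of how the unary and binary candidate sets C1 and C2 could be represented follows; the entity names and the NONE placeholder label are illustrative assumptions, not the paper's data structures.

```python
# Illustrative representation of the candidate sets: C1 holds unary candidates
# (entities whose type is to be predicted), C2 holds binary candidates (entity
# pairs that may or may not be related). Entity names are made up.
from itertools import combinations

entities = {"BILL GATES", "MICROSOFT", "USA"}

# C1: unary candidate tuples.
C1 = {(e,) for e in entities}

# C2: binary candidate tuples.
C2 = {tuple(pair) for pair in combinations(sorted(entities), 2)}

# Under the simplifying "at most one relation" assumption, each candidate
# tuple gets a single label: a relation name or a special NONE label.
labels = {c: "NONE" for c in C1 | C2}
```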

48 , 2003; Culotta and Sorensen, 2004) we make one more simplifying assumption: every candidate tuple can be a member of at most one relation. [sent-82, score-0.184]

49 2 Entity Types An entity can be of one or several entity types. [sent-84, score-0.326]

50 For example, BILL GATES is a person, and a company founder. [sent-85, score-0.159]

51 Entity types correspond to the special case of relations with arity one, and will be treated as such in the following. [sent-86, score-0.171]

52 3 Other commonly used terms are relational facts, ground facts, ground atoms, and assertions. [sent-87, score-0.092]

53 First, they can be important for downstream applications: if consumers of our extracted facts know the type of entities, they can find them more easily, visualize them more adequately, and perform operations specific to these types (write emails to persons, book a hotel in a city, etc. [sent-89, score-0.346]

54 Second, they are useful for extracting binary relations due to selectional preferences—see section 2. [sent-91, score-0.175]

55 3 Mentions In natural language text, spans of tokens are used to refer to entities. [sent-94, score-0.032]

56 Here “Evo Morales” is an entity mention of president EVO MORALES, and “Bolivia” a mention of the country BOLIVIA he is the president of. [sent-99, score-0.659]

57 People often express relations between entities in natural language texts by mentioning the participating entities in specific syntactic and lexical patterns. [sent-100, score-0.261]

58 We will refer to any tuple of mentions of entities (e1, . [sent-101, score-0.328]

59 If such a candidate expresses the relation R, then it is a relation mention of the relation instance R (e1, . [sent-105, score-1.141]

60 Here the pair of entity mentions (“Evo Morales”, “Bolivia”) is a candidate mention tuple. [sent-110, score-0.543]

61 In fact, in this case the candidate is indeed a relation mention of the relation instance nationality(EVO MORALES, BOLIVIA). [sent-111, score-1.229]
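As a small illustration, the sketch below enumerates candidate relation mention tuples as pairs of entity mentions co-occurring in one sentence; the data layout is an assumption made for the example.

```python
# Illustrative enumeration of candidate relation mention tuples: every pair of
# entity mentions in the same sentence is a candidate that may or may not
# express a relation between the mentioned entities.
def candidate_mention_pairs(sentence_mentions):
    """sentence_mentions: list of (mention_text, entity_id) found in one sentence."""
    pairs = []
    for i, (m1, e1) in enumerate(sentence_mentions):
        for m2, e2 in sentence_mentions[i + 1:]:
            if e1 != e2:
                pairs.append(((m1, e1), (m2, e2)))
    return pairs

# For the example above:
# candidate_mention_pairs([("Evo Morales", "EVO MORALES"), ("Bolivia", "BOLIVIA")])
# yields one candidate pair, which is in fact a mention of nationality(EVO MORALES, BOLIVIA).
```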

62 4 Relation Extraction We define the task of relation extraction as follows. [sent-113, score-0.412]

63 We are given a corpus of documents and a set of target relations. [sent-114, score-0.057]

64 Then we are asked to predict all relation instances I such that for each R(c) ∈ I there exists at least one relation mention of R(c) in the given corpus. [sent-115, score-0.343]

65 The above definition covers a range of existing approaches by varying over what we define as target corpus. [sent-116, score-0.037]

66 On one end, we have extractors that process text on a per sentence basis (Zelenko et al. [sent-117, score-0.031]

67 On the other end, we have methods that take relation mentions from several documents and use these as input features (Mintz et al. [sent-119, score-0.528]

68 There is a compelling reason for performing relation extraction within a larger scope that considers mentions across documents: redundancy. [sent-121, score-0.606]

69 Often facts are mentioned in several sentences and documents. [sent-122, score-0.208]

70 Some of these mentions may be difficult to parse, or they use unseen patterns. [sent-123, score-0.195]

71 But the more mentions we consider, the higher the probability that one does parse, and fits a pattern we have seen in the training data. [sent-124, score-0.163]
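The sketch below illustrates this redundancy argument: patterns are pooled over all mentions of a candidate tuple, so a single mention that matches a known pattern can carry the prediction. The pattern extractor and the pattern-to-relation table are hypothetical placeholders, not the paper's features or classifier.

```python
# Illustrative cross-document aggregation: collect the patterns observed for
# all mentions of one candidate tuple, then let the known patterns vote.
from collections import Counter

def aggregate_patterns(mentions, extract_pattern):
    """Collect a bag of lexical/syntactic patterns over all mentions of one tuple."""
    return Counter(extract_pattern(m) for m in mentions)

def predict_relation(pattern_counts, pattern_to_relation):
    """Pick the relation supported by the most mentions, if any pattern is known."""
    votes = Counter()
    for pattern, count in pattern_counts.items():
        rel = pattern_to_relation.get(pattern)
        if rel is not None:
            votes[rel] += count
    return votes.most_common(1)[0][0] if votes else "NONE"
```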

72 Note that for relation extraction that considers more than a single mention we have to solve the coreference problem in order to determine which mentions refer to the same entity. [sent-125, score-0.76]

73 In the following we will assume that coreference clusters are provided by a preprocessing step. [sent-126, score-0.039]

74 5 Distant Supervision In relation extraction we often encounter a lack of explicitly annotated text, but an abundance of structured data sources such as company databases or collaborative knowledge bases like Freebase. [sent-128, score-0.568]

75 In order to exploit this, many approaches use simple but effective heuristics to align existing facts with unlabelled text. [sent-129, score-0.331]

76 This labelled text can then be used as training material for a supervised learner. [sent-130, score-0.115]

77 One heuristic is to assume that each candidate mention tuple of a training fact is indeed expressing the corresponding relation (Bunescu and Mooney, 2007). [sent-131, score-0.674]

78 (2009) refer to this as the distant supervision assumption. [sent-133, score-0.279]

79 Let us again consider the nationality relation between EVO MORALES and BOLIVIA. [sent-135, score-0.676]

80 In a 2007 article of the New York Times we find this relation mention candidate: (2) . [sent-136, score-0.454]

81 This sentence does not directly express that EVO MORALES is a citizen of BOLIVIA, and hence violates the distant supervision assumption. [sent-142, score-0.308]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('evo', 0.324), ('morales', 0.324), ('relation', 0.308), ('mintz', 0.283), ('bolivia', 0.243), ('facts', 0.208), ('distant', 0.169), ('entity', 0.163), ('mentions', 0.163), ('freebase', 0.162), ('mention', 0.146), ('ional', 0.138), ('nat', 0.138), ('gates', 0.115), ('labelled', 0.115), ('supervision', 0.11), ('isolated', 0.108), ('extraction', 0.104), ('kb', 0.104), ('relations', 0.099), ('ity', 0.092), ('mooney', 0.091), ('bill', 0.088), ('unlabelled', 0.086), ('tuple', 0.084), ('pers', 0.081), ('riedel', 0.081), ('entities', 0.081), ('company', 0.078), ('selectional', 0.076), ('country', 0.072), ('candidate', 0.071), ('founded', 0.069), ('pipeline', 0.069), ('bunescu', 0.069), ('president', 0.066), ('tuples', 0.066), ('sorensen', 0.062), ('kate', 0.062), ('wikipedia', 0.061), ('zelenko', 0.058), ('extractor', 0.058), ('documents', 0.057), ('microsoft', 0.057), ('unary', 0.054), ('preferences', 0.052), ('compatibility', 0.051), ('bases', 0.051), ('collective', 0.051), ('culotta', 0.048), ('modelling', 0.046), ('ground', 0.046), ('cn', 0.041), ('room', 0.041), ('york', 0.041), ('base', 0.04), ('coreference', 0.039), ('heuristic', 0.037), ('existing', 0.037), ('types', 0.037), ('pipelined', 0.035), ('irn', 0.035), ('amherst', 0.035), ('arity', 0.035), ('cle', 0.035), ('consumers', 0.035), ('crossdocument', 0.035), ('fmoern', 0.035), ('founder', 0.035), ('inexpensive', 0.035), ('interacting', 0.035), ('rof', 0.035), ('samplerank', 0.035), ('snippet', 0.035), ('tche', 0.035), ('visualize', 0.035), ('yao', 0.035), ('spans', 0.032), ('unseen', 0.032), ('hotel', 0.031), ('atoms', 0.031), ('compelling', 0.031), ('heuristically', 0.031), ('indomain', 0.031), ('lem', 0.031), ('persons', 0.031), ('basis', 0.031), ('background', 0.03), ('violations', 0.029), ('violates', 0.029), ('membership', 0.029), ('incompatible', 0.029), ('wick', 0.029), ('extractions', 0.029), ('member', 0.029), ('joint', 0.028), ('graphical', 0.028), ('indeed', 0.028), ('knowledge', 0.027), ('precision', 0.027)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000005 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

2 0.16477427 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

Author: Karthik Raghunathan ; Heeyoung Lee ; Sudarshan Rangarajan ; Nate Chambers ; Mihai Surdeanu ; Dan Jurafsky ; Christopher Manning

Abstract: Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier’s entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sievebased approaches could be applied to other NLP tasks.

3 0.12442375 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text

Author: Stefan Schoenmackers ; Jesse Davis ; Oren Etzioni ; Daniel Weld

Abstract: input. Even the entire Web corpus does not explicitly answer all questions, yet inference can uncover many implicit answers. But where do inference rules come from? This paper investigates the problem of learning inference rules from Web text in an unsupervised, domain-independent manner. The SHERLOCK system, described herein, is a first-order learner that acquires over 30,000 Horn clauses from Web text. SHERLOCK embodies several innovations, including a novel rule scoring function based on Statistical Relevance (Salmon et al., 1971) which is effective on ambiguous, noisy and incomplete Web extractions. Our experiments show that inference over the learned rules discovers three times as many facts (at precision 0.8) as the TEXTRUNNER system which merely extracts facts explicitly stated in Web text.

4 0.11494692 20 emnlp-2010-Automatic Detection and Classification of Social Events

Author: Apoorv Agarwal ; Owen Rambow

Abstract: In this paper we introduce the new task of social event extraction from text. We distinguish two broad types of social events depending on whether only one or both parties are aware of the social contact. We annotate part of Automatic Content Extraction (ACE) data, and perform experiments using Support Vector Machines with Kernel methods. We use a combination of structures derived from phrase structure trees and dependency trees. A characteristic of our events (which distinguishes them from ACE events) is that the participating entities can be spread far across the parse trees. We use syntactic and semantic insights to devise a new structure derived from dependency trees and show that this plays a role in achieving the best performing system for both social event detection and classification tasks. We also use three data sampling approaches to solve the problem of data skewness. Sampling methods improve the F1-measure for the task of relation detection by over 20% absolute over the baseline.

5 0.11273268 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

Author: Longhua Qian ; Guodong Zhou

Abstract: Seed sampling is critical in semi-supervised learning. This paper proposes a clusteringbased stratified seed sampling approach to semi-supervised learning. First, various clustering algorithms are explored to partition the unlabeled instances into different strata with each stratum represented by a center. Then, diversity-motivated intra-stratum sampling is adopted to choose the center and additional instances from each stratum to form the unlabeled seed set for an oracle to annotate. Finally, the labeled seed set is fed into a bootstrapping procedure as the initial labeled data. We systematically evaluate our stratified bootstrapping approach in the semantic relation classification subtask of the ACE RDC (Relation Detection and Classification) task. In particular, we compare various clustering algorithms on the stratified bootstrapping performance. Experimental results on the ACE RDC 2004 corpus show that our clusteringbased stratified bootstrapping approach achieves the best F1-score of 75.9 on the subtask of semantic relation classification, approaching the one with golden clustering.

6 0.1074362 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

7 0.10359541 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input

8 0.097879976 59 emnlp-2010-Identifying Functional Relations in Web Text

9 0.095038362 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

10 0.09414643 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora

11 0.063707106 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

12 0.050827146 79 emnlp-2010-Mining Name Translations from Entity Graph Mapping

13 0.045969877 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

14 0.040987156 12 emnlp-2010-A Semi-Supervised Method to Learn and Construct Taxonomies Using the Web

15 0.039452344 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

16 0.037821278 51 emnlp-2010-Function-Based Question Classification for General QA

17 0.037441302 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

18 0.036733907 69 emnlp-2010-Joint Training and Decoding Using Virtual Nodes for Cascaded Segmentation and Tagging Tasks

19 0.036265973 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

20 0.035135943 22 emnlp-2010-Automatic Evaluation of Translation Quality for Distant Language Pairs


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.144), (1, 0.097), (2, -0.033), (3, 0.263), (4, 0.011), (5, -0.271), (6, 0.01), (7, 0.125), (8, 0.072), (9, -0.182), (10, -0.017), (11, -0.104), (12, -0.061), (13, -0.117), (14, -0.023), (15, 0.021), (16, 0.071), (17, -0.012), (18, 0.089), (19, 0.079), (20, 0.07), (21, -0.049), (22, 0.101), (23, 0.095), (24, 0.03), (25, -0.111), (26, -0.025), (27, 0.03), (28, -0.09), (29, 0.135), (30, 0.147), (31, 0.053), (32, 0.145), (33, -0.122), (34, 0.002), (35, 0.011), (36, -0.079), (37, 0.147), (38, 0.011), (39, -0.02), (40, -0.098), (41, -0.07), (42, 0.15), (43, -0.014), (44, 0.03), (45, -0.071), (46, 0.009), (47, 0.066), (48, 0.031), (49, 0.016)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97248572 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

2 0.61195552 59 emnlp-2010-Identifying Functional Relations in Web Text

Author: Thomas Lin ; Mausam ; Oren Etzioni

Abstract: Determining whether a textual phrase denotes a functional relation (i.e., a relation that maps each domain element to a unique range element) is useful for numerous NLP tasks such as synonym resolution and contradiction detection. Previous work on this problem has relied on either counting methods or lexico-syntactic patterns. However, determining whether a relation is functional, by analyzing mentions of the relation in a corpus, is challenging due to ambiguity, synonymy, anaphora, and other linguistic phenomena. We present the LEIBNIZ system that overcomes these challenges by exploiting the synergy between the Web corpus and freelyavailable knowledge resources such as Freebase. It first computes multiple typedfunctionality scores, representing functionality of the relation phrase when its arguments are constrained to specific types. It then aggregates these scores to predict the global functionality for the phrase. LEIBNIZ outperforms previous work, increasing area under the precisionrecall curve from 0.61 to 0.88. We utilize LEIBNIZ to generate the first public repository of automatically-identified functional relations.

3 0.5726797 11 emnlp-2010-A Semi-Supervised Approach to Improve Classification of Infrequent Discourse Relations Using Feature Vector Extension

Author: Hugo Hernault ; Danushka Bollegala ; Mitsuru Ishizuka

Abstract: Several recent discourse parsers have employed fully-supervised machine learning approaches. These methods require human annotators to beforehand create an extensive training corpus, which is a time-consuming and costly process. On the other hand, unlabeled data is abundant and cheap to collect. In this paper, we propose a novel semi-supervised method for discourse relation classification based on the analysis of cooccurring features in unlabeled data, which is then taken into account for extending the feature vectors given to a classifier. Our experimental results on the RST Discourse Treebank corpus and Penn Discourse Treebank indicate that the proposed method brings a significant improvement in classification accuracy and macro-average F-score when small training datasets are used. For instance, with training sets of c.a. 1000 labeled instances, the proposed method brings improvements in accuracy and macro-average F-score up to 50% compared to a baseline classifier. We believe that the proposed method is a first step towards detecting low-occurrence relations, which is useful for domains with a lack of annotated data.

4 0.56226933 8 emnlp-2010-A Multi-Pass Sieve for Coreference Resolution

Author: Karthik Raghunathan ; Heeyoung Lee ; Sudarshan Rangarajan ; Nate Chambers ; Mihai Surdeanu ; Dan Jurafsky ; Christopher Manning

Abstract: Most coreference resolution models determine if two mentions are coreferent using a single function over a set of constraints or features. This approach can lead to incorrect decisions as lower precision features often overwhelm the smaller number of high precision ones. To overcome this problem, we propose a simple coreference architecture based on a sieve that applies tiers of deterministic coreference models one at a time from highest to lowest precision. Each tier builds on the previous tier’s entity cluster output. Further, our model propagates global information by sharing attributes (e.g., gender and number) across mentions in the same cluster. This cautious sieve guarantees that stronger features are given precedence over weaker ones and that each decision is made using all of the information available at the time. The framework is highly modular: new coreference modules can be plugged in without any change to the other modules. In spite of its simplicity, our approach outperforms many state-of-the-art supervised and unsupervised models on several standard corpora. This suggests that sievebased approaches could be applied to other NLP tasks.

5 0.55994332 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text

Author: Stefan Schoenmackers ; Jesse Davis ; Oren Etzioni ; Daniel Weld

Abstract: input. Even the entire Web corpus does not explicitly answer all questions, yet inference can uncover many implicit answers. But where do inference rules come from? This paper investigates the problem of learning inference rules from Web text in an unsupervised, domain-independent manner. The SHERLOCK system, described herein, is a first-order learner that acquires over 30,000 Horn clauses from Web text. SHERLOCK embodies several innovations, including a novel rule scoring function based on Statistical Relevance (Salmon et al., 1971) which is effective on ambiguous, noisy and incomplete Web extractions. Our experiments show that inference over the learned rules discovers three times as many facts (at precision 0.8) as the TEXTRUNNER system which merely extracts facts explicitly stated in Web text.

6 0.45459023 27 emnlp-2010-Clustering-Based Stratified Seed Sampling for Semi-Supervised Relation Classification

7 0.37403202 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

8 0.36288589 20 emnlp-2010-Automatic Detection and Classification of Social Events

9 0.35584366 62 emnlp-2010-Improving Mention Detection Robustness to Noisy Input

10 0.31498346 44 emnlp-2010-Enhancing Mention Detection Using Projection via Aligned Corpora

11 0.27188993 21 emnlp-2010-Automatic Discovery of Manner Relations and its Applications

12 0.20565595 107 emnlp-2010-Towards Conversation Entailment: An Empirical Investigation

13 0.19363335 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

14 0.18418667 79 emnlp-2010-Mining Name Translations from Entity Graph Mapping

15 0.14980623 7 emnlp-2010-A Mixture Model with Sharing for Lexical Semantics

16 0.14432782 46 emnlp-2010-Evaluating the Impact of Alternative Dependency Graph Encodings on Solving Event Extraction Tasks

17 0.14032236 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

18 0.13620871 35 emnlp-2010-Discriminative Sample Selection for Statistical Machine Translation

19 0.13611057 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

20 0.1310923 64 emnlp-2010-Incorporating Content Structure into Text Analysis Applications


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.021), (10, 0.037), (12, 0.053), (29, 0.04), (30, 0.021), (52, 0.019), (56, 0.047), (62, 0.02), (66, 0.071), (72, 0.034), (76, 0.016), (77, 0.534)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.81537467 28 emnlp-2010-Collective Cross-Document Relation Extraction Without Labelled Data

Author: Limin Yao ; Sebastian Riedel ; Andrew McCallum

Abstract: We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant supervision to train a factor graph model for relation extraction based on an existing knowledge base (Freebase, derived in parts from Wikipedia). For inference we run an efficient Gibbs sampler that leads to linear time joint inference. We evaluate our approach both for an in-domain (Wikipedia) and a more realistic out-of-domain (New York Times Corpus) setting. For the in-domain setting, our joint model leads to 4% higher precision than an isolated local approach, but has no advantage over a pipeline. For the out-of-domain data, we benefit strongly from joint modelling, and observe improvements in precision of 13% over the pipeline, and 15% over the isolated baseline.

2 0.53338242 116 emnlp-2010-Using Universal Linguistic Knowledge to Guide Grammar Induction

Author: Tahira Naseem ; Harr Chen ; Regina Barzilay ; Mark Johnson

Abstract: We present an approach to grammar induction that utilizes syntactic universals to improve dependency parsing across a range of languages. Our method uses a single set of manually-specified language-independent rules that identify syntactic dependencies between pairs of syntactic categories that commonly occur across languages. During inference of the probabilistic model, we use posterior expectation constraints to require that a minimum proportion of the dependencies we infer be instances of these rules. We also automatically refine the syntactic categories given in our coarsely tagged input. Across six languages our approach outperforms state-of-theart unsupervised methods by a significant margin.1

3 0.25052002 6 emnlp-2010-A Latent Variable Model for Geographic Lexical Variation

Author: Jacob Eisenstein ; Brendan O'Connor ; Noah A. Smith ; Eric P. Xing

Abstract: The rapid growth of geotagged social media raises new computational possibilities for investigating geographic linguistic variation. In this paper, we present a multi-level generative model that reasons jointly about latent topics and geographical regions. High-level topics such as “sports” or “entertainment” are rendered differently in each geographic region, revealing topic-specific regional distinctions. Applied to a new dataset of geotagged microblogs, our model recovers coherent topics and their regional variants, while identifying geographic areas of linguistic consistency. The model also enables prediction of an author’s geographic location from raw text, outperforming both text regression and supervised topic models.

4 0.23712891 105 emnlp-2010-Title Generation with Quasi-Synchronous Grammar

Author: Kristian Woodsend ; Yansong Feng ; Mirella Lapata

Abstract: The task of selecting information and rendering it appropriately appears in multiple contexts in summarization. In this paper we present a model that simultaneously optimizes selection and rendering preferences. The model operates over a phrase-based representation of the source document which we obtain by merging PCFG parse trees and dependency graphs. Selection preferences for individual phrases are learned discriminatively, while a quasi-synchronous grammar (Smith and Eisner, 2006) captures rendering preferences such as paraphrases and compressions. Based on an integer linear programming formulation, the model learns to generate summaries that satisfy both types of preferences, while ensuring that length, topic coverage and grammar constraints are met. Experiments on headline and image caption generation show that our method obtains state-of-the-art performance using essentially the same model for both tasks without any major modifications.

5 0.23655461 13 emnlp-2010-A Simple Domain-Independent Probabilistic Approach to Generation

Author: Gabor Angeli ; Percy Liang ; Dan Klein

Abstract: We present a simple, robust generation system which performs content selection and surface realization in a unified, domain-independent framework. In our approach, we break up the end-to-end generation process into a sequence of local decisions, arranged hierarchically and each trained discriminatively. We deployed our system in three different domains—Robocup sportscasting, technical weather forecasts, and common weather forecasts, obtaining results comparable to state-of-the-art domain-specific systems both in terms of BLEU scores and human evaluation.

6 0.23581052 51 emnlp-2010-Function-Based Question Classification for General QA

7 0.2355441 67 emnlp-2010-It Depends on the Translation: Unsupervised Dependency Parsing via Word Alignment

8 0.23192361 20 emnlp-2010-Automatic Detection and Classification of Social Events

9 0.23058045 31 emnlp-2010-Constraints Based Taxonomic Relation Classification

10 0.22546889 72 emnlp-2010-Learning First-Order Horn Clauses from Web Text

11 0.22437099 65 emnlp-2010-Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

12 0.22403039 86 emnlp-2010-Non-Isomorphic Forest Pair Translation

13 0.22395769 81 emnlp-2010-Modeling Perspective Using Adaptor Grammars

14 0.2191799 25 emnlp-2010-Better Punctuation Prediction with Dynamic Conditional Random Fields

15 0.21288109 4 emnlp-2010-A Game-Theoretic Approach to Generating Spatial Descriptions

16 0.21125942 98 emnlp-2010-Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions

17 0.20978424 110 emnlp-2010-Turbo Parsers: Dependency Parsing by Approximate Variational Inference

18 0.20872343 37 emnlp-2010-Domain Adaptation of Rule-Based Annotators for Named-Entity Recognition Tasks

19 0.20526798 42 emnlp-2010-Efficient Incremental Decoding for Tree-to-String Translation

20 0.20284699 48 emnlp-2010-Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails