emnlp emnlp2011 emnlp2011-113 knowledge-graph by maker-knowledge-mining

113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

Source: pdf

Author: Stijn De Saeger ; Kentaro Torisawa ; Masaaki Tsuchida ; Jun'ichi Kazama ; Chikara Hashimoto ; Ichiro Yamada ; Jong Hoon Oh ; Istvan Varga ; Yulan Yan

Abstract: This paper proposes a semi-supervised relation acquisition method that does not rely on extraction patterns (e.g. “X causes Y” for causal relations) but instead learns a combination of indirect evidence for the target relation: semantic word classes and partial patterns. This method can extract long tail instances of semantic relations like causality from rare and complex expressions in a large Japanese Web corpus, including, in extreme cases, patterns that occur only once in the entire corpus. Such patterns are beyond the reach of current pattern based methods. We show that our method performs on par with state-of-the-art pattern based methods, and maintains a reasonable level of accuracy even for instances acquired from infrequent patterns. This ability to acquire long tail instances is crucial for risk management and innovation, where an exhaustive database of high-level semantic relations like causation is of vital importance.

Reference: text

Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Abstract This paper proposes a semi-supervised relation acquisition method that does not rely on extraction patterns (e. [sent-7, score-0.517]

2 “X causes Y” for causal relations) but instead learns a combination of indirect evidence for the target relation: semantic word classes and partial patterns. [sent-9, score-0.719]

3 This method can extract long tail instances of semantic relations like causality from rare and complex expressions in a large Japanese Web corpus, including, in extreme cases, patterns that occur only once in the entire corpus. [sent-10, score-0.823]

4 Such patterns are beyond the reach of current pattern based methods. [sent-11, score-0.566]

5 We show that our method performs on par with state-of-the-art pattern based methods, and maintains a reasonable level of accuracy even for instances acquired from infrequent patterns. [sent-12, score-0.498]

6 This ability to acquire long tail instances is crucial for risk management and innovation, where an exhaustive database of high-level semantic relations like causation is of vital importance. [sent-13, score-0.427]

7 1 Introduction Pattern based relation acquisition methods rely on lexico-syntactic patterns (Hearst, 1992) for extracting relation instances. [sent-14, score-0.613]

8 However, since extraction patterns are learned using statistical methods that require a certain frequency of observations, pattern based methods fail to capture relations from complex expressions in which the pattern connecting the two words is rarely observed. [sent-24, score-1.005]

9 Consider the following sentence: “Curing hypertension alleviates the deterioration speed of the renal function, thereby lowering the risk of causing intracranial bleeding” Humans can infer that this sentence expresses a causal relation between the underlined noun phrases. [sent-25, score-0.63]

10 In the sense that the term pattern implies a recurring event, this expression contains no pattern for detecting the causal relation between hypertension and intracranial bleeding. [sent-29, score-0.793]

11 We propose a semi-supervised relation extraction method that does not rely on direct pattern evidence connecting the two words in a sentence. [sent-39, score-0.491]

12 We argue that the role of binary patterns can be replaced by a combination of two types of indirect evidence: semantic class information about the target relation and partial patterns, which are fragments or subpatterns of binary patterns. [sent-40, score-0.822]

13 The intuition is this: if a sentence like the example sentence above contains some word X belonging to the class of medical conditions and another word Y from the class of traumas, and X matches the partial pattern “. [sent-41, score-0.566]

14 However, our method manages to extract a large number of instances from sentences that contain no pattern that can be learned by pattern induction methods. [sent-47, score-0.611]

15 In Stage 1 we apply a state-of-the-art pattern based relation extractor to a Web corpus to obtain an initial batch of relation instances. [sent-50, score-0.61]

16 Given the output of Stage 1 and access to a Web corpus, the Stage 2 extractor is completely self-sufficient, and the whole method requires no supervision other than a handful of seed patterns to start the first stage extractor. [sent-52, score-0.568]

17 Semantic word classes and partial patterns play a crucial role throughout all steps of the process. [sent-54, score-0.505]

18 We evaluate our method on three relation acquisition tasks (causation, prevention and material relations) using a 600 million Japanese Web page corpus. Figure 1: Proposed method: data flow. [sent-55, score-0.46]

19 In the extreme case, we acquired several thousand word pairs co-occurring only in patterns that appear once in the entire corpus. [sent-62, score-0.484]

20 Word pairs that co-occur only with SO patterns represent the theoretical limiting case of relations that cannot be acquired using existing pattern based methods. [sent-64, score-0.843]

21 In this sense our method can be seen as complementary with pattern based approaches, and merging our method’s output with that of a pattern based method may be beneficial. [sent-65, score-0.512]

22 CDP takes a set of seed patterns as input, and automatically learns new class dependent patterns as paraphrases of the seed patterns. [sent-69, score-0.841]

23 Class dependent patterns are semantic class restricted versions of ordinary lexico-syntactic patterns. [sent-70, score-0.525]

24 Existing methods use class independent patterns such as “X causes Y ” to learn causal relations between X and Y . [sent-71, score-0.706]

25 Class dependent patterns, however, place semantic class restrictions on the noun pairs they may extract, like “Y_accidents causes X_incidents”. [sent-72, score-0.89]

26 These class restrictions make it possible to distinguish between multiple senses of highly ambiguous patterns (so-called “generic” patterns). [sent-74, score-0.437]

27 CDP ranks each noun pair in the corpus according to a score that reflects its likelihood of being a proper instance of the target relation, by calculating the semantic similarity of a set of seed patterns to the class dependent patterns this noun pair co-occurs with. [sent-78, score-1.373]

28 The output of CDP is a list of noun pairs ranked by score, together with the highest scoring class dependent pattern each noun pair co-occurs with. [sent-79, score-0.882]

29 All sentences in our corpus are dependency parsed, and patterns consist of words on the path of dependency relations connecting two nouns. [sent-93, score-0.515]

30 3 Stage 2 Extractor We use CDP as our Stage 1 extractor, and the top N noun pairs along with the class dependent patterns that extract them are given as input to Stage 2, which represents the main contribution of this work. [sent-94, score-0.768]

31 It does so using the semantic class restrictions and partial patterns obtained from the output of CDP. [sent-100, score-0.635]

32 The set of all semantic class pairs obtained from the class dependent patterns that extracted the top N results become the target semantic class pairs from which new candidate instances are generated. [sent-101, score-1.033]

33 For this we decompose the class dependent patterns from the Stage 1 extractor into partial patterns. [sent-104, score-0.693]

34 As mentioned previously, patterns consist of words on the path of dependency relations connecting the two target words in a syntactic tree. [sent-105, score-0.533]

35 To obtain partial patterns we split this dependency path into its two constituent branches, each one leading from the leaf word (i. [sent-106, score-0.468]

36 For example, “X ←subj- causes -obj→ Y” is split into two partial patterns “X ←subj- causes” and “causes -obj→ Y”. [sent-109, score-0.521]

37 These partial patterns capture the predicate structure. [sent-110, score-0.442]

38 We discard partial patterns with syntactic heads other than verbs or adjectives. [sent-112, score-0.442]
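
The split into partial patterns can be illustrated with a minimal sketch. It assumes a binary pattern has already been reduced to two token sequences (the branch from the X slot and the branch to the Y slot) plus their shared syntactic head; the function name, the edge notation and the part-of-speech tag names are hypothetical and not taken from the paper.

```python
def split_into_partial_patterns(left_branch, head, right_branch, head_pos):
    """Split a binary pattern at its syntactic head into two partial patterns.

    left_branch:  tokens from the X slot up to (not including) the head
    right_branch: tokens from the head (not included) down to the Y slot
    head_pos:     coarse part-of-speech tag of the head (tag names are assumed)
    """
    # Partial patterns whose head is not a verb or an adjective are discarded.
    if head_pos not in {"VERB", "ADJ"}:
        return []
    # Each partial pattern keeps the head plus exactly one of the two branches.
    return [tuple(left_branch) + (head,), (head,) + tuple(right_branch)]
```

Calling it with `("X", "<-subj-")`, `"causes"`, `("-obj->", "Y")` and `"VERB"` yields one partial pattern for each branch, both anchored on the verb.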

39 Restricting candidate noun pairs by this combination of semantic word classes and partial pattern matching proved to be quite powerful. [sent-116, score-0.804]
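
A rough sketch of this candidate restriction follows. The sentences retained here do not spell out the exact matching condition, so the requirement that both nouns match a known partial pattern is an assumption, as are the function name and the input layout.

```python
from itertools import permutations


def generate_candidates(sentence_nouns, target_class_pairs, partial_patterns):
    """Return ordered noun pairs from one sentence that pass both restrictions.

    sentence_nouns:     list of (noun, semantic_class, matched_patterns) tuples,
                        where matched_patterns is a set of partial patterns
    target_class_pairs: set of (class_x, class_y) pairs taken from Stage 1
    partial_patterns:   set of partial patterns taken from Stage 1
    """
    candidates = []
    for (x, cls_x, pats_x), (y, cls_y, pats_y) in permutations(sentence_nouns, 2):
        # Restriction 1: the semantic classes must form a target class pair.
        if (cls_x, cls_y) not in target_class_pairs:
            continue
        # Restriction 2 (ASSUMPTION): each noun matches a known partial pattern.
        if pats_x & partial_patterns and pats_y & partial_patterns:
            candidates.append((x, y))
    return candidates
```

Treating the matching rule as a swappable condition keeps the sketch usable if the actual criterion is looser, for example if only one of the two nouns needs to match.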

40 If one member of the target noun pair in the positive samples above matches a partial pattern but the other does not, we randomly replace the latter by another noun found in the same sentence, and generate this new (noun pair, sentence) triple as a negative training sample. [sent-123, score-0.94]
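
The negative-sample construction described above might look like the following sketch. The predicate `matches_partial_pattern` and the representation of a sentence as an opaque value are hypothetical; only the replacement rule itself comes from the text.

```python
import random


def make_negative_sample(pair, sentence, sentence_nouns, matches_partial_pattern, rng=random):
    """Build one negative (noun, noun, sentence) triple from a positive sample.

    matches_partial_pattern(noun, sentence) -> bool is a caller-supplied,
    hypothetical predicate. Returns None when the rule does not apply.
    """
    x, y = pair
    x_ok = matches_partial_pattern(x, sentence)
    y_ok = matches_partial_pattern(y, sentence)
    # The rule applies only when exactly one of the two nouns matches.
    if x_ok == y_ok:
        return None
    # Candidate replacements: other nouns found in the same sentence.
    pool = [n for n in sentence_nouns if n not in (x, y)]
    if not pool:
        return None
    replacement = rng.choice(pool)
    # Keep the matching noun and replace the non-matching one.
    return (x, replacement, sentence) if x_ok else (replacement, y, sentence)
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible.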

41 In the causal relation experiments this approach had about a 5% chance of generating false negatives, i.e. noun pairs contained in the top N results of the Stage 1 extractor. [sent-124, score-0.499]

42 In addition to these base features, we include the semantic classes to which the candidate noun pair belongs, the partial patterns they match in this sentence, and the infix words inbetween the target noun pair. [sent-133, score-1.152]
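
As an illustration of these three additional feature groups, the sketch below builds a set of string-valued features for one (noun pair, sentence) triple. The base features are not listed in the sentences retained here and are left out; the feature naming scheme, the pre-tokenized sentence and the `class_of` lookup are assumptions.

```python
def extra_features(tokens, x, y, class_of, matched_patterns):
    """Collect class, partial-pattern and infix-word features for one triple.

    tokens:           the sentence as a list of tokens (tokenization assumed)
    class_of:         dict mapping each noun to its semantic class
    matched_patterns: partial patterns the pair matches in this sentence
    """
    # Positions of the two nouns, in sentence order.
    i, j = sorted((tokens.index(x), tokens.index(y)))
    features = {f"class_x={class_of[x]}", f"class_y={class_of[y]}"}
    features.update(f"partial={p}" for p in matched_patterns)
    # Infix words are the tokens strictly between the two nouns.
    features.update(f"infix={w}" for w in tokens[i + 1:j])
    return features
```

A set of feature strings like this can be fed to any sparse vectorizer before training a classifier.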

43 Note that this feature set is not intended to be optimal beyond the actual claims of this paper, and we have deliberately avoided exhaustive feature engineering so as not to obscure the contribution of semantic classes and partial patterns to our approach. [sent-134, score-0.548]

44 To obtain the final output of our method we assign each unique noun pair the maximum score from all (noun pair, sentence) triples it occurs in, and discard all other sentences for this noun pair. [sent-141, score-0.492]
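
This aggregation step is a plain max-per-key reduction. A minimal sketch, assuming the classifier output is available as (noun pair, sentence, score) tuples (the function name and data layout are illustrative):

```python
def best_triple_per_pair(scored_triples):
    """Keep, for every noun pair, only its highest-scoring sentence.

    scored_triples: iterable of (noun_pair, sentence, score) tuples.
    Returns a list of (noun_pair, sentence, score), best score first.
    """
    best = {}
    for pair, sentence, score in scored_triples:
        # Replace the stored entry only when a strictly higher score appears.
        if pair not in best or score > best[pair][1]:
            best[pair] = (sentence, score)
    ranked = [(pair, sent, score) for pair, (sent, score) in best.items()]
    ranked.sort(key=lambda t: t[2], reverse=True)
    return ranked
```

Ties keep the first sentence encountered, which is an arbitrary choice made for this sketch.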

45 4 Evaluation We demonstrate the effectiveness of semantic word classes and partial pattern matching for relation extraction by showing that the method proposed in this paper performs at the level of other state-of-the-art relation acquisition methods. [sent-143, score-0.85]

46 We explore several criteria for what constitutes an infrequent pattern including the theoretical limiting case of patterns observed only once in the entire corpus. [sent-149, score-0.673]

47 These instances are impossible to acquire by pattern based methods. [sent-150, score-0.422]

48 The ability to acquire relations from extremely infrequent expressions with decent accuracy demonstrates the utility of combining semantic word classes with partial pattern matching. [sent-151, score-0.828]

49 1 Experimental Setting We evaluate our method on three semantic relation acquisition tasks: causality, prevention and material. [sent-153, score-0.462]

50 In a prevention relation the source concept directly or indirectly acts to avoid the occurrence of the target concept, and in a material relation the source concept is a material or ingredient of the target concept. [sent-155, score-0.716]

51 Furthermore, we estimated pattern frequencies in a subset of the corpus (50 million pages, or 1/12th of the entire corpus) and discarded patterns that co-occur with fewer than 10 unique noun pairs in this smaller corpus. [sent-160, score-0.558]
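
The frequency threshold can be pictured as counting distinct noun pairs per pattern. A small sketch, assuming pattern occurrences in the subset corpus are available as (pattern, noun pair) tuples; the function name and input format are hypothetical.

```python
from collections import defaultdict


def frequent_patterns(occurrences, min_pairs=10):
    """Return the patterns that co-occur with at least min_pairs unique noun pairs.

    occurrences: iterable of (pattern, noun_pair) tuples from the subset corpus.
    """
    pairs_by_pattern = defaultdict(set)
    for pattern, pair in occurrences:
        # A set ignores repeated sightings of the same noun pair.
        pairs_by_pattern[pattern].add(pair)
    # Patterns seen with fewer than min_pairs unique pairs are discarded.
    return {p for p, pairs in pairs_by_pattern.items() if len(pairs) >= min_pairs}
```

With the default of 10, a pattern seen with exactly 10 unique pairs is kept, following the "fewer than 10" wording.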

52 These restrictions do not apply to the proposed method, which can extract noun pairs connected by patterns of arbitrary length, even if found only once in the corpus. [sent-161, score-0.626]

53 If the judges find the sentence contains sufficient evidence that the target relation holds between the candidate nouns, they mark the noun pair correct. [sent-172, score-0.495]

54 Espresso is a popular bootstrapping based method that uses a set of seed instances to induce extraction patterns for the target relation and then acquire new instances in an iterative bootstrapping process. [sent-181, score-0.861]

55 In each iteration Espresso performs pattern induction, pattern ranking and selection using previously acquired instances, and uses the newly acquired patterns to extract new instances. [sent-182, score-1.036]

56 Espresso computes a reliability score for both instances and patterns based on their pointwise mutual information (PMI) with the top-scoring patterns and instances from the previous iteration. [sent-183, score-0.793]

57 Figure 2: Precision of acquired relations (causality). [sent-188, score-0.451]

58 For all methods compared we rank the acquired noun pairs by their score and evaluated 500 random samples from the top 100,000 results. [sent-191, score-0.427]

59 For noun pairs acquired by CDP and Espresso we select the pattern that extracted this noun pair (in the case of Espresso, the pattern with the highest PMI for this noun pair), and randomly select a sentence in which the noun pair co-occurs with that pattern from our corpus. [sent-192, score-1.758]

60 2 in (Pantel and Pennacchiotti, 2006a)), Espresso computes a reliability score for each candidate pattern based on the weighted PMI of the pattern with all instances extracted so far. [sent-201, score-0.655]

61 generic) patterns, so instead of using all instances for computing pattern reliability we only use the m most reliable instances from the previous iteration, which were used to extract the candidate patterns of the current iteration (m = 200, like the original). [sent-204, score-0.808]
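
The restricted reliability computation can be sketched as follows. The normalization by a maximum PMI value and the averaging over instances follow the Espresso formulation as commonly stated, and should be read as an assumption here rather than as this paper's exact implementation; `pmi_fn` is a caller-supplied, hypothetical function.

```python
def pattern_reliability(pattern, instance_reliability, pmi_fn, max_pmi, m=200):
    """Score a candidate pattern using only the m most reliable instances.

    instance_reliability: dict mapping instance -> reliability score
    pmi_fn(instance, pattern): PMI between an instance and the pattern
    max_pmi: normalizing constant (assumed: the maximum PMI observed)
    """
    # Keep the m instances with the highest reliability.
    top = sorted(instance_reliability.items(), key=lambda kv: kv[1], reverse=True)[:m]
    if not top:
        return 0.0
    # Average of normalized PMI values, each weighted by instance reliability.
    total = sum(pmi_fn(inst, pattern) / max_pmi * rel for inst, rel in top)
    return total / len(top)
```

Limiting the sum to the top m instances keeps frequent, generic patterns from accumulating reliability across a large pool of weakly related instances.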

62 Figure 3: Precision of acquired relations (prevention). [sent-209, score-0.451]

63 Figure 4: Precision of acquired relations (material). [sent-215, score-0.451]

64 In the material relations the proposed method slightly outperforms both pattern based methods in the top 10,000 results (92% precision, lenient). [sent-223, score-0.423]

65 Only 12 partial patterns were obtained, which greatly reduced the output of the proposed method. [sent-226, score-0.442]

66 Dealing with Difficult Extractions How does our method handle noun pairs that are difficult to acquire by pattern based methods? [sent-229, score-0.601]

67 w/o CDP” (Proposed without CDP) in Figures 2, 3 and 4 show the number and precision of evaluated samples from the proposed method that do not co-occur in our corpus with any of the patterns that extracted the top N results of the first stage extractor. [sent-231, score-0.569]

68 These graphs show that our method is not simply regenerating CDP’s top results but actually extracts many noun pairs that do not co-occur in patterns that are easily learned. [sent-232, score-0.594]

69 The same conclusion holds for the prevention results (Figure 3), where over 80% of the proposed method’s samples are noun pairs that do not co-occur with easily learnable patterns. [sent-234, score-0.554]

70 Note that in theory it is possible that these noun pairs could not be acquired by pattern based methods due to this threshold: patterns must be able to extract more than 10 different noun pairs in a subset of our corpus, while the proposed method does not have this constraint. [sent-241, score-1.184]

71 So at least in theory, pattern based methods might be able to acquire all noun pairs obtained by our method by lowering this threshold. [sent-242, score-0.636]

72 To see that this is unlikely to be the case, consider Figure 5, which shows the pattern frequency of the patterns induced by CDP and Espresso for the causality experiment. [sent-243, score-0.753]

73 The x-axis represents pattern frequency in terms of the number of unique noun pairs a pattern co-occurs with in our corpus (on a log scale), and the y-axis shows the percentage of samples that was extracted by patterns of a given frequency. [sent-244, score-1.157]

74 Figure 5 shows that for the pattern based methods, the large majority of noun pairs was extracted by patterns that co-occur with several thousand different noun pairs. [sent-245, score-1.006]

75 In Figure 5, the histograms for the pattern based methods CDP and Espresso start around 1000 noun pairs, which is far above this new lower bound. [sent-247, score-0.448]

76 4 In the case of CDP we ignore semantic class restrictions on the patterns when comparing frequencies. [sent-248, score-0.503]

77 Thus, pattern based methods naturally tend to induce patterns that are much more frequent than the range of patterns our method can capture, and it is unlikely that this is a result of implementation details like the pattern frequency threshold. [sent-252, score-1.132]

78 Finally, the theoretical limiting case for pattern based algorithms consists of patterns that only co-occur with a single noun pair in the entire corpus (single occurrence or SO patterns). [sent-257, score-0.85]

79 Pattern based methods learn new patterns that share many noun pairs with a set of reliable patterns in order to extract new relation instances. [sent-258, score-1.024]

80 If a noun pair that co-occurs with a SO pattern also co-occurs with more reliable patterns there is no need to learn the SO pattern. [sent-259, score-0.795]

81 If that same noun pair does not co-occur with any other reliable pattern, the SO pattern is beyond the reach of any pattern induction method. [sent-260, score-0.741]

82 Thus, SO patterns are effectively useless for pattern based methods. [sent-261, score-0.566]
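
To make the SO limiting case concrete, the sketch below finds noun pairs whose every co-occurring pattern is a single occurrence pattern. It assumes pattern occurrences are given as (pattern, noun pair) tuples; the function name and data layout are illustrative only.

```python
from collections import defaultdict


def pairs_with_only_so_patterns(occurrences):
    """Return noun pairs that co-occur exclusively with SO patterns.

    occurrences: iterable of (pattern, noun_pair) tuples for the whole corpus.
    An SO pattern is one that co-occurs with exactly one noun pair.
    """
    pairs_by_pattern = defaultdict(set)
    patterns_by_pair = defaultdict(set)
    for pattern, pair in occurrences:
        pairs_by_pattern[pattern].add(pair)
        patterns_by_pair[pair].add(pattern)
    so_patterns = {p for p, pairs in pairs_by_pattern.items() if len(pairs) == 1}
    # A pair qualifies only if none of its patterns is shared with another pair.
    return {pair for pair, pats in patterns_by_pair.items() if pats <= so_patterns}
```

A pair that also appears with any shared pattern is excluded, which mirrors the argument that such pairs remain reachable by pattern induction.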

83 For the 500 samples evaluated from the causality and prevention relations acquired by our method we found 7 causal noun pairs that co-occur only in SO patterns and 29 such noun pairs for prevention. [sent-262, score-1.619]

84 In total we found 8,716 causal noun pairs and 7,369 prevention noun pairs that co-occur only with SO patterns. [sent-266, score-0.84]

85 Table 2 shows some example relations from our causality and prevention experiments that were extracted from SO patterns. [sent-267, score-0.509]

86 In the remainder of this section we look at how the combination of semantic word classes and partial patterns benefits our method. [sent-270, score-0.571]

87 For each relation we evaluated 1000 random (noun pair, sentence) triples satisfying the two conditions from Section 3: matching semantic class pairs and partial patterns. [sent-271, score-0.54]

88 Figure 7: Contribution of feature sets (prevention). Axis labels: precision (%); (noun pair, sentence) triples ranked by score. Curves: Base features only; All minus semantic classes; All minus infix words; All minus partial patterns; All features. [sent-276, score-0.965]

89 Evaluating 100 samples for causality and prevention, we found the precision of the semantic class baseline was 16% for causality and 5% for prevention. [sent-279, score-0.668]

90 The pattern fragment baseline gave 9% for causality and 22% for prevention. [sent-280, score-0.443]

91 This is considerably lower than the precision of random samples that satisfy both the semantic class and partial pattern conditions, showing that the combination of semantic classes and partial patterns is more effective than either one individually. [sent-281, score-1.253]

92 The “Base features” graph shows the per... [Figure 8: Contribution of feature sets (material). Axis labels: precision (%); (noun pair, sentence) triples ranked by score. Curves: Base features only; All minus semantic classes; All minus infix words; All minus partial patterns; All features.] [sent-284, score-0.965]

93 In other words, the main contribution of semantic word classes and partial patterns to our method’s performance lies not in the final classification step but seems to occur at earlier stages of the process, in the candidate and training data generation steps. [sent-289, score-0.641]

94 5 Related Work Using lexico-syntactic patterns to extract semantic relations was first explored by Hearst (Hearst, 1992), and has inspired a large body of work on semi-supervised relation acquisition methods (Berland and Charniak, 1999; Agichtein and Gravano, 2000; Etzioni et al. [sent-290, score-0.686]

95 (2006) alleviates pattern sparseness by using infix patterns that are generalized using classes of distributionally similar words. [sent-298, score-0.741]

96 , 2010) are another attempt at relation acquisition that goes beyond pattern matching. [sent-316, score-0.433]

97 6 Conclusion We have proposed a relation acquisition method that is able to acquire semantic relations from infrequent expressions by focusing on the evidence provided by semantic word classes and partial pattern matching instead of direct extraction patterns. [sent-326, score-1.1]

98 We experimentally demonstrated the effectiveness of this approach on three relation acquisition tasks, causality, prevention and material relations. [sent-327, score-0.46]

99 In addition we showed our method could acquire a significant number of relation instances that are found in extremely infrequent expressions, the most extreme case of which are single occurrence patterns, which are beyond the reach of existing pattern based methods. [sent-328, score-0.684]

100 Integrating probabilistic extraction models and data mining to discover relations and patterns in text. [sent-360, score-0.443]

similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('cdp', 0.54), ('patterns', 0.31), ('pattern', 0.256), ('espresso', 0.227), ('prevention', 0.219), ('noun', 0.192), ('causality', 0.187), ('partial', 0.132), ('relation', 0.126), ('causal', 0.125), ('stage', 0.12), ('saeger', 0.105), ('lenient', 0.103), ('relations', 0.103), ('extractor', 0.102), ('acquire', 0.097), ('acquired', 0.092), ('class', 0.089), ('samples', 0.087), ('minus', 0.082), ('infrequent', 0.081), ('causes', 0.079), ('infix', 0.077), ('triples', 0.071), ('instances', 0.069), ('semantic', 0.066), ('pennacchiotti', 0.064), ('material', 0.064), ('classes', 0.063), ('generator', 0.061), ('torisawa', 0.061), ('causation', 0.06), ('stijn', 0.06), ('tsuchida', 0.06), ('dependent', 0.06), ('pairs', 0.056), ('kazama', 0.055), ('indirect', 0.055), ('precision', 0.052), ('acquisition', 0.051), ('japanese', 0.051), ('connecting', 0.05), ('pantel', 0.048), ('schoenmackers', 0.047), ('kow', 0.045), ('kuroda', 0.045), ('shinzato', 0.045), ('target', 0.044), ('bootstrapping', 0.04), ('candidate', 0.039), ('kentaro', 0.039), ('masaaki', 0.039), ('restrictions', 0.038), ('web', 0.037), ('pair', 0.037), ('pas', 0.037), ('graphs', 0.036), ('seed', 0.036), ('classifier', 0.036), ('etzioni', 0.035), ('lowering', 0.035), ('alleviates', 0.035), ('reliability', 0.035), ('ichi', 0.034), ('downey', 0.032), ('freebase', 0.032), ('tail', 0.032), ('contribution', 0.031), ('agichtein', 0.03), ('strict', 0.03), ('curing', 0.03), ('decent', 0.03), ('deterioration', 0.03), ('huynh', 0.03), ('ichiro', 0.03), ('intracranial', 0.03), ('irnc', 0.03), ('komachi', 0.03), ('masaki', 0.03), ('murata', 0.03), ('nhk', 0.03), ('renal', 0.03), ('tsubaki', 0.03), ('generic', 0.03), ('extract', 0.03), ('extraction', 0.03), ('evidence', 0.029), ('occurrence', 0.029), ('judges', 0.028), ('japan', 0.028), ('nouns', 0.028), ('svm', 0.028), ('causing', 0.027), ('hearst', 0.027), ('culotta', 0.027), ('logic', 0.027), ('limiting', 0.026), ('dependency', 0.026), ('extreme', 0.026), ('oren', 0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.99999958 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

Author: Stijn De Saeger ; Kentaro Torisawa ; Masaaki Tsuchida ; Jun'ichi Kazama ; Chikara Hashimoto ; Ichiro Yamada ; Jong Hoon Oh ; Istvan Varga ; Yulan Yan

2 0.19418454 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

Author: Marjorie Freedman ; Lance Ramshaw ; Elizabeth Boschee ; Ryan Gabbard ; Gary Kratkiewicz ; Nicolas Ward ; Ralph Weischedel

Abstract: We report on empirical results in extreme extraction. It is extreme in that (1) from receipt of the ontology specifying the target concepts and relations, development is limited to one week and that (2) relatively little training data is assumed. We are able to surpass human recall and achieve an F1 of 0.51 on a question-answering task with less than 50 hours of effort using a hybrid approach that mixes active learning, bootstrapping, and limited (5 hours) manual rule writing. We compare the performance of three systems: extraction with handwritten rules, bootstrapped extraction, and a combination. We show that while the recall of the handwritten rules surpasses that of the learned system, the learned system is able to improve the overall recall and F1.

3 0.15279742 78 emnlp-2011-Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

Author: Su Nam Kim ; Preslav Nakov

Abstract: Responding to the need for semantic lexical resources in natural language processing applications, we examine methods to acquire noun compounds (NCs), e.g., orange juice, together with suitable fine-grained semantic interpretations, e.g., squeezed from, which are directly usable as paraphrases. We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations. In evaluation, we found that having one compound noun fixed yields both a higher number of semantically interpreted NCs and improved accuracy due to stronger semantic restrictions.

4 0.14971466 128 emnlp-2011-Structured Relation Discovery using Generative Models

Author: Limin Yao ; Aria Haghighi ; Sebastian Riedel ; Andrew McCallum

Abstract: We explore unsupervised approaches to relation extraction between two named entities; for instance, the semantic bornIn relation between a person and location entity. Concretely, we propose a series of generative probabilistic models, broadly similar to topic models, each of which generates a corpus of observed triples of entity mention pairs and the surface syntactic dependency path between them. The output of each model is a clustering of observed relation tuples and their associated textual expressions to underlying semantic relation types. Our proposed models exploit entity type constraints within a relation as well as features on the dependency path between entity mentions. We examine the effectiveness of our approach via multiple evaluations and demonstrate 12% error reduction in precision over a state-of-the-art weakly supervised baseline.

5 0.14356372 92 emnlp-2011-Minimally Supervised Event Causality Identification

Author: Quang Do ; Yee Seng Chan ; Dan Roth

Abstract: This paper develops a minimally supervised approach, based on focused distributional similarity methods and discourse connectives, for identifying causality relations between events in context. While it has been shown that distributional similarity can help identify causality, we observe that discourse connectives and the particular discourse relation they evoke in context provide additional information towards determining causality between events. We show that combining discourse relation predictions and distributional similarity methods in a global inference procedure provides additional improvements towards determining event causality.

6 0.11051948 26 emnlp-2011-Class Label Enhancement via Related Instances

7 0.10899357 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

8 0.10282163 114 emnlp-2011-Relation Extraction with Relation Topics

9 0.084163241 70 emnlp-2011-Identifying Relations for Open Information Extraction

10 0.071999989 99 emnlp-2011-Non-parametric Bayesian Segmentation of Japanese Noun Phrases

11 0.066599131 40 emnlp-2011-Discovering Relations between Noun Categories

12 0.060874626 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base

13 0.058037344 15 emnlp-2011-A novel dependency-to-string model for statistical machine translation

14 0.057275571 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

15 0.056238763 31 emnlp-2011-Computation of Infix Probabilities for Probabilistic Context-Free Grammars

16 0.055823803 96 emnlp-2011-Multilayer Sequence Labeling

17 0.05532467 75 emnlp-2011-Joint Models for Chinese POS Tagging and Dependency Parsing

18 0.053101853 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

19 0.049156148 124 emnlp-2011-Splitting Noun Compounds via Monolingual and Bilingual Paraphrasing: A Study on Japanese Katakana Words

20 0.048583131 4 emnlp-2011-A Fast, Accurate, Non-Projective, Semantically-Enriched Parser

similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.194), (1, -0.11), (2, -0.174), (3, 0.047), (4, -0.084), (5, -0.25), (6, 0.097), (7, 0.064), (8, -0.075), (9, 0.022), (10, 0.053), (11, 0.039), (12, -0.016), (13, -0.004), (14, -0.048), (15, -0.016), (16, -0.095), (17, -0.177), (18, -0.085), (19, 0.099), (20, 0.224), (21, -0.143), (22, 0.034), (23, 0.052), (24, 0.104), (25, 0.111), (26, 0.065), (27, 0.071), (28, -0.136), (29, -0.075), (30, -0.149), (31, -0.142), (32, 0.013), (33, 0.039), (34, 0.039), (35, -0.038), (36, -0.09), (37, 0.014), (38, -0.057), (39, -0.066), (40, -0.138), (41, 0.007), (42, -0.0), (43, 0.046), (44, 0.08), (45, 0.032), (46, -0.048), (47, 0.022), (48, 0.006), (49, -0.019)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.96689105 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

Author: Stijn De Saeger ; Kentaro Torisawa ; Masaaki Tsuchida ; Jun'ichi Kazama ; Chikara Hashimoto ; Ichiro Yamada ; Jong Hoon Oh ; Istvan Varga ; Yulan Yan

2 0.83649057 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

Author: Marjorie Freedman ; Lance Ramshaw ; Elizabeth Boschee ; Ryan Gabbard ; Gary Kratkiewicz ; Nicolas Ward ; Ralph Weischedel

3 0.80123389 78 emnlp-2011-Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

Author: Su Nam Kim ; Preslav Nakov

4 0.59730113 70 emnlp-2011-Identifying Relations for Open Information Extraction

Author: Anthony Fader ; Stephen Soderland ; Oren Etzioni

Abstract: Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state-of-the-art Open IE systems is rife with uninformative and incoherent extractions. To overcome these problems, we introduce two simple syntactic and lexical constraints on binary relations expressed by verbs. We implemented the constraints in the REVERB Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TEXTRUNNER and WOEpos. More than 30% of REVERB’s extractions are at precision 0.8 or higher, compared to virtually none for earlier systems. The paper concludes with a detailed analysis of REVERB’s errors, suggesting directions for future work. (Footnote 1: The source code for REVERB is available at http://reverb.cs.washington.edu/) 1 Introduction and Motivation Typically, Information Extraction (IE) systems learn an extractor for each target relation from labeled training examples (Kim and Moldovan, 1993; Riloff, 1996; Soderland, 1999). This approach to IE does not scale to corpora where the number of target relations is very large, or where the target relations cannot be specified in advance. Open IE solves this problem by identifying relation phrases: phrases that denote relations in English sentences (Banko et al., 2007). The automatic identification of relation phrases enables the extraction of arbitrary relations from sentences, obviating the restriction to a pre-specified vocabulary. Open IE systems have achieved a notable measure of success on massive, open-domain corpora drawn from the Web, Wikipedia, and elsewhere (Banko et al., 2007; Wu and Weld, 2010; Zhu et al., 2009).
The output of Open IE systems has been used to support tasks like learning selectional preferences (Ritter et al., 2010), acquiring common sense knowledge (Lin et al., 2010), and recognizing entailment (Schoenmackers et al., 2010; Berant et al., 2011). In addition, Open IE extractions have been mapped onto existing ontologies (Soderland et al., 2010). We have observed that two types of errors are frequent in the output of Open IE systems such as TEXTRUNNER and WOE: incoherent extractions and uninformative extractions. Incoherent extractions are cases where the extracted relation phrase has no meaningful interpretation (see Table 1 for examples). Incoherent extractions arise because the learned extractor makes a sequence of decisions about whether to include each word in the relation phrase, often resulting in incomprehensible predictions. To solve this problem, we introduce a syntactic constraint: every multi-word relation phrase must begin with a verb, end with a preposition, and be a contiguous sequence of words in the sentence. Thus, the identification of a relation phrase is made in one fell swoop instead of on the basis of multiple, word-by-word decisions. Uninformative extractions are extractions that omit critical information. For example, consider the sentence “Faust made a deal with the devil.” Previous Open IE systems return the uninformative (Faust, made, a deal) instead of (Faust, made a deal with, the devil). This type of error is caused by improper handling of relation phrases that are expressed by a combination of a verb with a noun, such as light verb constructions (LVCs).

[Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1535–1545, Edinburgh, Scotland, UK, July 27–31, 2011. © 2011 Association for Computational Linguistics]
An LVC is a multi-word expression composed of a verb and a noun, with the noun carrying the semantic content of the predicate (Grefenstette and Teufel, 1995; Stevenson et al., 2004; Allerton, 2002). Table 2 illustrates the wide range of relations expressed this way, which are not captured by existing open extractors. Our syntactic constraint leads the extractor to include nouns in the relation phrase, solving this problem. Although the syntactic constraint significantly reduces incoherent and uninformative extractions, it allows overly-specific relation phrases such as is offering only modest greenhouse gas reduction targets at. To avoid overly-specific relation phrases, we introduce an intuitive lexical constraint: a binary relation phrase ought to appear with at least a minimal number of distinct argument pairs in a large corpus. In summary, this paper articulates two simple but surprisingly powerful constraints on how binary relationships are expressed via verbs in English sentences, and implements them in the REVERB Open IE system. We release REVERB and the data used in our experiments to the research community. The rest of the paper is organized as follows. Section 2 analyzes previous work. Section 3 defines our constraints precisely. Section 4 describes REVERB, our implementation of the constraints. Section 5 reports on our experimental results. Section 6 concludes with a summary and discussion of future work.

2 Previous Work

Open IE systems like TEXTRUNNER (Banko et al., 2007), WOEpos, and WOEparse (Wu and Weld, 2010) focus on extracting binary relations of the form (arg1, relation phrase, arg2) from text. These systems all use the following three-step method:

1. Label: Sentences are automatically labeled with extractions using heuristics or distant supervision.
2. Learn: A relation phrase extractor is learned using a sequence-labeling graphical model (e.g., CRF).

3. Extract: The system takes a sentence as input, identifies a candidate pair of NP arguments (arg1, arg2) from the sentence, and then uses the learned extractor to label each word between the two arguments as part of the relation phrase or not.

The extractor is applied to the successive sentences in the corpus, and the resulting extractions are collected.

Table 1: Examples of incoherent extractions. Incoherent extractions make up approximately 13% of TEXTRUNNER’s output, 15% of WOEpos’s output, and 30% of WOEparse’s output.

Table 2: Examples of uninformative relations (left) and their completions (right). Uninformative relations occur in approximately 4% of WOEparse’s output, 6% of WOEpos’s output, and 7% of TEXTRUNNER’s output.

This method faces several challenges. First, the training phase requires a large number of labeled training examples (e.g., 200,000 heuristically-labeled sentences for TEXTRUNNER and 300,000 for WOE). Heuristic labeling of examples obviates hand labeling but results in noisy labels and distorts the distribution of examples. Second, the extraction step is posed as a sequence-labeling problem, where each word is assigned its own label. Because each assignment is uncertain, the likelihood that the extracted relation phrase is flawed increases with the length of the sequence. Finally, the extractor chooses an extraction’s arguments heuristically, and cannot backtrack over this choice.
This is problematic when a word that belongs in the relation phrase is chosen as an argument (for example, deal from the “made a deal with” sentence). Because of the feature sets utilized in previous work, the learned extractors ignore both “holistic” aspects of the relation phrase (e.g., is it contiguous?) as well as lexical aspects (e.g., how many instances of this relation are there?). Thus, as we show in Section 5, systems such as TEXTRUNNER are unable to learn the constraints embedded in REVERB. Of course, a learning system, utilizing a different hypothesis space, and an appropriate set of training examples, could potentially learn and refine the constraints in REVERB. This is a topic for future work, which we consider in Section 6. The first Open IE system was TEXTRUNNER (Banko et al., 2007), which used a Naive Bayes model with unlexicalized POS and NP-chunk features, trained using examples heuristically generated from the Penn Treebank. Subsequent work showed that utilizing a linear-chain CRF (Banko and Etzioni, 2008) or Markov Logic Network (Zhu et al., 2009) can lead to improved extraction. The WOE systems introduced by Wu and Weld make use of Wikipedia as a source of training data for their extractors, which leads to further improvements over TEXTRUNNER (Wu and Weld, 2010). Wu and Weld also show that dependency parse features result in a dramatic increase in precision and recall over shallow linguistic features, but at the cost of extraction speed. Other approaches to large-scale IE have included Preemptive IE (Shinyama and Sekine, 2006), On-Demand IE (Sekine, 2006), and weak supervision for IE (Mintz et al., 2009; Hoffmann et al., 2010). Preemptive IE and On-Demand IE avoid relation-specific extractors, but rely on document and entity clustering, which is too costly for Web-scale IE. Weakly supervised methods use an existing ontology to generate training data for learning relation-specific extractors.
While this allows for learning relation-specific extractors at a larger scale than what was previously possible, the extractions are still restricted to a specific ontology. Many systems have used syntactic patterns based on verbs to extract relation phrases, usually relying on a full dependency parse of the input sentence (Lin and Pantel, 2001; Stevenson, 2004; Specia and Motta, 2006; Kathrin Eichler and Neumann, 2008). Our work differs from these approaches by focusing on relation phrase patterns expressed in terms of POS tags and NP chunks, instead of full parse trees. Banko and Etzioni (2008) showed that a small set of POS-tag patterns cover a large fraction of relationships in English, but never incorporated the patterns into an extractor. This paper reports on a substantially improved model of binary relation phrases, which increases the recall of the Banko-Etzioni model (see Section 3.3). Further, while previous work in Open IE has mainly focused on syntactic patterns for relation extraction, we introduce a lexical constraint that boosts precision and recall. Finally, Open IE is closely related to semantic role labeling (SRL) (Punyakanok et al., 2008; Toutanova et al., 2008) in that both tasks extract relations and arguments from sentences. However, SRL systems traditionally rely on syntactic parsers, which makes them susceptible to parser errors and substantially slower than Open IE systems such as REVERB. This difference is particularly important when operating on the Web corpus due to its size and heterogeneity. Finally, SRL requires hand-constructed semantic resources like Propbank and Framenet (Martha and Palmer, 2002; Baker et al., 1998) as input. In contrast, Open IE systems require no relation-specific training data. REVERB, in particular, relies on its explicit lexical and syntactic constraints, which have no correlate in SRL systems. For a more detailed comparison of SRL and Open IE, see (Christensen et al., 2010).
3 Constraints on Relation Phrases

In this section we introduce two constraints on relation phrases: a syntactic constraint and a lexical constraint.

3.1 Syntactic Constraint

The syntactic constraint serves two purposes. First, it eliminates incoherent extractions, and second, it reduces uninformative extractions by capturing relation phrases expressed by a verb-noun combination, including light verb constructions.

V | V P | V W* P
V = verb particle? adv?
W = (noun | adj | adv | pron | det)
P = (prep | particle | inf. marker)

Figure 1: A simple part-of-speech-based regular expression reduces the number of incoherent extractions like was central torpedo and covers relations expressed via light verb constructions like gave a talk at.

The syntactic constraint requires the relation phrase to match the POS tag pattern shown in Figure 1. The pattern limits relation phrases to be either a verb (e.g., invented), a verb followed immediately by a preposition (e.g., located in), or a verb followed by nouns, adjectives, or adverbs ending in a preposition (e.g., has atomic weight of). If there are multiple possible matches in a sentence for a single verb, the longest possible match is chosen. Finally, if the pattern matches multiple adjacent sequences, we merge them into a single relation phrase (e.g., wants to extend). This refinement enables the model to readily handle relation phrases containing multiple verbs. A consequence of this pattern is that the relation phrase must be a contiguous span of words in the sentence. The syntactic constraint eliminates the incoherent relation phrases returned by existing systems. For example, given the sentence

Extendicare agreed to buy Arbor Health Care for about US $432 million in cash and assumed debt.

TEXTRUNNER returns the extraction (Arbor Health Care, for assumed, debt). The phrase for assumed is clearly not a valid relation phrase: it begins with a preposition and splices together two distant words in the sentence.
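As a rough illustration, the pattern in Figure 1 can be read as a regular expression over POS-tag classes. The sketch below is not REVERB's implementation: the one-letter encoding, the mapping from Penn Treebank tags, and the function name `satisfies_syntactic_constraint` are assumptions made for this example, and the outer repetition stands in for the merging of adjacent matches described above.

```python
import re

# Assumed mapping from Penn Treebank tags to one-letter classes.
TAG_CLASS = {
    "VB": "V", "VBD": "V", "VBG": "V", "VBN": "V", "VBP": "V", "VBZ": "V",
    "NN": "N", "NNS": "N", "NNP": "N", "NNPS": "N",   # nouns
    "JJ": "J", "JJR": "J", "JJS": "J",                # adjectives
    "RB": "R", "RBR": "R", "RBS": "R",                # adverbs
    "PRP": "O", "PRP$": "O",                          # pronouns
    "DT": "D",                                        # determiners
    "IN": "P", "RP": "T", "TO": "I",                  # prep, particle, inf. marker
}

VERB = "VT?R?"      # V = verb particle? adv?
WORD = "[NJROD]"    # W = noun | adj | adv | pron | det
PREP = "[PTI]"      # P = prep | particle | inf. marker

# V | V P | V W* P, repeated so that adjacent matches count as one phrase.
RELATION = re.compile(f"(?:{VERB}(?:{WORD}*{PREP})?)+")


def satisfies_syntactic_constraint(tags):
    """Return True if the whole tag sequence matches the pattern."""
    # Tags outside the mapping become "x", which the pattern never accepts.
    encoded = "".join(TAG_CLASS.get(tag, "x") for tag in tags)
    return RELATION.fullmatch(encoded) is not None


print(satisfies_syntactic_constraint(["VBD", "DT", "NN", "IN"]))  # made a deal with
print(satisfies_syntactic_constraint(["IN", "VBN"]))              # for assumed
```

Under this encoding, made a deal with (VBD DT NN IN) is accepted, while for assumed (IN VBN) is rejected because it does not begin with a verb.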
The syntactic constraint prevents this type of error by simply restricting relation phrases to match the pattern in Figure 1. The syntactic constraint reduces uninformative extractions by capturing relation phrases expressed via LVCs. For example, the POS pattern matched against the sentence “Faust made a deal with the Devil,” would result in the relation phrase made a deal with, instead of the uninformative made. Finally, we require the relation phrase to appear between its two arguments in the sentence. This is a common constraint that has been implicitly enforced in other open extractors.

3.2 Lexical Constraint

While the syntactic constraint greatly reduces uninformative extractions, it can sometimes match relation phrases that are so specific that they have only a few possible instances, even in a Web-scale corpus. Consider the sentence:

The Obama administration is offering only modest greenhouse gas reduction targets at the conference.

The POS pattern will match the phrase:

is offering only modest greenhouse gas reduction targets at (1)

Thus, there are phrases that satisfy the syntactic constraint, but are not relational. To overcome this limitation, we introduce a lexical constraint that is used to separate valid relation phrases from overspecified relation phrases, like the example in (1). The constraint is based on the intuition that a valid relation phrase should take many distinct arguments in a large corpus. The phrase in (1) is specific to the argument pair (Obama administration, conference), so it is unlikely to represent a bona fide relation. We describe the implementation details of the lexical constraint in Section 4.

3.3 Limitations

Our constraints represent an idealized model of relation phrases in English. This raises the question: How much recall is lost due to the constraints?
To address this question, we analyzed Wu and Weld’s set of 300 sentences from a set of random Web pages, manually identifying all verb-based relationships between noun phrase pairs. This resulted in a set of 327 relation phrases. For each relation phrase, we checked whether it satisfies our constraints. We found that 85% of the relation phrases do satisfy the constraints. Of the remaining 15%, we identified some of the common cases where the constraints were violated, summarized in Table 3.

Table 3: Approximately 85% of the binary verbal relation phrases in a sample of Web sentences satisfy our constraints.

Many of the example relation phrases shown in Table 3 involve long-range dependencies between words in the sentence. These types of dependencies are not easily representable using a pattern over POS tags. A deeper syntactic analysis of the input sentence would provide a much more general language for modeling relation phrases. For example, one could create a model of relations expressed in terms of dependency parse features that would capture the non-contiguous relation phrases in Table 3. Previous work has shown that dependency paths do indeed boost the recall of relation extraction systems (Wu and Weld, 2010; Mintz et al., 2009). While using dependency path features allows for a more flexible model of relations, it significantly increases processing time, which is problematic for Web-scale extraction. Further, we have found that this increased recall comes at the cost of lower precision on Web text (see Section 5). The results in Table 3 are similar to Banko and Etzioni’s findings that a set of eight POS patterns cover a large fraction of binary verbal relation phrases. However, their analysis was based on a set of sentences known to contain either a company acquisition or birthplace relationship, while our results are on a random sample of Web sentences.
We applied Banko and Etzioni’s verbal patterns to our random sample of 300 Web sentences, and found that they cover approximately 69% of the relation phrases in the corpus. The gap in recall between this and the 85% shown in Table 3 is largely due to LVC relation phrases (made a deal with) and phrases containing multiple verbs (refuses to return to), which their patterns do not cover. In sum, our model is by no means complete. However, we have empirically shown that the majority of binary verbal relation phrases in a sample of Web sentences are captured by our model. By focusing on this subset of language, our model can be used to perform Open IE at significantly higher precision than before.

4 REVERB

This section introduces REVERB, a novel open extractor based on the constraints defined in the previous section. REVERB first identifies relation phrases that satisfy the syntactic and lexical constraints, and then finds a pair of NP arguments for each identified relation phrase. The resulting extractions are then assigned a confidence score using a logistic regression classifier. This algorithm differs in three important ways from previous methods (Section 2). First, the relation phrase is identified “holistically” rather than word-by-word. Second, potential phrases are filtered based on statistics over a large corpus (the implementation of our lexical constraint). Finally, REVERB is “relation first” rather than “arguments first”, which enables it to avoid a common error made by previous methods: confusing a noun in the relation phrase for an argument, e.g. the noun deal in made a deal with.

4.1 Extraction Algorithm

REVERB takes as input a POS-tagged and NP-chunked sentence and returns a set of (x, r, y) extraction triples. [Footnote 2: REVERB uses OpenNLP for POS tagging and NP chunking: http://opennlp.sourceforge.net/] Given an input sentence s, REVERB uses the following extraction algorithm:

1. Relation Extraction: For each verb v in s, find the longest sequence of words rv such that (1) rv starts at v, (2) rv satisfies the syntactic constraint, and (3) rv satisfies the lexical constraint. If any pair of matches are adjacent or overlap in s, merge them into a single match.

2. Argument Extraction: For each relation phrase r identified in Step 1, find the nearest noun phrase x to the left of r in s such that x is not a relative pronoun, WHO-adverb, or existential “there”. Find the nearest noun phrase y to the right of r in s. If such an (x, y) pair could be found, return (x, r, y) as an extraction.

We check whether a candidate relation phrase rv satisfies the syntactic constraint by matching it against the regular expression in Figure 1. To determine whether rv satisfies the lexical constraint, we use a large dictionary D of relation phrases that are known to take many distinct arguments. In an offline step, we construct D by finding all matches of the POS pattern in a corpus of 500 million Web sentences. For each matching relation phrase, we heuristically identify its arguments (as in Step 2 above). We set D to be the set of all relation phrases that take at least k distinct argument pairs in the set of extractions. In order to allow for minor variations in relation phrases, we normalize each relation phrase by removing inflection, auxiliary verbs, adjectives, and adverbs. Based on experiments on a held-out set of sentences, we found that a value of k = 20 works well for filtering out overspecified relations. This results in a set of approximately 1.7 million distinct normalized relation phrases, which are stored in memory at extraction time. As an example of the extraction algorithm in action, consider the following input sentence:

Hudson was born in Hampstead, which is a suburb of London.
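A minimal sketch of the two-step algorithm, run on this very sentence, is given below. It is an illustration under stated assumptions, not REVERB's code: the POS tags, the NP chunk boundaries, the one-letter tag classes, and the three-entry `toy_lexicon` standing in for the dictionary D are all written by hand, and phrases are looked up by lowercased surface form rather than in normalized form.

```python
import re

# One-letter POS classes (an assumption of this sketch, based on Penn Treebank tags).
TAG_CLASS = {"IN": "P", "RP": "T", "TO": "I", "DT": "D", "PRP": "O", "PRP$": "O"}
TAG_CLASS.update({t: "V" for t in ("VB", "VBD", "VBG", "VBN", "VBP", "VBZ")})
TAG_CLASS.update({t: "N" for t in ("NN", "NNS", "NNP", "NNPS")})
TAG_CLASS.update({t: "J" for t in ("JJ", "JJR", "JJS")})
TAG_CLASS.update({t: "R" for t in ("RB", "RBR", "RBS")})

# A single match of the syntactic pattern: V | V P | V W* P.
UNIT = re.compile(r"VT?R?(?:[NJROD]*[PTI])?")
# Left arguments to skip: relative pronouns, wh-adverbs, existential "there".
SKIPPED_LEFT_TAGS = {"WDT", "WP", "WP$", "WRB", "EX"}


def extract(words, tags, np_chunks, lexicon):
    """Return (x, r, y) triples; np_chunks holds half-open (start, end) spans."""
    encoded = "".join(TAG_CLASS.get(tag, "x") for tag in tags)

    # Step 1: for each verb, take the longest span that passes both constraints.
    spans = []
    for start, letter in enumerate(encoded):
        if letter != "V":
            continue
        for end in range(len(words), start, -1):
            phrase = " ".join(words[start:end]).lower()
            if UNIT.fullmatch(encoded[start:end]) and phrase in lexicon:
                spans.append((start, end))
                break

    # Merge spans that are adjacent or overlap.
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(end, merged[-1][1]))
        else:
            merged.append((start, end))

    # Step 2: nearest acceptable NP on the left, nearest NP on the right.
    triples = []
    for start, end in merged:
        left = [c for c in np_chunks
                if c[1] <= start and tags[c[0]] not in SKIPPED_LEFT_TAGS]
        right = [c for c in np_chunks if c[0] >= end]
        if left and right:
            x = max(left, key=lambda c: c[1])
            y = min(right, key=lambda c: c[0])
            triples.append((" ".join(words[x[0]:x[1]]),
                            " ".join(words[start:end]),
                            " ".join(words[y[0]:y[1]])))
    return triples


words = "Hudson was born in Hampstead , which is a suburb of London .".split()
tags = ["NNP", "VBD", "VBN", "IN", "NNP", ",", "WDT",
        "VBZ", "DT", "NN", "IN", "NNP", "."]
np_chunks = [(0, 1), (4, 5), (6, 7), (8, 10), (11, 12)]  # hand-marked NPs
toy_lexicon = {"was", "born in", "is a suburb of"}       # stand-in for D

triples = extract(words, tags, np_chunks, toy_lexicon)
print(triples)
```

The lexical check is applied to each per-verb match before merging, following the order of the conditions in Step 1; a fuller version would normalize each phrase before looking it up.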
Step 1 of the algorithm identifies three relation phrases that satisfy the syntactic and lexical constraints: was, born in, and is a suburb of. The first two phrases are adjacent in the sentence, so they are merged into the single relation phrase was born in. Step 2 then finds an argument pair for each relation phrase. For was born in, the nearest NPs are (Hudson, Hampstead). For is a suburb of, the extractor skips over the NP which and chooses the argument pair (Hampstead, London). The final output is

e1: (Hudson, was born in, Hampstead)
e2: (Hampstead, is a suburb of, London).

4.2 Confidence Function

The extraction algorithm in the previous section has high recall, but low precision. As with previous open extractors, we want a way to trade recall for precision by tuning a confidence threshold. We use a logistic regression classifier to assign a confidence score to each extraction, which uses the features shown in Table 4. All of these features are efficiently computable and relation independent. We trained the confidence function by manually labeling the extractions from a set of 1,000 sentences from the Web and Wikipedia as correct or incorrect.

Table 4: REVERB uses these features to assign a confidence score to an extraction (x, r, y) from a sentence s using a logistic regression classifier.

Previous open extractors require labeled training data to learn a model of relations, which is then used to extract relation phrases from text. In contrast, REVERB uses a specified model of relations for extraction, and requires labeled data only for assigning confidence scores to its extractions. Learning a confidence function is a much simpler task than learning a full model of relations, using two orders of magnitude fewer training examples than TEXTRUNNER or WOE.

4.3 TEXTRUNNER-R

The model of relation phrases used by REVERB is specified, but could a TEXTRUNNER-like system learn this model from training data?
While it is difficult to answer such a question for all possible permutations of feature sets, training examples, and learning biases, we demonstrate that TEXTRUNNER itself cannot learn REVERB’s model even when re-trained using the output of REVERB as labeled training data. The resulting system, TEXTRUNNER-R, uses the same feature representation as TEXTRUNNER, but different parameters, and a different set of training examples. To generate positive instances, we ran REVERB on the Penn Treebank, which is the same dataset that TEXTRUNNER is trained on. To generate negative instances from a sentence, we took each noun phrase pair in the sentence that does not appear as arguments in a REVERB extraction. This process resulted in a set of 67,562 positive instances, and 356,834 negative instances. We then passed these labeled examples to TEXTRUNNER’s training procedure, which learns a linear-chain CRF using closed-class features like POS tags, capitalization, punctuation, etc. TEXTRUNNER-R uses the argument-first extraction algorithm described in Section 2.

5 Experiments

We compare REVERB to the following systems:

• REVERB¬lex - The REVERB system described in the previous section, but without the lexical constraint. REVERB¬lex uses the same confidence function as REVERB.

• TEXTRUNNER - Banko and Etzioni’s 2008 extractor, which uses a second order linear-chain CRF trained on extractions heuristically generated from the Penn Treebank. TEXTRUNNER uses shallow linguistic features in its CRF, which come from the same POS tagger and NP chunker that REVERB uses.

• TEXTRUNNER-R - Our modification to TEXTRUNNER, which uses the same extraction code, but with a model of relations trained on REVERB extractions.

• WOEpos - Wu and Weld’s modification to TEXTRUNNER, which uses a model of relations learned from extractions heuristically generated from Wikipedia.
• WOEparse - Wu and Weld’s parser-based extractor, which uses a large dictionary of dependency path patterns learned from heuristic extractions generated from Wikipedia.

Each system is given a set of sentences as input, and returns a set of binary extractions as output. We created a test set of 500 sentences sampled from the Web, using Yahoo’s random link service. [Footnote 3: http://random.yahoo.com/bin/ryl] After running each extractor over the input sentences, two human judges independently evaluated each extraction as correct or incorrect. The judges reached agreement on 86% of the extractions, with an agreement score of κ = 0.68. We report results on the subset of the data where the two judges concur. The judges labeled uninformative extractions conservatively. That is, if critical information was dropped from the relation phrase but included in the second argument, it is labeled correct. For example, both the extractions (Ackerman, is a professor of, biology) and (Ackerman, is, a professor of biology) are considered correct.

Figure 2 (bar chart, Area Under PR Curve): REVERB outperforms state-of-the-art open extractors, with an AUC more than twice that of TEXTRUNNER or WOEpos, and 38% higher than WOEparse.

Figure 3 (Comparison of REVERB-Based Systems; x-axis: Recall): The lexical constraint gives REVERB a boost in precision and recall over REVERB¬lex. TEXTRUNNER-R is unable to learn the model used by REVERB, which results in lower precision and recall.

Each system returns confidence scores for its extractions. For a given threshold, we can measure the precision and recall of the output. Precision is the fraction of returned extractions that are correct.
Recall is the fraction of correct extractions in the corpus that are returned. We use the total number of extractions labeled as correct by the judges as our measure of recall for the corpus. In order to avoid double-counting, we treat extractions that differ superficially (e.g., different punctuation or dropping inessential modifiers) as a single extraction. We compute a precision-recall curve by varying the confidence threshold, and then compute the area under the curve (AUC).

Figure 4 (Extractions; x-axis: Recall): REVERB achieves significantly higher precision than state-of-the-art Open IE systems, and comparable recall to WOEparse.

5.1 Results

Figure 2 shows the AUC of each system. REVERB achieves an AUC that is 30% higher than WOEparse and is more than double the AUC of WOEpos or TEXTRUNNER. The lexical constraint provides a significant boost in performance, with REVERB achieving an AUC 23% higher than REVERB¬lex. REVERB proves to be a useful source of training data, with TEXTRUNNER-R having an AUC 71% higher than TEXTRUNNER and performing on par with WOEpos. From the training data, TEXTRUNNER-R was able to learn a model that predicts contiguous relation phrases, but still returned incoherent relation phrases (e.g., starting with a preposition) and overspecified relation phrases. These errors are due to TEXTRUNNER-R overfitting the training data and not having access to the lexical constraint.

Figure 5 (Relations Only; x-axis: Recall): On the subtask of identifying relation phrases, REVERB is able to achieve even higher precision and recall than other systems.

Figure 3 shows the precision-recall curves of the systems introduced in this paper. TEXTRUNNER-R has much lower precision than REVERB and REVERB¬lex at all levels of recall. The lexical constraint gives REVERB a boost in precision over REVERB¬lex, reducing overspecified extractions from 20% of REVERB¬lex’s output to 1% of REVERB’s.
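The evaluation procedure used here, sweeping a confidence threshold to trace a precision-recall curve and then measuring the area under it, can be sketched as follows. The function names, the trapezoidal integration, the treatment of every ranked extraction as its own threshold, and the toy data are assumptions of this example; the text does not say how the area was integrated.

```python
def pr_curve(scored, total_correct):
    """Return (recall, precision) points, one per confidence cutoff.

    scored: non-empty list of (confidence, is_correct) pairs.
    total_correct: number of correct extractions used as the recall denominator.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    points = []
    true_positives = 0
    for returned, (_, is_correct) in enumerate(ranked, start=1):
        true_positives += 1 if is_correct else 0
        points.append((true_positives / total_correct,
                       true_positives / returned))
    return points


def area_under_curve(points):
    """Trapezoidal area under the curve, starting from recall 0."""
    area = 0.0
    # Assumption: the first precision value is extended back to recall 0.
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy judgments: (confidence score, labeled correct by the judges?)
toy = [(0.9, True), (0.8, False), (0.7, True)]
points = pr_curve(toy, total_correct=2)
print(points)
print(area_under_curve(points))
```

Lowering the threshold can only add extractions, so recall never decreases along the curve, while precision may move in either direction.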
The lexical constraint also boosts recall over REVERB¬lex, since REVERB is able to find a correct relation phrase where REVERB¬lex finds an overspecified one. Figure 4 shows the precision-recall curves of REVERB and the external systems. REVERB has much higher precision than the other systems at nearly all levels of recall. In particular, more than 30% of REVERB’s extractions are at precision 0.8 or higher, compared to virtually none for the other systems. WOEparse achieves a slightly higher recall than REVERB (0.62 versus 0.64), but at the cost of lower precision. In order to highlight the role of the relational model of each system, we also evaluate their performance on the subtask of extracting just the relation phrases from the input text. Figure 5 shows the precision-recall curves for each system on the relation phrase-only evaluation. In this case, REVERB has both higher precision and recall than the other systems. REVERB’s biggest improvement came from the elimination of incoherent extractions. Incoherent extractions were a large fraction of the errors made by previous systems, accounting for approximately 13% of TEXTRUNNER’s extractions, 15% of WOEpos’s, and 30% of WOEparse’s. Uninformative extractions had a smaller effect on other systems’ precision, accounting for 4% of WOEparse’s extractions, 5% of WOEpos’s, and 7% of TEXTRUNNER’s, while only appearing in 1% of REVERB’s extractions. REVERB’s reduction in uninformative extractions resulted in a boost in recall, capturing many LVC relation phrases missed by other systems (like those shown in Table 2).

Table 5: The majority of the incorrect extractions returned by REVERB are due to errors in argument extraction.

To test the systems’ speed, we ran each extractor on a set of 100,000 sentences using a Pentium 4 machine with 4GB of RAM.
The processing times were 16 minutes for REVERB, 21 minutes for TEXTRUNNER, 21 minutes for WOEpos, and 11 hours for WOEparse. The times for REVERB, TEXTRUNNER, and WOEpos are all approximately the same, since they all use the same POS-tagging and NP-chunking software. WOEparse processes each sentence with a dependency parser, resulting in much longer processing time.

5.2 REVERB Error Analysis

To better understand the limitations of REVERB, we performed a detailed analysis of its errors in precision (incorrect extractions returned by REVERB) and its errors in recall (correct extractions that REVERB missed). Table 5 summarizes the types of incorrect extractions that REVERB returns. We found that 65% of the incorrect extractions returned by REVERB were cases where a relation phrase was correctly identified, but the argument-finding heuristics failed. The remaining errors were cases where REVERB extracted an incorrect relation phrase. One common mistake that REVERB made was extracting a relation phrase that expresses an n-ary relationship via a ditransitive verb. For example, given the sentence “I gave him 15 photographs,” REVERB extracts (I, gave, him). These errors are due to the fact that REVERB only models binary relations.

Table 6: The majority of extractions that were missed by REVERB were cases where the correct relation phrase was found, but the arguments were not correctly identified.

Table 6 summarizes the correct extractions that were extracted by other systems and were not extracted by REVERB. As with the false positive extractions, the majority of false negatives (52%) were due to the argument-finding heuristics choosing the wrong arguments, or failing to extract all possible arguments (in the case of coordinating conjunctions).
Other sources of failure were due to the lexical constraint either failing to filter out an overspecified relation phrase or filtering out a valid relation phrase. These errors hurt both precision and recall, since each case results in the extractor overlooking a correct relation phrase and choosing another.

5.3 Evaluation At Scale

Section 5.1 shows that REVERB outperforms existing Open IE systems when evaluated on a sample of sentences. Previous work has shown that the frequency of an extraction in a large corpus is useful for assessing the correctness of extractions (Downey et al., 2005). Thus, it is possible a priori that REVERB’s gains over previous systems will diminish when extraction frequency is taken into account. In fact, we found that REVERB’s advantage over TEXTRUNNER when run at scale is qualitatively similar to its advantage on single sentences. We ran both REVERB and TEXTRUNNER on Banko and Etzioni’s corpus of 500 million Web sentences and examined the effect of redundancy on precision. As Downey’s work predicts, precision increased in both systems for extractions found multiple times, compared with extractions found only once. However, REVERB had higher precision than TEXTRUNNER at all frequency thresholds. In fact, REVERB’s frequency 1 extractions had a precision of 0.75, which TEXTRUNNER could not approach even with frequency 10 extractions, which had a precision of 0.34. Thus, REVERB is able to return more correct extractions at a higher precision than TEXTRUNNER, even when redundancy is taken into account.

6 Conclusions and Future Work

The paper’s contributions are as follows:

• We have identified and analyzed the problems of incoherent and uninformative extractions for Open IE systems, and shown their prevalence for systems such as TEXTRUNNER and WOE.
• We articulated general, easy-to-enforce constraints on binary, verb-based relation phrases in English that ameliorate these problems and yield richer and more informative relations (see, for example, Table 2).
• Based on these constraints, we designed, implemented, and evaluated the REVERB extractor, which substantially outperforms previous Open IE systems in both recall and precision.
• We make REVERB and the data used in our experiments available to the research community.4

4 http://reverb.cs.washington.edu

In future work, we plan to explore utilizing our constraints to improve the performance of learned CRF models. Roth et al. have shown how to incorporate constraints into CRF learners (Roth and Yih, 2005). It is natural, then, to consider whether the combination of heuristically labeled training examples, CRF learning, and our constraints will result in superior performance. The error analysis in Section 5.2 also suggests natural directions for future work. For instance, since many of REVERB’s errors are due to incorrect arguments, improved methods for argument extraction are in order.

Acknowledgments

We would like to thank Mausam, Dan Weld, Yoav Artzi, Luke Zettlemoyer, members of the KnowItAll group, and the anonymous reviewers for their helpful comments. This research was supported in part by NSF grant IIS-0803481, ONR grant N00014-08-1-0431, and DARPA contract FA8750-09-C-0179, and carried out at the University of Washington’s Turing Center.

References

David J. Allerton. 2002. Stretched Verb Constructions in English. Routledge Studies in Germanic Linguistics. Routledge (Taylor and Francis), New York.
Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of the 17th International Conference on Computational Linguistics, pages 86–90.
Michele Banko and Oren Etzioni. 2008. The tradeoffs between open and traditional relation extraction. In Proceedings of ACL-08: HLT, pages 28–36, Columbus, Ohio, June. Association for Computational Linguistics.
Michele Banko, Michael J. Cafarella, Stephen Soderland, Matt Broadhead, and Oren Etzioni. 2007. Open information extraction from the web. In Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 2670–2676, January.
Jonathan Berant, Ido Dagan, and Jacob Goldberger. 2011. Global learning of typed entailment rules. In Proceedings of ACL, Portland, OR.
Janara Christensen, Mausam, Stephen Soderland, and Oren Etzioni. 2010. Semantic role labeling for open information extraction. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, FAM-LbR ’10, pages 52–60, Stroudsburg, PA, USA. Association for Computational Linguistics.
Doug Downey, Oren Etzioni, and Stephen Soderland. 2005. A probabilistic model of redundancy in information extraction. In IJCAI, pages 1034–1041.
Kathrin Eichler, Holmer Hemsen, and Günter Neumann. 2008. Unsupervised relation extraction from web documents. In LREC. http://www.lrec-conf.org/proceedings/lrec2008/.
Gregory Grefenstette and Simone Teufel. 1995. Corpus-based method for automatic identification of support verbs for nominalizations. In Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics, pages 98–103, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
Raphael Hoffmann, Congle Zhang, and Daniel S. Weld. 2010. Learning 5000 relational extractors. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL ’10, pages 286–295, Stroudsburg, PA, USA. Association for Computational Linguistics.
J. Kim and D. Moldovan. 1993. Acquisition of semantic patterns for information extraction from corpora. In Procs. of Ninth IEEE Conference on Artificial Intelligence for Applications, pages 171–176.
Paul Kingsbury and Martha Palmer. 2002. From treebank to propbank. In Proceedings of LREC 2002.
Dekang Lin and Patrick Pantel. 2001. DIRT-Discovery of Inference Rules from Text. In Proceedings of ACM Conference on Knowledge Discovery and Data Mining (KDD-01), pages 323–328.
Thomas Lin, Mausam, and Oren Etzioni. 2010. Identifying Functional Relations in Web Text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1266–1276, Cambridge, MA, October. Association for Computational Linguistics.
Mike Mintz, Steven Bills, Rion Snow, and Dan Jurafsky. 2009. Distant supervision for relation extraction without labeled data. In ACL-IJCNLP ’09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 1003–1011, Morristown, NJ, USA. Association for Computational Linguistics.
V. Punyakanok, D. Roth, and W. Yih. 2008. The importance of syntactic parsing and inference in semantic role labeling. Computational Linguistics, 34(2).
E. Riloff. 1996. Automatically constructing extraction patterns from untagged text. In Procs. of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), pages 1044–1049.
Alan Ritter, Mausam, and Oren Etzioni. 2010. A Latent Dirichlet Allocation Method for Selectional Preferences. In ACL.
Dan Roth and Wen-tau Yih. 2005. Integer linear programming inference for conditional random fields. In Proceedings of the 22nd international conference on Machine learning, ICML ’05, pages 736–743, New York, NY, USA. ACM.
Stefan Schoenmackers, Oren Etzioni, Daniel S. Weld, and Jesse Davis. 2010. Learning first-order horn clauses from web text. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, EMNLP ’10, pages 1088–1098, Stroudsburg, PA, USA. Association for Computational Linguistics.
Satoshi Sekine. 2006. On-demand information extraction. In Proceedings of the COLING/ACL on Main conference poster sessions, pages 731–738, Morristown, NJ, USA. Association for Computational Linguistics.
Yusuke Shinyama and Satoshi Sekine. 2006. Preemptive Information Extraction using Unrestricted Relation Discovery. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 304–311, New York City, USA, June. Association for Computational Linguistics.
Stephen Soderland, Brendan Roof, Bo Qin, Shi Xu, Mausam, and Oren Etzioni. 2010. Adapting open information extraction to domain-specific relations. AI Magazine, 31(3):93–102.
S. Soderland. 1999. Learning Information Extraction Rules for Semi-Structured and Free Text. Machine Learning, 34(1-3):233–272.
Lucia Specia and Enrico Motta. 2006. A hybrid approach for extracting semantic relations from texts. In Proceedings of the 2nd Workshop on Ontology Learning and Population, pages 57–64.
Suzanne Stevenson, Afsaneh Fazly, and Ryan North. 2004. Statistical measures of the semi-productivity of light verb constructions. In 2nd ACL Workshop on Multiword Expressions, pages 1–8.
M. Stevenson. 2004. An unsupervised WordNet-based algorithm for relation extraction. In Proceedings of the “Beyond Named Entity” workshop at the Fourth International Conference on Language Resources and Evaluation (LREC’04).
Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. 2008. A global joint model for semantic role labeling. Computational Linguistics, 34(2):161–191.
Fei Wu and Daniel S. Weld. 2010. Open information extraction using Wikipedia. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL ’10, pages 118–127, Morristown, NJ, USA. Association for Computational Linguistics.
Jun Zhu, Zaiqing Nie, Xiaojiang Liu, Bo Zhang, and Ji-Rong Wen. 2009. StatSnowball: a statistical approach to extracting entity relationships. In WWW ’09: Proceedings of the 18th international conference on World Wide Web, pages 101–110, New York, NY, USA. ACM.

5 0.59674001 26 emnlp-2011-Class Label Enhancement via Related Instances

Author: Zornitsa Kozareva ; Konstantin Voevodski ; Shanghua Teng

Abstract: Class-instance label propagation algorithms have been successfully used to fuse information from multiple sources in order to enrich a set of unlabeled instances with class labels. Yet, nobody has explored the relationships between the instances themselves to enhance an initial set of class-instance pairs. We propose two graph-theoretic methods (centrality and regularization), which start with a small set of labeled class-instance pairs and use the instance-instance network to extend the class labels to all instances in the network. We carry out a comparative study with a state-of-the-art knowledge harvesting algorithm and show that our approach can learn additional class labels while maintaining high accuracy. We conduct a comparative study between class-instance and instance-instance graphs used to propagate the class labels and show that the latter achieves higher accuracy.

6 0.49825606 114 emnlp-2011-Relation Extraction with Relation Topics

7 0.46452048 14 emnlp-2011-A generative model for unsupervised discovery of relations and argument classes from clinical texts

8 0.44923165 128 emnlp-2011-Structured Relation Discovery using Generative Models

9 0.40687948 92 emnlp-2011-Minimally Supervised Event Causality Identification

10 0.31800529 2 emnlp-2011-A Cascaded Classification Approach to Semantic Head Recognition

11 0.31565744 40 emnlp-2011-Discovering Relations between Noun Categories

12 0.30570054 124 emnlp-2011-Splitting Noun Compounds via Monolingual and Bilingual Paraphrasing: A Study on Japanese Katakana Words

13 0.28742206 94 emnlp-2011-Modelling Discourse Relations for Arabic

14 0.28416649 142 emnlp-2011-Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities

15 0.25760141 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

16 0.24049325 109 emnlp-2011-Random Walk Inference and Learning in A Large Scale Knowledge Base

17 0.23811544 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

18 0.23751657 62 emnlp-2011-Generating Subsequent Reference in Shared Visual Scenes: Computation vs Re-Use

19 0.23517142 19 emnlp-2011-Approximate Scalable Bounded Space Sketch for Large Data NLP

20 0.22460525 99 emnlp-2011-Non-parametric Bayesian Segmentation of Japanese Noun Phrases

similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(8, 0.018), (23, 0.075), (27, 0.01), (36, 0.044), (37, 0.019), (45, 0.054), (53, 0.01), (54, 0.02), (57, 0.416), (62, 0.045), (64, 0.013), (66, 0.026), (69, 0.026), (79, 0.05), (87, 0.011), (96, 0.039), (98, 0.016)]
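
The page does not say how the similarity values in these lists are derived from the topic weights. As a rough illustration only, the sketch below treats the list above as a sparse topic vector and compares two such vectors with cosine similarity. The choice of cosine similarity, the helper names `to_sparse` and `cosine_similarity`, and the `other_paper` vector are all assumptions made for this example, not details taken from the site. Note that the "same-paper" entry below scores 0.885 rather than 1, which suggests the site's actual measure differs from plain cosine.

```python
import math

# Topic weights copied from the "lda for this paper" list above,
# as (topicId, topicWeight) pairs.
PAPER_TOPICS = [
    (8, 0.018), (23, 0.075), (27, 0.01), (36, 0.044), (37, 0.019),
    (45, 0.054), (53, 0.01), (54, 0.02), (57, 0.416), (62, 0.045),
    (64, 0.013), (66, 0.026), (69, 0.026), (79, 0.05), (87, 0.011),
    (96, 0.039), (98, 0.016),
]


def to_sparse(pairs):
    """Turn (topicId, weight) pairs into a {topicId: weight} mapping."""
    return {topic_id: weight for topic_id, weight in pairs}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


paper = to_sparse(PAPER_TOPICS)

# Hypothetical topic vector for another paper, invented for illustration.
other_paper = {23: 0.10, 45: 0.05, 57: 0.30}

dominant_topic = max(paper, key=paper.get)            # topic with largest weight
self_similarity = cosine_similarity(paper, paper)     # identical vectors
cross_similarity = cosine_similarity(paper, other_paper)
```

Keeping the vectors sparse mirrors the way the page lists only the topics with non-negligible weight; topics missing from a mapping are treated as having weight zero.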

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.88497031 113 emnlp-2011-Relation Acquisition using Word Classes and Partial Patterns

Author: Stijn De Saeger ; Kentaro Torisawa ; Masaaki Tsuchida ; Jun'ichi Kazama ; Chikara Hashimoto ; Ichiro Yamada ; Jong Hoon Oh ; Istvan Varga ; Yulan Yan

2 0.85176915 130 emnlp-2011-Summarize What You Are Interested In: An Optimization Framework for Interactive Personalized Summarization

Author: Rui Yan ; Jian-Yun Nie ; Xiaoming Li

Abstract: Most traditional summarization methods treat their outputs as static and plain texts, which fail to capture user interests during summarization because the generated summaries are the same for different users. However, users have individual preferences on a particular source document collection and obviously a universal summary for all users might not always be satisfactory. Hence we investigate an important and challenging problem in summary generation, i.e., Interactive Personalized Summarization (IPS), which generates summaries in an interactive and personalized manner. Given the source documents, IPS captures user interests by enabling interactive clicks and incorporates personalization by modeling captured reader preference. We develop experimental systems to compare 5 rival algorithms on 4 instinctively different datasets which amount to 5197 documents. Evaluation results in ROUGE metrics indicate the comparable performance between IPS and the best competing system but IPS produces summaries with much more user satisfaction according to evaluator ratings. Besides, low ROUGE consistency among these user preferred summaries indicates the existence of personalization.

3 0.8283236 131 emnlp-2011-Syntactic Decision Tree LMs: Random Selection or Intelligent Design?

Author: Denis Filimonov ; Mary Harper

Abstract: Decision trees have been applied to a variety of NLP tasks, including language modeling, for their ability to handle a variety of attributes and sparse context space. Moreover, forests (collections of decision trees) have been shown to substantially outperform individual decision trees. In this work, we investigate methods for combining trees in a forest, as well as methods for diversifying trees for the task of syntactic language modeling. We show that our tree interpolation technique outperforms the standard method used in the literature, and that, on this particular task, restricting tree contexts in a principled way produces smaller and better forests, with the best achieving an 8% relative reduction in Word Error Rate over an n-gram baseline.

4 0.48692551 78 emnlp-2011-Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

Author: Su Nam Kim ; Preslav Nakov

5 0.46039212 57 emnlp-2011-Extreme Extraction - Machine Reading in a Week

Author: Marjorie Freedman ; Lance Ramshaw ; Elizabeth Boschee ; Ryan Gabbard ; Gary Kratkiewicz ; Nicolas Ward ; Ralph Weischedel

6 0.43251085 26 emnlp-2011-Class Label Enhancement via Related Instances

7 0.41315532 70 emnlp-2011-Identifying Relations for Open Information Extraction

8 0.41026559 147 emnlp-2011-Using Syntactic and Semantic Structural Kernels for Classifying Definition Questions in Jeopardy!

9 0.40649822 144 emnlp-2011-Unsupervised Learning of Selectional Restrictions and Detection of Argument Coercions

10 0.39653051 28 emnlp-2011-Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

11 0.37849635 128 emnlp-2011-Structured Relation Discovery using Generative Models

12 0.37112755 104 emnlp-2011-Personalized Recommendation of User Comments via Factor Models

13 0.36466464 114 emnlp-2011-Relation Extraction with Relation Topics