acl acl2010 acl2010-109 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Partha Pratim Talukdar ; Fernando Pereira
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.
Reference: text
sentIndex sentText sentNum sentScore
1 We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. [sent-4, score-0.263]
2 We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. [sent-6, score-0.097]
3 1 Introduction Traditionally, named-entity recognition (NER) has focused on a small number of broad classes such as person, location, organization. [sent-8, score-0.126]
4 However, those classes are too coarse to support important applications such as sense disambiguation, semantic matching, and textual inference in Web search. [sent-9, score-0.126]
5 For those tasks, we need a much larger inventory of specific classes and accurate classification of terms into those classes. [sent-10, score-0.126]
6 While supervised learning methods perform well for traditional NER, they are impractical for fine-grained classification because sufficient labeled data to train classifiers for all the classes is unavailable and would be very expensive to obtain. [sent-11, score-0.147]
7 Starting with a few seed instances for some classes, these methods, through analysis of unstructured text, extract new instances of the same class. [sent-18, score-0.126]
8 We address this gap with a series of experiments comparing three graph-based SSL algorithms (Section 2) on graphs constructed from several sources (Metaweb Technologies, 2009; Banko et al. [sent-23, score-0.212]
9 We investigate whether semantic information in the form of instance-attribute edges derived from an independent knowledge base (Suchanek et al. [sent-25, score-0.068]
10 The intuition behind this is that instances that share attributes are more likely to belong to the same class. [sent-27, score-0.123]
11 We demonstrate that instance-attribute edges significantly improve the accuracy of class- instance extraction. [sent-28, score-0.068]
12 In Section 2, we review three graph-based SSL algorithms that are compared for the classinstance acquisition task in Section 3. [sent-34, score-0.179]
13 6, we show how additional instance-attribute based semantic constraints can be used to improve class-instance acquisition performance. [sent-36, score-0.12]
14 2 Graph-based SSL We now review the three graph-based SSL algorithms for class inference over graphs that we have evaluated. [sent-38, score-0.221]
15 1 Notation All the algorithms compute a soft assignment of labels to the nodes of a graph G = (V, E, W), where V is the set of nodes with |V| = n, E is the set of edges, and W is an edge weight matrix. [sent-40, score-0.509]
16 Out of the n = nl + nu nodes in G, nl nodes are labeled, while the remaining nu nodes are unlabeled. [sent-41, score-0.304]
17 Ŷ is an n × m matrix of soft label assignments, with Ŷvl representing the score of label l on node v. [sent-49, score-0.090]
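As a concrete reading of this notation (a graph G = (V, E, W) with n = nl + nu nodes and soft label scores per node), here is a toy setup in Python; the graph, the labels, and the seed choices are invented for illustration and are not from the paper's data:

```python
import numpy as np

# n = 4 nodes, m = 2 labels. W is the n x n edge weight matrix,
# Y holds the seed labels, and Y_hat holds the soft label scores
# (Y_hat[v, l] = score of label l on node v) that the algorithms estimate.
n, m = 4, 2
W = np.array([[0, 1, 0, 0],
              [1, 0, 2, 0],
              [0, 2, 0, 1],
              [0, 0, 1, 0]], dtype=float)

nl = 2                 # first nl nodes are labeled (seed) nodes
Y = np.zeros((n, m))
Y[0, 0] = 1.0          # node 0 seeded with label 0
Y[1, 1] = 1.0          # node 1 seeded with label 1

Y_hat = Y.copy()       # initial soft assignments
```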
18 2 Label Propagation (LP-ZGL) The label propagation method presented by Zhu et al. [sent-52, score-0.075]
19 The objective minimized by LP-ZGL is: min_Ŷ Σ_{l∈C} Ŷl⊤ L Ŷl, s.t. the seed label assignments are held fixed. [sent-54, score-0.057]
20 Σ_{l∈C} Ŷl⊤ L Ŷl = Σ_{u,v∈V, l∈C} Wuv (Ŷul − Ŷvl)². From this, we observe that LP-ZGL penalizes any label assignment where two nodes connected by a highly weighted edge are assigned different labels. [sent-60, score-0.193]
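A minimal iterative sketch of propagation in the spirit of this objective (each unlabeled node takes the degree-normalized weighted average of its neighbours' scores while seed nodes stay clamped); this follows the well-known Zhu et al. style harmonic iteration, not necessarily the exact solver used in the paper:

```python
import numpy as np

def lp_zgl(W, Y, labeled, iters=200):
    """Iterative label propagation: unlabeled nodes average their
    neighbours' label scores; seed (labeled) nodes are clamped to Y,
    which is the LP-ZGL behaviour described in the text."""
    Y_hat = Y.astype(float).copy()
    deg = W.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Y_hat = (W @ Y_hat) / np.maximum(deg, 1e-12)
        Y_hat[labeled] = Y[labeled]   # seeds are fixed in LP-ZGL
    return Y_hat

# Path graph 0-1-2-3 with seeds of different labels at the two ends.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.zeros((4, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0
scores = lp_zgl(W, Y, labeled=[0, 3])
```

As the objective suggests, node 1 (closer to the seed of label 0) ends up scoring label 0 higher, and node 2 scores label 1 higher.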
21 This property is also shared by the two algorithms we shall review next. [sent-62, score-0.14]
22 , 2008) is a graph-based SSL algorithm which has been used for opendomain class-instance acquisition (Talukdar et al. [sent-66, score-0.078]
23 On each node v, the three probabilities sum to one, i. [sent-69, score-0.09]
24 , p_v^inj + p_v^cont + p_v^abnd = 1, and they are based on the random-walk interpretation of the Adsorption algorithm (Talukdar et al. [sent-71, score-0.195]
25 The main idea of Adsorption is to control label propagation more tightly by limiting the amount of information that passes through a node. [sent-73, score-0.075]
26 For instance, Adsorption can reduce the importance of a high-degree node v during the label inference process by increasing pavbnd on that node. [sent-74, score-0.198]
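The per-node probabilities can be sketched with the entropy-based discounting heuristic described in the Adsorption/MAD line of work: hub nodes with high-entropy neighbourhoods receive more abandon mass, and seed nodes receive inject mass. The constant beta and the exact functional form below follow the commonly cited version and should be treated as illustrative:

```python
import numpy as np

def adsorption_probs(W, is_seed, beta=2.0):
    """Return per-node (p_inj, p_cont, p_abnd), summing to 1 on each node.
    High neighbourhood entropy (e.g. a high-degree hub) lowers the
    continue probability and raises abandon, limiting information flow
    through that node, as described in the text."""
    n = W.shape[0]
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # transition probs
    probs = np.zeros((n, 3))
    for v in range(n):
        p = P[v][P[v] > 0]
        H = -(p * np.log(p)).sum()                # neighbourhood entropy
        c = np.log(beta) / np.log(beta + np.e ** H)
        d = (1.0 - c) * np.sqrt(H) if is_seed[v] else 0.0
        z = max(c + d, 1.0)
        p_cont, p_inj = c / z, d / z
        probs[v] = (p_inj, p_cont, 1.0 - p_cont - p_inj)
    return probs
```

On a star graph, the hub gets a larger abandon probability than a leaf, which is exactly the high-degree discounting behaviour described above.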
27 In contrast to LP-ZGL, Adsorption allows labels on labeled (seed) nodes to change, which is desirable in case of noisy input labels. [sent-76, score-0.133]
28 ters; L is the Laplacian of an undirected graph derived from G, but with revised edge weights; and R is an n × m matrix of per-node label priors, if any, with Rl representing the l-th column of R. [sent-100, score-0.322]
29 As in Adsorption, MAD allows labels on seed nodes to change. [sent-101, score-0.138]
30 In case of MAD, the three random-walk probabilities, p_v^inj, p_v^cont, and p_v^abnd, defined by Adsorption on each node are folded inside the matrices S, L, and R, respectively. [sent-102, score-0.161]
31 3 Experiments We now compare the experimental performance of the three graph-based SSL algorithms reviewed in the previous section, using graphs constructed from a variety of sources described below. [sent-107, score-0.235]
32 , 2008), we use Mean Reciprocal Rank (MRR) as the evaluation metric in all experiments: MRR = (1/|Q|) Σ_{v∈Q} 1/r_v (4), where Q ⊆ V is the set of test nodes, and r_v is the rank of the gold label among the labels assigned to node v. [sent-109, score-0.090]
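The MRR metric above is straightforward to compute; a small sketch with invented node names:

```python
def mean_reciprocal_rank(ranked_labels, gold):
    """MRR over a set Q of test nodes: r_v is the 1-based rank of node v's
    gold label in that node's score-ranked label list, and the metric is
    the average of 1/r_v over all test nodes."""
    total = 0.0
    for v, labels in ranked_labels.items():
        r_v = labels.index(gold[v]) + 1   # rank of the gold label
        total += 1.0 / r_v
    return total / len(ranked_labels)

# Two test nodes: gold label ranked 1st for one, 2nd for the other.
mrr = mean_reciprocal_rank(
    {"v1": ["city", "state"], "v2": ["state", "city"]},
    {"v1": "city", "v2": "city"},
)
# (1/1 + 1/2) / 2 = 0.75
```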
33 Statistics of the graphs used during experiments in this section are presented in Table 1. [sent-112, score-0.088]
34 Figure 3 (x-axis: Amount of Supervision, # classes × seeds per class): Comparison of three graph transduction methods on a graph constructed from the Freebase dataset (see Section 3. [sent-116, score-0.828]
35 YAGO graph, added for fair comparison with the TextRunner graph in Section 3. [sent-124, score-0.221]
36 6, had no attributes in YAGO KB, and hence these instance nodes had degree 0 in the YAGO graph. [sent-125, score-0.178]
37 Figure 2: (a) Example of a section of the graph constructed from the two tables in Figure 1. [sent-126, score-0.307]
38 Rectangular nodes are properties, oval nodes are entities or cell values. [sent-127, score-0.231]
39 (b) The graph in part (a) augmented with an attribute node, has attribute:albums, along with the edges incident on it. [sent-128, score-0.342]
40 This results in additional constraints for the nodes Johnny Cash and Bob Dylan to have similar labels (see Section 3. [sent-129, score-0.154]
41 A table consists of one or more properties (column names) and their corresponding cell values (column entries). [sent-132, score-0.059]
42 In this figure, Gender is a property in the table people-person, and Male is a corresponding cell value. [sent-134, score-0.12]
43 We use the following process to convert the Freebase data tables into a single graph: create a node for each unique cell value; create a node for each unique property name, where a unique property name is obtained by prefixing the unique table ID to the property name. [sent-135, score-0.587]
44 For example, in Figure 1, people-person-gender is a unique property name. [sent-136, score-0.093]
45 Add an edge of weight 1.0 from cell-value node v to unique property node p, iff value v is present in the column corresponding to property p. [sent-138, score-0.180]
46 By applying this graph construction process on the first column of the two tables in Figure 1, we end up with the graph shown in Figure 2 (a). [sent-140, score-0.505]
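The table-to-graph construction described above can be sketched as follows; the input format here is a simplification of the actual Freebase dump, and the table/property names are taken from the Figure 1 example:

```python
def tables_to_graph(tables):
    """Bipartite graph construction: one node per unique cell value, one
    node per unique property (made unique by prefixing its table ID), and
    a weight-1.0 edge whenever the cell value appears in that property's
    column."""
    edges = {}   # (cell_value_node, property_node) -> weight
    for table_id, props in tables.items():
        for prop, cells in props.items():
            prop_node = f"{table_id}-{prop}"   # e.g. people-person-gender
            for cell in cells:
                edges[(cell, prop_node)] = 1.0
    return edges

edges = tables_to_graph({"people-person": {"gender": ["Male", "Female"]}})
```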
47 The topics in this subset were further filtered so that only cell-value nodes with frequency 10 or more were retained. [sent-145, score-0.086]
48 We call the resulting graph Freebase-1 (see Table 1). [sent-146, score-0.221]
49 From this set, we selected all classes which had more than 10 instances overlapping with the Freebase graph constructed above. [sent-150, score-0.478]
50 This resulted in 23 classes, which along with their overlapping instances were used as the gold standard set for the experiments in this section. [sent-151, score-0.103]
51 Experimental results with 2 and 10 seeds (labeled nodes) per class are shown in Figure 3. [sent-152, score-0.266]
52 Figure 4 (x-axis: Amount of Supervision, # classes × seeds per class): Comparison of graph transduction methods on a graph constructed from the Freebase dataset (see Section 3. [sent-156, score-0.828]
53 To evaluate how the algorithms scale up, we construct a larger graph from the same 18 domains as in Section 3. [sent-160, score-0.269]
54 We shall call the resulting graph Freebase-2 (see Table 1). [sent-162, score-0.252]
55 , 2007), that had more than 100 instances overlapping with the larger Freebase graph constructed above. [sent-164, score-0.352]
56 This resulted in 192 WN classes which we use for the experiments in this section. [sent-165, score-0.147]
57 The reason behind imposing such frequency constraints during class selection is to make sure that each class is left with a sufficient number of instances during testing. [sent-166, score-0.262]
58 Experimental results comparing LP-ZGL, Adsorption, and MAD with 2 and 10 seeds per class are shown in Figure 4. [sent-167, score-0.266]
59 A total of 292k test nodes were used for testing in the 10 seeds per class condition, showing that these methods can be applied to large datasets. [sent-168, score-0.352]
60 It is interesting to note that MAD with 2 seeds per class outperforms LP-ZGL and Adsorption even with 10 seeds per class. [sent-170, score-0.803]
61 Figure 5 (x-axis: Amount of Supervision, # classes × seeds per class; methods: LP-ZGL, Adsorption, MAD): Comparison of graph transduction methods on a graph constructed from the hypernym tuples extracted by the TextRunner system (Banko et al. [sent-173, score-0.891]
62 In contrast to graph construction from structured tables as in Sections 3. [sent-178, score-0.258]
63 2, in this section we use hypernym tuples extracted by TextRunner (Banko et al. [sent-180, score-0.063]
64 To convert such a tuple into a graph, we create a node for the instance (http) and a node for the class (protocol), and then connect the nodes with two directed edges in both directions, with the extraction confidence (0. [sent-185, score-0.448]
65 The graph created with this process from TextRunner output is called the TextRunner Graph (see Table 1). [sent-187, score-0.221]
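The tuple-to-graph step just described is simple enough to sketch directly; the confidence value below is invented for illustration:

```python
def hypernym_tuple_to_edges(instance, cls, confidence):
    """One TextRunner hypernym tuple, e.g. an (instance, class) pair like
    ("http", "protocol"), becomes two directed edges (one in each
    direction) weighted by the extraction confidence, as described above."""
    return [(instance, cls, confidence), (cls, instance, confidence)]

edges = hypernym_tuple_to_edges("http", "protocol", 0.92)
```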
66 , 2007), which had more than 50 instances overlapping with the constructed graph. [sent-191, score-0.131]
67 This resulted in 170 WordNet classes being used for the experiments in this section. [sent-192, score-0.147]
68 Experimental results with 2 and 10 seeds per class are shown in Figure 5. [sent-193, score-0.266]
69 4 Discussion If we correlate the graph statistics in Table 1 with the results of sections 3. [sent-196, score-0.221]
70 3, we see that MAD is most effective for graphs with high average degree, that is, graphs where nodes tend to connect to many other nodes. [sent-199, score-0.262]
71 For instance, the Freebase-1 graph has a high average degree of 29. [sent-200, score-0.244]
72 Based on this, we suggest that average degree, an easily computable structural property of the graph, may be a useful indicator in choosing which graph-based SSL algorithm should be applied on a given graph. [sent-207, score-0.061]
73 6, each node was allowed to have a maximum of 15 classes during inference. [sent-214, score-0.244]
74 Figure 6 (x-axis: Maximum Allowed Classes per Node): Effect of per-node class sparsity (maximum number of classes allowed per node) during MAD inference in the experimental setting of Figure 4 (one random split). [sent-216, score-0.534]
75 on a node, all classes except for the top scoring 15 classes were discarded. [sent-217, score-0.252]
76 Without such sparsity constraints, a node in a connected graph will end up acquiring all the labels injected into the graph. [sent-218, score-0.411]
77 In order to estimate the effect of such sparsity constraints, we varied the number of classes allowed per node from 5 to 45 on the graph and experimental setup of Figure 4, with 10 seeds per class. [sent-220, score-0.745]
78 We observe that performance can vary significantly as the maximum number of classes allowed per node is changed, with the performance peaking at 25. [sent-222, score-0.297]
79 This suggests that sparsity constraints during graph based SSL may have a crucial role to play, a question that needs further investigation. [sent-223, score-0.309]
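The per-node sparsity constraint described above (keep only the top-scoring classes on each node after an update) can be sketched as:

```python
import heapq

def keep_top_k_labels(scores, k=15):
    """Retain only the k highest-scoring classes on a node and discard
    the rest; the experiments above use k = 15 during inference and vary
    k from 5 to 45 when studying the effect of the constraint."""
    if len(scores) <= k:
        return dict(scores)
    return dict(heapq.nlargest(k, scores.items(), key=lambda kv: kv[1]))

node_scores = {f"class_{i}": float(i) for i in range(20)}
kept = keep_top_k_labels(node_scores, k=15)
```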
80 Given a set of seed instance-attribute pairs, these methods attempt to extract more instance-attribute pairs automatically. [sent-229, score-0.079]
81 Figure 7 (setting: 170 WordNet Classes, 2 Seeds per Class; x-axis: Algorithms): Comparison of class-instance acquisition performance on the three different graphs described in Section 3. [sent-231, score-0.166]
82 Addition of YAGO attributes to the TextRunner graph significantly improves performance. [sent-234, score-0.268]
83 nodes (out of 80) in the TextRunner + YAGO graph used in Figure 7 (see Section 3. [sent-235, score-0.307]
84 A few example instances of each WordNet class are shown within brackets. [sent-237, score-0.135]
85 Top ranked class for each attribute is shown in bold. [sent-238, score-0.16]
86 In this section, we explore whether class-instance assignment can be improved by incorporating new semantic constraints derived from (instance, attribute) pairs. [sent-240, score-0.064]
87 In particular, we experiment with the following type of constraint: two instances with a common attribute are likely to belong to the same class. [sent-241, score-0.151]
88 For example, in Figure 2 (b), instances Johnny Cash and Bob Dylan are more likely to belong to the same class as they have a common attribute, albums. [sent-242, score-0.161]
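Adding such instance-attribute edges to an existing graph is a small augmentation step; a sketch, where the attribute-node naming and the edge weight are invented defaults rather than values from the paper:

```python
def add_attribute_edges(edges, instance_attributes, weight=1.0):
    """Augment an instance graph with instance-attribute edges, so that
    two instances sharing an attribute (as Johnny Cash and Bob Dylan
    share the albums attribute in Figure 2 (b)) become connected through
    the attribute node and tend to receive similar labels."""
    for inst, attrs in instance_attributes.items():
        for attr in attrs:
            edges[(inst, f"attribute:{attr}")] = weight
    return edges

g = add_attribute_edges(
    {}, {"Johnny Cash": ["albums"], "Bob Dylan": ["albums"]}
)
```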
89 2), such constraints are naturally captured by the methods reviewed in Section 2. [sent-244, score-0.065]
90 In Figure 7, we compare class-instance acquisition performance of the three graph-based SSL methods (Section 2) on the following three graphs (also see Table 1): TextRunner Graph: Graph constructed from the hypernym tuples extracted by TextRunner, as in Figure 5 (Section 3. [sent-246, score-0.278]
91 TextRunner + YAGO Graph: Union of the two graphs above, with 237k nodes and 1. [sent-250, score-0.174]
92 In all experimental conditions with 2 and 10 seeds per class in Figure 7, we observe that the three methods consistently achieved the best performance on the TextRunner + YAGO graph. [sent-252, score-0.266]
93 This suggests that the addition of attribute-based semantic constraints from YAGO to the TextRunner graph results in a better connected graph, which in turn results in better inference by the graph-based SSL algorithms, compared to using either of the sources, i. [sent-253, score-0.609]
94 Because of the label propagation behavior, graph-based SSL algorithms assign classes to all nodes reachable in the graph from at least one of the labeled instance nodes. [sent-261, score-0.599]
95 This allows us to check the classes assigned to nodes corresponding to YAGO attributes in the TextRunner + YAGO graph, as shown in Table 2. [sent-262, score-0.259]
96 For example, the algorithm is able to learn that works at is an attribute of the WordNet class wordnet scientist 110560637, and thereby its instances (e. [sent-264, score-0.210]
97 4 Conclusion We have started a systematic experimental comparison of graph-based SSL algorithms for classinstance acquisition on a variety of graphs constructed from different domains. [sent-267, score-0.316]
98 Topics for future work include the incorporation of other kinds of semantic constraint for improved class-instance acquisition, further investigation into per-node sparsity constraints in graph-based SSL, and moving beyond bipartite graph constructions. [sent-271, score-0.309]
99 What you seek is what you get: Extraction of class attributes from query logs. [sent-369, score-0.132]
100 Finding cars, goddesses and enzymes: Parametrizable acquisition of labeled instances for open-domain information extraction. [sent-445, score-0.149]
wordName wordTfidf (topN-words)
[('ssl', 0.436), ('mad', 0.427), ('adsorption', 0.356), ('textrunner', 0.254), ('talukdar', 0.243), ('graph', 0.221), ('yago', 0.197), ('freebase', 0.187), ('seeds', 0.128), ('classes', 0.126), ('node', 0.09), ('graphs', 0.088), ('nodes', 0.086), ('class', 0.085), ('acquisition', 0.078), ('attribute', 0.075), ('pavbnd', 0.071), ('pivnj', 0.071), ('property', 0.061), ('cell', 0.059), ('wordnet', 0.059), ('suchanek', 0.058), ('pantel', 0.054), ('classinstance', 0.053), ('metaweb', 0.053), ('pcvont', 0.053), ('per', 0.053), ('instances', 0.05), ('constructed', 0.049), ('algorithms', 0.048), ('durme', 0.048), ('kb', 0.048), ('attributes', 0.047), ('sparsity', 0.046), ('edges', 0.046), ('banko', 0.045), ('pasca', 0.043), ('dylan', 0.043), ('constraints', 0.042), ('edge', 0.042), ('pennacchiotti', 0.042), ('reisinger', 0.04), ('propagation', 0.038), ('tables', 0.037), ('label', 0.037), ('bob', 0.036), ('crammer', 0.036), ('baluja', 0.036), ('ferbruary', 0.036), ('laplacian', 0.036), ('probst', 0.036), ('wuv', 0.036), ('mrr', 0.035), ('film', 0.035), ('hypernym', 0.034), ('rightmost', 0.032), ('overlapping', 0.032), ('unique', 0.032), ('minimized', 0.031), ('pas', 0.031), ('shall', 0.031), ('bellare', 0.031), ('johnny', 0.031), ('transduction', 0.03), ('extraction', 0.029), ('tuples', 0.029), ('albums', 0.029), ('reproducible', 0.029), ('connected', 0.028), ('allowed', 0.028), ('soderland', 0.027), ('partha', 0.027), ('sources', 0.027), ('labels', 0.026), ('seed', 0.026), ('averaged', 0.026), ('belong', 0.026), ('objective', 0.026), ('column', 0.026), ('yl', 0.025), ('rao', 0.025), ('mountain', 0.025), ('cash', 0.025), ('etzioni', 0.025), ('supervision', 0.025), ('carlson', 0.024), ('nu', 0.023), ('ravichandran', 0.023), ('reviewed', 0.023), ('degree', 0.023), ('iterative', 0.023), ('instance', 0.022), ('graphbased', 0.022), ('derived', 0.022), ('protocol', 0.022), ('ly', 0.022), ('resulted', 0.021), ('pereira', 0.021), ('labeled', 0.021), ('van', 0.021)]
simIndex simValue paperId paperTitle
same-paper 1 1.0 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition
Author: Partha Pratim Talukdar ; Fernando Pereira
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.
2 0.14187323 185 acl-2010-Open Information Extraction Using Wikipedia
Author: Fei Wu ; Daniel S. Weld
Abstract: Information-extraction (IE) systems seek to distill semantic relations from naturallanguage text, but most systems use supervised learning of relation-specific examples and are thus limited by the availability of training data. Open IE systems such as TextRunner, on the other hand, aim to handle the unbounded number of relations found on the Web. But how well can these open systems perform? This paper presents WOE, an open IE system which improves dramatically on TextRunner’s precision and recall. The key to WOE’s performance is a novel form of self-supervised learning for open extractors using heuristic matches between Wikipedia infobox attribute values and corresponding sentences to construct training data. Like TextRunner, WOE’s extractor eschews lexicalized features and handles an unbounded set of semantic relations. WOE can operate in two modes: when restricted to POS tag features, it runs as quickly as TextRunner, but when set to use dependency-parse features its precision and recall rise even higher.
3 0.12761152 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
Author: Zornitsa Kozareva ; Eduard Hovy
Abstract: A challenging problem in open information extraction and text mining is the learning of the selectional restrictions of semantic relations. We propose a minimally supervised bootstrapping algorithm that uses a single seed and a recursive lexico-syntactic pattern to learn the arguments and the supertypes of a diverse set of semantic relations from the Web. We evaluate the performance of our algorithm on multiple semantic relations expressed using “verb”, “noun”, and “verb prep” lexico-syntactic patterns. Humanbased evaluation shows that the accuracy of the harvested information is about 90%. We also compare our results with existing knowledge base to outline the similarities and differences of the granularity and diversity of the harvested knowledge.
4 0.11773825 161 acl-2010-Learning Better Data Representation Using Inference-Driven Metric Learning
Author: Paramveer S. Dhillon ; Partha Pratim Talukdar ; Koby Crammer
Abstract: We initiate a study comparing effectiveness of the transformed spaces learned by recently proposed supervised, and semisupervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA). Through a variety of experiments on different realworld datasets, we find IDML-IT, a semisupervised metric learning algorithm to be the most effective.
5 0.11111505 181 acl-2010-On Learning Subtypes of the Part-Whole Relation: Do Not Mix Your Seeds
Author: Ashwin Ittoo ; Gosse Bouma
Abstract: An important relation in information extraction is the part-whole relation. Ontological studies mention several types of this relation. In this paper, we show that the traditional practice of initializing minimally-supervised algorithms with a single set that mixes seeds of different types fails to capture the wide variety of part-whole patterns and tuples. The results obtained with mixed seeds ultimately converge to one of the part-whole relation types. We also demonstrate that all the different types of part-whole relations can still be discovered, regardless of the type characterized by the initializing seeds. We performed our experiments with a state-ofthe-art information extraction algorithm. 1
6 0.10229772 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
7 0.099145159 127 acl-2010-Global Learning of Focused Entailment Graphs
8 0.076412901 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
9 0.074724481 20 acl-2010-A Transition-Based Parser for 2-Planar Dependency Structures
10 0.073779747 10 acl-2010-A Latent Dirichlet Allocation Method for Selectional Preferences
11 0.072076522 89 acl-2010-Distributional Similarity vs. PU Learning for Entity Set Expansion
12 0.071745105 141 acl-2010-Identifying Text Polarity Using Random Walks
13 0.071452267 115 acl-2010-Filtering Syntactic Constraints for Statistical Machine Translation
14 0.061108239 27 acl-2010-An Active Learning Approach to Finding Related Terms
15 0.06001462 210 acl-2010-Sentiment Translation through Lexicon Induction
16 0.059020512 198 acl-2010-Predicate Argument Structure Analysis Using Transformation Based Learning
17 0.058926988 156 acl-2010-Knowledge-Rich Word Sense Disambiguation Rivaling Supervised Systems
18 0.056029975 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
19 0.051202718 245 acl-2010-Understanding the Semantic Structure of Noun Phrase Queries
20 0.050737891 158 acl-2010-Latent Variable Models of Selectional Preference
topicId topicWeight
[(0, -0.142), (1, 0.069), (2, -0.02), (3, 0.019), (4, 0.072), (5, -0.005), (6, 0.067), (7, 0.06), (8, -0.049), (9, -0.046), (10, -0.109), (11, -0.007), (12, -0.06), (13, -0.112), (14, 0.112), (15, 0.053), (16, 0.084), (17, 0.118), (18, -0.054), (19, -0.019), (20, -0.063), (21, 0.102), (22, -0.071), (23, 0.003), (24, -0.056), (25, 0.005), (26, -0.06), (27, -0.053), (28, 0.019), (29, 0.071), (30, -0.056), (31, -0.009), (32, -0.093), (33, -0.0), (34, -0.018), (35, 0.098), (36, 0.037), (37, 0.032), (38, -0.028), (39, -0.055), (40, -0.063), (41, -0.146), (42, 0.167), (43, -0.045), (44, 0.089), (45, -0.007), (46, 0.123), (47, 0.033), (48, -0.091), (49, 0.065)]
simIndex simValue paperId paperTitle
same-paper 1 0.94330275 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition
Author: Partha Pratim Talukdar ; Fernando Pereira
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.
2 0.70476019 181 acl-2010-On Learning Subtypes of the Part-Whole Relation: Do Not Mix Your Seeds
Author: Ashwin Ittoo ; Gosse Bouma
Abstract: An important relation in information extraction is the part-whole relation. Ontological studies mention several types of this relation. In this paper, we show that the traditional practice of initializing minimally-supervised algorithms with a single set that mixes seeds of different types fails to capture the wide variety of part-whole patterns and tuples. The results obtained with mixed seeds ultimately converge to one of the part-whole relation types. We also demonstrate that all the different types of part-whole relations can still be discovered, regardless of the type characterized by the initializing seeds. We performed our experiments with a state-ofthe-art information extraction algorithm. 1
3 0.64496875 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
Author: Zornitsa Kozareva ; Eduard Hovy
Abstract: A challenging problem in open information extraction and text mining is the learning of the selectional restrictions of semantic relations. We propose a minimally supervised bootstrapping algorithm that uses a single seed and a recursive lexico-syntactic pattern to learn the arguments and the supertypes of a diverse set of semantic relations from the Web. We evaluate the performance of our algorithm on multiple semantic relations expressed using “verb”, “noun”, and “verb prep” lexico-syntactic patterns. Humanbased evaluation shows that the accuracy of the harvested information is about 90%. We also compare our results with existing knowledge base to outline the similarities and differences of the granularity and diversity of the harvested knowledge.
4 0.5803321 161 acl-2010-Learning Better Data Representation Using Inference-Driven Metric Learning
Author: Paramveer S. Dhillon ; Partha Pratim Talukdar ; Koby Crammer
Abstract: We initiate a study comparing effectiveness of the transformed spaces learned by recently proposed supervised, and semisupervised metric learning algorithms to those generated by previously proposed unsupervised dimensionality reduction methods (e.g., PCA). Through a variety of experiments on different realworld datasets, we find IDML-IT, a semisupervised metric learning algorithm to be the most effective.
5 0.50124496 185 acl-2010-Open Information Extraction Using Wikipedia
Author: Fei Wu ; Daniel S. Weld
Abstract: Information-extraction (IE) systems seek to distill semantic relations from naturallanguage text, but most systems use supervised learning of relation-specific examples and are thus limited by the availability of training data. Open IE systems such as TextRunner, on the other hand, aim to handle the unbounded number of relations found on the Web. But how well can these open systems perform? This paper presents WOE, an open IE system which improves dramatically on TextRunner’s precision and recall. The key to WOE’s performance is a novel form of self-supervised learning for open extractors using heuris— tic matches between Wikipedia infobox attribute values and corresponding sentences to construct training data. Like TextRunner, WOE’s extractor eschews lexicalized features and handles an unbounded set of semantic relations. WOE can operate in two modes: when restricted to POS tag features, it runs as quickly as TextRunner, but when set to use dependency-parse features its precision and recall rise even higher.
6 0.49938911 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
7 0.4795827 141 acl-2010-Identifying Text Polarity Using Random Walks
8 0.47180939 250 acl-2010-Untangling the Cross-Lingual Link Structure of Wikipedia
9 0.46029985 248 acl-2010-Unsupervised Ontology Induction from Text
10 0.44600102 43 acl-2010-Automatically Generating Term Frequency Induced Taxonomies
11 0.4453671 127 acl-2010-Global Learning of Focused Entailment Graphs
12 0.43592656 150 acl-2010-Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
13 0.40676823 138 acl-2010-Hunting for the Black Swan: Risk Mining from Text
14 0.38832533 222 acl-2010-SystemT: An Algebraic Approach to Declarative Information Extraction
15 0.38701075 186 acl-2010-Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two
16 0.37365901 166 acl-2010-Learning Word-Class Lattices for Definition and Hypernym Extraction
17 0.36800399 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
18 0.36447552 64 acl-2010-Complexity Assumptions in Ontology Verbalisation
19 0.34692746 111 acl-2010-Extracting Sequences from the Web
20 0.34599614 7 acl-2010-A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices
topicId topicWeight
[(14, 0.013), (17, 0.121), (18, 0.011), (25, 0.064), (33, 0.035), (42, 0.031), (44, 0.014), (59, 0.092), (71, 0.031), (72, 0.028), (73, 0.051), (76, 0.013), (78, 0.054), (80, 0.011), (83, 0.076), (84, 0.03), (86, 0.109), (98, 0.119)]
simIndex simValue paperId paperTitle
same-paper 1 0.84624159 109 acl-2010-Experiments in Graph-Based Semi-Supervised Learning Methods for Class-Instance Acquisition
Author: Partha Pratim Talukdar ; Fernando Pereira
Abstract: Graph-based semi-supervised learning (SSL) algorithms have been successfully used to extract class-instance pairs from large unstructured and structured text collections. However, a careful comparison of different graph-based SSL algorithms on that task has been lacking. We compare three graph-based SSL algorithms for class-instance acquisition on a variety of graphs constructed from different domains. We find that the recently proposed MAD algorithm is the most effective. We also show that class-instance extraction can be significantly improved by adding semantic information in the form of instance-attribute edges derived from an independently developed knowledge base. All of our code and data will be made publicly available to encourage reproducible research in this area.
2 0.81689954 247 acl-2010-Unsupervised Event Coreference Resolution with Rich Linguistic Features
Author: Cosmin Bejan ; Sanda Harabagiu
Abstract: This paper examines how a new class of nonparametric Bayesian models can be effectively applied to an open-domain event coreference task. Designed with the purpose of clustering complex linguistic objects, these models consider a potentially infinite number of features and categorical outcomes. The evaluation performed for solving both within- and cross-document event coreference shows significant improvements of the models when compared against two baselines for this task.
3 0.80487114 174 acl-2010-Modeling Semantic Relevance for Question-Answer Pairs in Web Social Communities
Author: Baoxun Wang ; Xiaolong Wang ; Chengjie Sun ; Bingquan Liu ; Lin Sun
Abstract: Quantifying the semantic relevance between questions and their candidate answers is essential to answer detection in social media corpora. In this paper, a deep belief network is proposed to model the semantic relevance for question-answer pairs. Observing the textual similarity between the community-driven questionanswering (cQA) dataset and the forum dataset, we present a novel learning strategy to promote the performance of our method on the social community datasets without hand-annotating work. The experimental results show that our method outperforms the traditional approaches on both the cQA and the forum corpora.
4 0.74853516 185 acl-2010-Open Information Extraction Using Wikipedia
Author: Fei Wu ; Daniel S. Weld
Abstract: Information-extraction (IE) systems seek to distill semantic relations from natural-language text, but most systems use supervised learning of relation-specific examples and are thus limited by the availability of training data. Open IE systems such as TextRunner, on the other hand, aim to handle the unbounded number of relations found on the Web. But how well can these open systems perform? This paper presents WOE, an open IE system which improves dramatically on TextRunner’s precision and recall. The key to WOE’s performance is a novel form of self-supervised learning for open extractors using heuristic matches between Wikipedia infobox attribute values and corresponding sentences to construct training data. Like TextRunner, WOE’s extractor eschews lexicalized features and handles an unbounded set of semantic relations. WOE can operate in two modes: when restricted to POS tag features, it runs as quickly as TextRunner, but when set to use dependency-parse features its precision and recall rise even higher.
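The self-supervision heuristic the abstract describes can be sketched as matching infobox attribute values against article sentences to label training examples. The infobox and sentences below are hypothetical illustrations, not WOE's actual data or code:

```python
# WOE-style self-supervised training-data construction (sketch):
# a sentence containing an infobox attribute value verbatim is taken
# as a positive example for that attribute's relation.
infobox = {"born": "Duluth, Minnesota", "occupation": "singer-songwriter"}
sentences = [
    "Dylan was born in Duluth, Minnesota.",
    "He is widely regarded as a singer-songwriter.",
    "He released his debut album in 1962.",
]

training_pairs = []
for attr, value in infobox.items():
    for sent in sentences:
        if value in sent:  # heuristic match: value appears verbatim
            training_pairs.append((attr, value, sent))

for pair in training_pairs:
    print(pair)
```

The real system then trains an extractor on pairs like these, using either POS-tag or dependency-parse features as the abstract notes; exact-substring matching here stands in for whatever matching heuristics WOE actually uses.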
5 0.73868054 211 acl-2010-Simple, Accurate Parsing with an All-Fragments Grammar
Author: Mohit Bansal ; Dan Klein
Abstract: We present a simple but accurate parser which exploits both large tree fragments and symbol refinement. We parse with all fragments of the training set, in contrast to much recent work on tree selection in data-oriented parsing and treesubstitution grammar learning. We require only simple, deterministic grammar symbol refinement, in contrast to recent work on latent symbol refinement. Moreover, our parser requires no explicit lexicon machinery, instead parsing input sentences as character streams. Despite its simplicity, our parser achieves accuracies of over 88% F1 on the standard English WSJ task, which is competitive with substantially more complicated state-of-theart lexicalized and latent-variable parsers. Additional specific contributions center on making implicit all-fragments parsing efficient, including a coarse-to-fine inference scheme and a new graph encoding.
6 0.73604751 218 acl-2010-Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation
7 0.73493338 169 acl-2010-Learning to Translate with Source and Target Syntax
8 0.73379755 158 acl-2010-Latent Variable Models of Selectional Preference
9 0.73367107 160 acl-2010-Learning Arguments and Supertypes of Semantic Relations Using Recursive Patterns
10 0.73318839 71 acl-2010-Convolution Kernel over Packed Parse Forest
11 0.73311329 120 acl-2010-Fully Unsupervised Core-Adjunct Argument Classification
12 0.73252821 248 acl-2010-Unsupervised Ontology Induction from Text
13 0.73142534 70 acl-2010-Contextualizing Semantic Representations Using Syntactically Enriched Vector Models
14 0.73139924 113 acl-2010-Extraction and Approximation of Numerical Attributes from the Web
15 0.72782356 184 acl-2010-Open-Domain Semantic Role Labeling by Modeling Word Spans
16 0.72604728 162 acl-2010-Learning Common Grammar from Multilingual Corpus
17 0.7252171 172 acl-2010-Minimized Models and Grammar-Informed Initialization for Supertagging with Highly Ambiguous Lexicons
18 0.7250666 153 acl-2010-Joint Syntactic and Semantic Parsing of Chinese
19 0.72460043 55 acl-2010-Bootstrapping Semantic Analyzers from Non-Contradictory Texts
20 0.7243548 53 acl-2010-Blocked Inference in Bayesian Tree Substitution Grammars