acl acl2012 acl2012-133 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Sindhu Raghavan ; Raymond Mooney ; Hyeonseo Ku
Abstract: Most information extraction (IE) systems identify facts that are explicitly stated in text. However, in natural language, some facts are implicit, and identifying them requires “reading between the lines”. Human readers naturally use common sense knowledge to infer such implicit information from the explicitly stated facts. We propose an approach that uses Bayesian Logic Programs (BLPs), a statistical relational model combining firstorder logic and Bayesian networks, to infer additional implicit information from extracted facts. It involves learning uncertain commonsense knowledge (in the form of probabilistic first-order rules) from natural language text by mining a large corpus of automatically extracted facts. These rules are then used to derive additional facts from extracted information using BLP inference. Experimental evaluation on a benchmark data set for machine reading demonstrates the efficacy of our approach.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Most information extraction (IE) systems identify facts that are explicitly stated in text. [sent-5, score-0.334]
2 However, in natural language, some facts are implicit, and identifying them requires “reading between the lines”. [sent-6, score-0.243]
3 Human readers naturally use common sense knowledge to infer such implicit information from the explicitly stated facts. [sent-7, score-0.233]
4 We propose an approach that uses Bayesian Logic Programs (BLPs), a statistical relational model combining firstorder logic and Bayesian networks, to infer additional implicit information from extracted facts. [sent-8, score-0.348]
5 These rules are then used to derive additional facts from extracted information using BLP inference. [sent-10, score-0.442]
6 IE systems (Cowie and Lehnert, 1996; Sarawagi, 2008) are trained to extract facts that are stated explicitly in text. [sent-13, score-0.334]
7 However, some facts are implicit, and human readers naturally “read between the lines” and infer them from the stated facts using commonsense knowledge. [sent-14, score-0.68]
8 The standard approach to inferring implicit information involves using commonsense knowledge in the form of logical rules to deduce additional information from the extracted facts. [sent-20, score-0.45]
9 Since manually developing such a knowledge base is difficult and arduous, an effective alternative is to automatically learn such rules by mining a substantial database of facts that an IE system has already automatically extracted from a large corpus of text (Nahm and Mooney, 2000). [sent-21, score-0.543]
10 However, the facts extracted by an IE system are always quite noisy and incomplete. [sent-23, score-0.301]
11 Consequently, a purely logical approach to learning and inference is unlikely to be effective. [sent-24, score-0.218]
12 We demonstrate that it is possible to learn the structure and the parameters of BLPs automatically using only noisy extractions from natural language text, which we then use to infer additional facts from text. [sent-31, score-0.5]
13 , 2011) have mined inference rules from data automatically extracted from text by an IE system. [sent-42, score-0.242]
14 Similar to our approach, these systems use the learned rules to infer additional information from facts directly extracted from a document. [sent-43, score-0.623]
15 Nahm and Mooney (2000) learn propositional rules using C4.5 (Quinlan, 1993) from data extracted from computer-related job-postings, and therefore cannot learn multi-relational rules with quantified variables. [sent-44, score-0.21] [sent-45, score-0.268]
17 (2010) modify an ILP system similar to FOIL (Quinlan, 1990) to learn rules with probabilistic conclusions. [sent-54, score-0.253]
18 They use purely logical deduction (forward-chaining) to infer additional facts. [sent-55, score-0.366]
19 (2010) used a human judge to manually evaluate the quality of the learned rules before using them to infer additional facts. [sent-58, score-0.322]
20 Our approach, on the other hand, is completely automated and learns fully parameterized rules in a well-defined probabilistic logic. [sent-59, score-0.184]
21 , 2008), an inference engine based on MLNs (Domingos and Lowd, 2009) (an SRL approach that combines first-order logic and Markov networks) to infer additional facts. [sent-67, score-0.239]
22 However, MLNs include all possible type-consistent groundings of the rules in the corresponding Markov net, which, for larger datasets, can result in an intractably large graphical model. [sent-68, score-0.199]
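The blow-up from type-consistent grounding can be made concrete with a back-of-the-envelope count (a hypothetical sketch; the type names and domain sizes below are illustrative, not taken from the paper):

```python
from math import prod

def num_groundings(var_types, domain_sizes):
    """Number of type-consistent groundings of one first-order clause:
    the product of the domain sizes of its variables."""
    return prod(domain_sizes[t] for t in var_types)

# A single 3-variable clause over 1,000 persons and 100 organizations
# already contributes 10^8 ground clauses to the Markov net.
sizes = {"person": 1000, "org": 100}
print(num_groundings(["person", "person", "org"], sizes))  # 100000000
```

Since BLPs only ground clauses that participate in a proof of the query, they avoid materializing this full cross-product.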
23 They propose several approaches to score the rules, which are used to infer additional facts using purely logical deduction. [sent-73, score-0.498]
24 (2011) propose a probabilistic approach to modeling implicit information as missing facts and use MLNs to infer these missing facts. [sent-75, score-0.428]
25 They learn first-order rules for the MLN by performing exhaustive search. [sent-76, score-0.21]
26 As mentioned earlier, inference using both these approaches, logical deduction and MLNs, has certain limitations, which BLPs help overcome. [sent-77, score-0.292]
27 DIRT (Lin and Pantel, 2001) and RESOLVER (Yates and Etzioni, 2007) learn inference rules, also called entailment rules, that capture synonymous relations and entities from text. [sent-78, score-0.35]
28 , 2011) propose an approach that uses transitivity constraints for learning entailment rules for typed predicates. [sent-81, score-0.17]
29 Unlike the systems described above, these systems do not learn complex first-order rules that capture common sense knowledge. [sent-82, score-0.21]
30 Further, most of these systems do not use extractions from an IE system to learn entailment rules, thereby making them less related to our approach. [sent-83, score-0.167]
31 3 Bayesian Logic Programs Bayesian logic programs (BLPs) (Kersting and De Raedt, 2007; Kersting and Raedt, 2008) can be considered as templates for constructing directed graphical models (Bayes nets). [sent-84, score-0.18]
32 Given a knowledge base as a BLP, standard logical inference (SLD resolution) is used to automatically construct a Bayes net for a given problem. [sent-106, score-0.212]
33 More specifically, given a set of facts and a query, all possible Horn-clause proofs of the query are constructed and used to build a Bayes net for answering the query. [sent-107, score-0.344]
34 Once a ground network is constructed, standard probabilistic inference methods can be used to answer various types of queries as reviewed by Koller and Friedman (2009). [sent-112, score-0.168]
35 1 Learning Rules from Extracted Data The first step involves learning commonsense knowledge in the form of first-order Horn rules from text. [sent-115, score-0.192]
36 We first extract facts that are explicitly stated in the text using SIRE (Florian et al.). [sent-116, score-0.334]
37 We then learn first-order rules from these extracted facts using LIME (Mccreath and Sharma, 1998), an ILP system designed for noisy training data. [sent-118, score-0.511]
38 Typically, an ILP system takes a set of positive and negative instances for a target relation, along with a background knowledge base (in our case, other facts extracted from the same document) from which the positive instances are potentially inferable. [sent-120, score-0.446]
39 Since LIME can learn rules using only positive instances, or both positive and negative instances, we learn rules using both settings. [sent-131, score-0.42]
40 We include all unique rules learned from both settings in the final set, since the goal of this step is to learn a large set of potentially useful rules whose relative strengths will be determined in the next step of parameter learning. [sent-132, score-0.452]
41 2 Learning BLP Parameters The parameters of a BLP include the CPT entries associated with the Bayesian clauses and the parameters of combining rules associated with the Bayesian predicates. [sent-136, score-0.28]
42 For simplicity, we use a deterministic logical-and model to encode the CPT entries associated with Bayesian clauses, and use noisy-or to combine evidence coming from multiple ground rules that have the same head (Pearl, 1988). [sent-137, score-0.25]
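These two combining rules can be sketched as follows (a minimal illustration, not the authors' implementation; the weights and body truth-values are hypothetical):

```python
def logical_and(body):
    # Deterministic logical-and CPT: a ground rule supports its head
    # only when every literal in its body holds.
    return all(body)

def noisy_or(weights, bodies):
    """Noisy-or over the ground rules that fire: the head fails only if
    every active rule with that head independently fails."""
    p_all_fail = 1.0
    for w, body in zip(weights, bodies):
        if logical_and(body):
            p_all_fail *= (1.0 - w)
    return 1.0 - p_all_fail

# Two rules fire with weight 0.9 each; a third rule's body is not satisfied.
p = noisy_or([0.9, 0.9, 0.9], [[True, True], [True], [True, False]])
print(round(p, 2))  # 0.99
```

Note how noisy-or lets independent lines of evidence for the same head reinforce each other: each firing rule multiplies down the probability that the head fails.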
43 In our task, the supervised training data consists of facts that are extracted from the natural language text. [sent-140, score-0.301]
44 However, we usually do not have evidence for the inferred facts or for the noisy-or nodes. [sent-141, score-0.319]
45 We then construct a ground Bayesian network using the resulting deductive proofs for all target relations and learned parameters using the standard approach described in Section 3. [sent-146, score-0.397]
46 Finally, we perform standard probabilistic inference to estimate the marginal probability of each inferred fact. [sent-147, score-0.189]
47 We learned first-order rules for the 13 target relations shown in Table 3 from the facts extracted from the training documents (Section 4. [sent-168, score-0.638]
48 Consequently, we split the 9,000 training documents into four disjoint subsets and learned first-order rules from each subset. [sent-172, score-0.242]
49 The final knowledge base included all unique rules learned from any subset. [sent-173, score-0.242]
50 LIME learned several rules that had only entity types in their bodies. [sent-174, score-0.242]
51 Such rules make many incorrect inferences; hence we eliminated them. [sent-175, score-0.173]
52 Table 1: A sample set of rules learned using LIME. For each test document, we performed BLP inference as described in Section 4. [sent-184, score-0.285]
53 We ranked all inferences by their marginal probability, and evaluated the results by either choosing the top n inferences or accepting inferences whose marginal probability was equal to or exceeded a specified threshold. [sent-186, score-0.981]
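The two selection criteria can be sketched as follows (the inferred facts and probabilities below are made up for illustration):

```python
def select_inferences(scored, n=None, threshold=None):
    """scored: (inferred_fact, marginal_probability) pairs.
    Keep the top-n by probability, or everything whose probability
    meets or exceeds the threshold."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if n is not None:
        return ranked[:n]
    return [(fact, p) for fact, p in ranked if p >= threshold]

scored = [("employs(c,d)", 0.41), ("hasMember(a,b)", 0.97), ("isLedBy(e,f)", 0.88)]
print(select_inferences(scored, n=2))
print(select_inferences(scored, threshold=0.9))
```

The top-n cutoff fixes the number of accepted inferences per evaluation set, while the threshold cutoff lets the number vary with the model's confidence.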
54 We evaluated two BLPs with different parameter settings: BLP-Learned-Weights used noisy-or parameters learned using EM, while BLP-Manual-Weights used fixed noisy-or weights of 0. [sent-187, score-0.184]
55 3 Evaluation Metrics The lack of ground truth annotation for inferred facts prevents an automated evaluation, so we resorted to a manual evaluation. [sent-190, score-0.348]
56 We randomly sampled 40 documents (4 from each test fold), judged the accuracy of the inferences for those documents, and computed precision, the fraction of inferences that were deemed correct. [sent-191, score-0.618]
57 For probabilistic methods like BLPs and MLNs that provide certainties for their inferences, we also computed precision at top n, which measures the precision of the n inferences with the highest marginal probability across the 40 test documents. [sent-192, score-0.505]
58 Measuring recall for making inferences is very difficult since it would require labeling a reasonable-sized corpus of documents with all of the correct inferences for a given set of target relations, which would be extremely time consuming. [sent-193, score-0.645]
59 SIRE frequently makes incorrect extractions, and therefore inferences made from these extractions are also inaccurate. [sent-197, score-0.41]
60 That is, if an inference is incorrect because it was based on incorrect extracted facts, we remove it from the set of inferences and calculate precision for the remaining inferences. [sent-201, score-0.537]
61 Therefore, we compared to the following methods: • Logical Deduction: This method forward chains on the extracted facts using the first-order rules learned by LIME to infer additional facts. [sent-204, score-0.597]
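The forward-chaining baseline can be sketched over ground Horn rules (a simplified propositional sketch; the paper's learned rules are first-order, and the facts and rules below are hypothetical):

```python
def forward_chain(facts, rules):
    """Naive forward chaining to a fixpoint. rules: (body_set, head)
    pairs; a head is derived once every literal in its body is known."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in known and body <= known:
                known.add(head)
                changed = True
    return known

rules = [({"employs(org1,p1)"}, "hasMember(org1,p1)"),
         ({"hasMember(org1,p1)"}, "hasMemberPerson(org1,p1)")]
print(sorted(forward_chain({"employs(org1,p1)"}, rules)))
```

Unlike BLP inference, this baseline accepts every derivable fact with equal confidence, which is why its precision matches the untrimmed end of the BLP precision curves.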
62 In online generative learning, gradients are calculated and weights are estimated after processing each example and the learned weights are used as the starting weights for the next example. [sent-208, score-0.233]
63 “UA” and “AD” refer to the unadjusted and adjusted scores, respectively. In our approach, the initial weights of clauses are set to 10. [sent-213, score-0.319]
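The online generative update scheme can be sketched generically (a hypothetical simplification of gradient-based weight learning; `expected_fn` stands in for computing model expectations under the current weights, and the initial weight of 10 follows the setting above):

```python
def online_generative_learning(examples, num_weights, lr=0.01, w0=10.0):
    """After each example, take one gradient step (observed minus expected
    feature counts) and carry the resulting weights forward as the
    starting point for the next example."""
    w = [w0] * num_weights
    for observed, expected_fn in examples:
        expected = expected_fn(w)  # model expectations under current weights
        for i in range(num_weights):
            w[i] += lr * (observed[i] - expected[i])
    return w

# One toy example whose expected count under the model is a constant 0.5:
w = online_generative_learning([([1.0], lambda w: [0.5])], num_weights=1)
print(w)  # ~[10.005]
```

The per-example warm start is what makes the procedure "online": no pass over the full training set is needed before the weights begin to move.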
64 We then use the learned rules and parameters to probabilistically infer additional facts using the MC-SAT algorithm implemented in Alchemy, an open-source MLN package. [sent-217, score-0.604]
65 1 Comparison to Baselines Table 2 gives the unadjusted (UA) and adjusted (AD) precision for logical deduction. [sent-219, score-0.415]
66 Out of 1,490 inferences for the 40 evaluation documents, 443 were judged correct, giving an unadjusted precision of 29.7%. [sent-220, score-0.48]
67 Out of these 1,490 inferences, 233 were determined to be incorrect due to extraction errors, improving the adjusted precision to a modest 35.2%. [sent-222, score-0.201]
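The adjusted figure follows directly from removing the extraction-error cases from the denominator:

```python
total, correct, extraction_errors = 1490, 443, 233

unadjusted = correct / total                      # 443/1490 ≈ 29.7%
adjusted = correct / (total - extraction_errors)  # 443/1257 ≈ 35.2%
print(f"UA: {unadjusted:.1%}  AD: {adjusted:.1%}")  # UA: 29.7%  AD: 35.2%
```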
68 MLNs made about 127,000 inferences for the 40 evaluation documents. [sent-224, score-0.309]
69 Since it is not feasible to manually evaluate all the inferences made by the MLN, we calculated precision using only the top 1000 inferences. [sent-225, score-0.372]
70 Figure 1 shows both unadjusted and adjusted precision at top-n for various values of n for different BLP and MLN models. [sent-226, score-0.277]
71 For both BLPs and MLNs, simple manual weights result in better performance than the learned weights. [sent-227, score-0.173]
72 Despite the fairly large size of the overall training sets (9,000 documents), the amount of data for each target relation is apparently still not sufficient to learn particularly accurate weights for both BLPs and MLNs. [sent-228, score-0.167]
73 (top 25–50 inferences), with an average of 1 inference per document at 91% adjusted precision, as opposed to an average of 5 inferences per document at 85% adjusted precision for BLP-Manual-Weights. [sent-234, score-0.69]
74 For MLNs, learned weights show a small improvement initially only with respect to adjusted precision. [sent-235, score-0.251]
75 For BLPs, as n increases towards including all of the logically sanctioned inferences, as expected, the precision converges to the results for logical deduction. [sent-239, score-0.247]
76 However, as n decreases, both adjusted and unadjusted precision increase fairly steadily. [sent-240, score-0.304]
77 This demonstrates that probabilistic BLP inference provides a clear improvement over logical deduction, allowing the system to accurately select the best inferences that are most likely to be correct. [sent-241, score-0.533]
78 2 Results for Individual Target Relations Table 3 shows the adjusted precision for each relation for instances inferred using logical deduction, BLP-Manual-Weights, and BLP-Learned-Weights with a confidence threshold of 0. [sent-245, score-0.48]
79 The probabilities estimated for inferences by MLNs are not directly comparable to those estimated by BLPs. [sent-247, score-0.309]
80 For this evaluation, using a confidence-threshold-based cutoff is more appropriate than using the top-n inferences made by the BLP models, since the estimated probabilities can be directly compared across target relations. [sent-249, score-0.374]
81 Unlike relations like hasMember that are easily inferred from relations like employs and isLedBy, certain relations like hasBirthPlace are not easily inferable using the information in the ontology. [sent-251, score-0.325]
82 As a result, it might not be possible to learn accurate rules for such target relations. [sent-252, score-0.237]
83 However, the actual number of inferences can be fairly low. [sent-255, score-0.336]
84 For instance, 103 instances of hasMemberHumanAgent are inferred by logical deduction, but few of them reach the 0.95 confidence threshold, indicating that the parameters learned for the corresponding rules are not very high. [sent-256, score-0.384] [sent-259, score-0.319]
86 For several relations like hasMember, hasMemberPerson, and employs, no instances were inferred by BLP-Learned-Weights at the 0.95 confidence threshold. [sent-260, score-0.203]
87 Probabilistic reasoning used in BLPs allows for a principled way of determining the most confident inferences, thereby allowing for improved precision over purely logical deduction. [sent-267, score-0.238]
88 In BLPs, only propositions that can be logically deduced from the extracted evidence are included in the ground network. [sent-269, score-0.186]
89 On the other hand, MLNs include all possible type-consistent groundings of all rules in the network, introducing many ground literals which cannot be logically deduced from the evidence. [sent-270, score-0.342]
90 Even though the learned weights in BLPs do not yield superior performance, the learned weights in MLNs are substantially worse. [sent-272, score-0.318]
91 Due to the MLN’s grounding process, several spurious facts such as employs(a,a) were inferred. [sent-278, score-0.282]
92 These inferences can be prevented by including additional clauses in the MLN that impose integrity constraints ruling out such nonsensical propositions. [sent-279, score-0.37]
93 7 Future Work A primary goal for future research is developing an on-line structure learner for BLPs that can directly learn probabilistic first-order rules from uncertain training data. [sent-286, score-0.285]
94 This will address important limitations of LIME, which cannot accept uncertainty in the extractions used for training, is not specifically optimized for learning rules for BLPs, and does not scale well to large datasets. [sent-287, score-0.21]
95 Given the relatively poor performance of BLP parameters learned using EM, tests on larger training corpora of extracted facts and the development of improved parameter-learning algorithms are clearly indicated. [sent-288, score-0.441]
96 We also plan to perform a larger-scale evaluation by employing crowdsourcing to evaluate inferred facts for a bigger corpus of test documents. [sent-289, score-0.319]
97 8 Conclusions We have introduced a novel approach using Bayesian Logic Programs to learn to infer implicit information from facts extracted from natural language text. [sent-292, score-0.512]
98 We have demonstrated that it can learn effective rules from a large database of noisy extractions. [sent-293, score-0.21]
99 Our experimental evaluation on the IC data set demonstrates the advantage of BLPs over logical deduction and an approach based on MLNs. [sent-294, score-0.249]
100 Inverting Grice’s maxims to learn rules from natural language extractions. [sent-435, score-0.21]
wordName wordTfidf (topN-words)
[('blps', 0.463), ('mlns', 0.37), ('inferences', 0.309), ('blp', 0.262), ('facts', 0.243), ('rules', 0.141), ('logical', 0.138), ('logic', 0.116), ('deduction', 0.111), ('mln', 0.111), ('kersting', 0.108), ('sorower', 0.108), ('unadjusted', 0.108), ('lime', 0.108), ('adjusted', 0.106), ('bayesian', 0.105), ('learned', 0.101), ('doppa', 0.093), ('sire', 0.093), ('schoenmackers', 0.086), ('ground', 0.082), ('infer', 0.08), ('hasmemberhumanagent', 0.077), ('raedt', 0.077), ('inferred', 0.076), ('ilp', 0.072), ('extractions', 0.069), ('learn', 0.069), ('relations', 0.068), ('ic', 0.066), ('carlson', 0.066), ('programs', 0.064), ('ie', 0.064), ('precision', 0.063), ('stated', 0.063), ('implicit', 0.062), ('hasmember', 0.062), ('nahm', 0.062), ('clauses', 0.061), ('instances', 0.059), ('extracted', 0.058), ('commonsense', 0.051), ('cpt', 0.046), ('literals', 0.046), ('logically', 0.046), ('lowd', 0.046), ('employs', 0.045), ('weights', 0.044), ('proofs', 0.043), ('probabilistic', 0.043), ('inference', 0.043), ('networks', 0.043), ('domingos', 0.039), ('grounding', 0.039), ('parameters', 0.039), ('confidence', 0.038), ('reading', 0.037), ('florian', 0.037), ('ua', 0.037), ('mohammad', 0.037), ('horn', 0.037), ('deductive', 0.037), ('purely', 0.037), ('mooney', 0.036), ('getoor', 0.032), ('firstorder', 0.032), ('berant', 0.032), ('austin', 0.032), ('developing', 0.032), ('incorrect', 0.032), ('net', 0.031), ('srl', 0.031), ('aleph', 0.031), ('cowie', 0.031), ('cpts', 0.031), ('gogate', 0.031), ('intractably', 0.031), ('janardhan', 0.031), ('mccreath', 0.031), ('nijssen', 0.031), ('sld', 0.031), ('xiaoli', 0.031), ('country', 0.03), ('ontology', 0.03), ('lack', 0.029), ('entailment', 0.029), ('body', 0.029), ('explicitly', 0.028), ('superior', 0.028), ('marginal', 0.027), ('query', 0.027), ('singla', 0.027), ('srinivasan', 0.027), ('groundings', 0.027), ('citizen', 0.027), ('lavra', 0.027), ('fairly', 0.027), ('head', 0.027), ('target', 0.027), ('darpa', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999923 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
2 0.10030111 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
Author: Yafang Wang ; Maximilian Dylla ; Marc Spaniol ; Gerhard Weikum
Abstract: The Web and digitized text sources contain a wealth of information about named entities such as politicians, actors, companies, or cultural landmarks. Extracting this information has enabled the automated construction oflarge knowledge bases, containing hundred millions of binary relationships or attribute values about these named entities. However, in reality most knowledge is transient, i.e. changes over time, requiring a temporal dimension in fact extraction. In this paper we develop a methodology that combines label propagation with constraint reasoning for temporal fact extraction. Label propagation aggressively gathers fact candidates, and an Integer Linear Program is used to clean out false hypotheses that violate temporal constraints. Our method is able to improve on recall while keeping up with precision, which we demonstrate by experiments with biography-style Wikipedia pages and a large corpus of news articles.
3 0.081744321 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
Author: Naomi Zeichner ; Jonathan Berant ; Ido Dagan
Abstract: The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. However, evaluating such resources has turned out to be a non-trivial task, slowing progress in the field. In this paper, we suggest a framework for evaluating inference-rule resources. Our framework simplifies a previously proposed “instance-based evaluation” method that involved substantial annotator training, making it suitable for crowdsourcing. We show that our method produces a large amount of annotations with high inter-annotator agreement for a low cost at a short period of time, without requiring training expert annotators.
4 0.080555297 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
Author: Enrique Alfonseca ; Katja Filippova ; Jean-Yves Delort ; Guillermo Garrido
Abstract: We describe the use of a hierarchical topic model for automatically identifying syntactic and lexical patterns that explicitly state ontological relations. We leverage distant supervision using relations from the knowledge base FreeBase, but do not require any manual heuristic nor manual seed list selections. Results show that the learned patterns can be used to extract new relations with good precision.
5 0.078877702 176 acl-2012-Sentence Compression with Semantic Role Constraints
Author: Katsumasa Yoshikawa ; Ryu Iida ; Tsutomu Hirao ; Manabu Okumura
Abstract: For sentence compression, we propose new semantic constraints to directly capture the relations between a predicate and its arguments, whereas the existing approaches have focused on relatively shallow linguistic properties, such as lexical and syntactic information. These constraints are based on semantic roles and superior to the constraints of syntactic dependencies. Our empirical evaluation on the Written News Compression Corpus (Clarke and Lapata, 2008) demonstrates that our system achieves results comparable to other state-of-the-art techniques.
6 0.070282482 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
7 0.057392649 191 acl-2012-Temporally Anchored Relation Extraction
8 0.056614336 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
9 0.054890063 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
10 0.052518226 73 acl-2012-Discriminative Learning for Joint Template Filling
11 0.051340804 157 acl-2012-PDTB-style Discourse Annotation of Chinese Text
13 0.049981955 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
14 0.0497691 78 acl-2012-Efficient Search for Transformation-based Inference
15 0.048559114 51 acl-2012-Collective Generation of Natural Image Descriptions
16 0.047745418 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
17 0.047457151 142 acl-2012-Mining Entity Types from Query Logs via User Intent Modeling
18 0.045508802 56 acl-2012-Computational Approaches to Sentence Completion
19 0.045443065 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
20 0.045110993 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
topicId topicWeight
[(0, -0.144), (1, 0.057), (2, -0.032), (3, 0.07), (4, 0.006), (5, 0.025), (6, -0.058), (7, 0.091), (8, 0.017), (9, 0.01), (10, -0.014), (11, 0.025), (12, -0.03), (13, -0.055), (14, 0.0), (15, -0.038), (16, -0.012), (17, -0.05), (18, 0.05), (19, 0.021), (20, 0.037), (21, 0.003), (22, -0.021), (23, 0.043), (24, 0.036), (25, 0.018), (26, 0.043), (27, -0.061), (28, -0.063), (29, -0.072), (30, -0.069), (31, 0.02), (32, 0.004), (33, 0.024), (34, -0.027), (35, 0.008), (36, 0.067), (37, -0.102), (38, -0.018), (39, 0.083), (40, 0.081), (41, 0.079), (42, 0.03), (43, -0.042), (44, 0.03), (45, -0.007), (46, 0.038), (47, -0.063), (48, -0.128), (49, -0.012)]
simIndex simValue paperId paperTitle
same-paper 1 0.90645701 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
2 0.61928809 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
3 0.57393134 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
4 0.55627394 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
5 0.49472865 176 acl-2012-Sentence Compression with Semantic Role Constraints
6 0.47485137 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
7 0.44470352 57 acl-2012-Concept-to-text Generation via Discriminative Reranking
8 0.4300499 42 acl-2012-Bootstrapping via Graph Propagation
9 0.42725599 129 acl-2012-Learning High-Level Planning from Text
10 0.42240041 191 acl-2012-Temporally Anchored Relation Extraction
11 0.41680992 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
12 0.40887314 23 acl-2012-A Two-step Approach to Sentence Compression of Spoken Utterances
13 0.40231052 159 acl-2012-Pattern Learning for Relation Extraction with a Hierarchical Topic Model
14 0.39277688 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
15 0.38898697 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
16 0.38862535 51 acl-2012-Collective Generation of Natural Image Descriptions
17 0.38634208 34 acl-2012-Automatically Learning Measures of Child Language Development
18 0.38022903 44 acl-2012-CSNIPER - Annotation-by-query for Non-canonical Constructions in Large Corpora
19 0.37509289 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
20 0.36531222 123 acl-2012-Joint Feature Selection in Distributed Stochastic Learning for Large-Scale Discriminative Training in SMT
topicId topicWeight
[(25, 0.022), (26, 0.021), (28, 0.024), (30, 0.032), (37, 0.017), (39, 0.503), (59, 0.01), (74, 0.028), (82, 0.019), (84, 0.016), (85, 0.017), (90, 0.086), (92, 0.058), (94, 0.026), (99, 0.037)]
simIndex simValue paperId paperTitle
1 0.96591407 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System
Author: Wen-Tai Hsieh ; Chen-Ming Wu ; Tsun Ku ; Seng-cho T. Chou
Abstract: Social Event Radar is a new social-networking-based service platform that aims to alert on and monitor merchandise flaws, food-safety issues, unexpected disease outbreaks, or campaign issues directed at governments, enterprises of any kind, or election parties. Through a keyword-expansion detection module and a bilingual sentiment and opinion analysis toolkit, it produces an event-specific social dashboard whose output helps authorities plan risk-control strategies. With the rapid development of social networks, people can now easily publish their opinions on the Internet; conversely, they can obtain various opinions from others within seconds even if they do not know each other. A typical approach to obtaining the required information is to use a search engine with relevant keywords. In this work, we therefore take social media and forums as our major data sources and aim to collect specific issues efficiently and effectively.
2 0.91757178 186 acl-2012-Structuring E-Commerce Inventory
Author: Karin Mauge ; Khash Rohanimanesh ; Jean-David Ruvini
Abstract: Large e-commerce enterprises feature millions of items entered daily by a large variety of sellers. While some sellers provide rich, structured descriptions of their items, the vast majority provide unstructured natural language descriptions. In this paper we present a two-step method for structuring items into descriptive properties. The first step consists of unsupervised property discovery and extraction. The second step involves supervised property synonym discovery using a maximum-entropy-based clustering algorithm. We evaluate our method on a year's worth of e-commerce data and show that it achieves excellent precision with good recall.
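The two-step pipeline in this abstract can be sketched at toy scale: discover "property: value" pairs from unstructured descriptions, then merge property names that share values. The item descriptions and the regex are hypothetical, and the crude shared-value test stands in for the paper's maximum-entropy clustering.

```python
import re
from collections import defaultdict

# Hypothetical unstructured item descriptions from different sellers.
descriptions = [
    "Brand: Acme. Color: red. Screen size: 13 inch.",
    "brand new Acme laptop, colour: red, screen size: 13 inch",
]

# Step 1: unsupervised property discovery via "name: value" patterns.
pattern = re.compile(r"([A-Za-z ]+?):\s*([\w ]+?)(?:[.,]|$)")
props = defaultdict(set)
for d in descriptions:
    for name, value in pattern.findall(d):
        props[name.strip().lower()].add(value.strip())

# Step 2: merge property-name synonyms; here a naive shared-value test
# replaces the supervised maximum-entropy clustering.
def merged(a, b):
    return bool(props[a] & props[b])

print(merged("color", "colour"))  # True: both names take the value "red"
```

In the real setting, shared values alone over-merge (e.g. "width" and "height" both take "13 inch"), which is exactly why the second step is learned rather than rule-based.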
3 0.90717036 7 acl-2012-A Computational Approach to the Automation of Creative Naming
Author: Gozde Ozbal ; Carlo Strapparava
Abstract: In this paper, we propose a computational approach to generate neologisms consisting of homophonic puns and metaphors based on the category of the service to be named and the properties to be underlined. We describe all the linguistic resources and natural language processing techniques that we have exploited for this task. Then, we analyze the performance of the system that we have developed. The empirical results show that our approach is generally effective and constitutes a solid starting point for the automation of the naming process.
4 0.8936612 79 acl-2012-Efficient Tree-Based Topic Modeling
Author: Yuening Hu ; Jordan Boyd-Graber
Abstract: Topic modeling with a tree-based prior has been used for a variety of applications because it can encode correlations between words that traditional topic modeling cannot. However, its expressive power comes at the cost of more complicated inference. We extend the SPARSELDA (Yao et al., 2009) inference scheme for latent Dirichlet allocation (LDA) to tree-based topic models. This sampling scheme computes the exact conditional distribution for Gibbs sampling much more quickly than enumerating all possible latent variable assignments. We further improve performance by iteratively refining the sampling distribution only when needed. Experiments show that the proposed techniques dramatically improve the computation time.
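The conditional distribution that this abstract speeds up can be written out for vanilla LDA: a collapsed Gibbs sampler scores each topic k as (n_dk + α)(n_kw + β)/(n_k + Vβ). The sketch below evaluates that expression with made-up count tables; SPARSELDA's contribution is decomposing this same quantity into sparse buckets so most terms need not be enumerated, which the sketch does not reproduce.

```python
K, V = 3, 5            # number of topics, vocabulary size
alpha, beta = 0.1, 0.01

# Hypothetical sufficient statistics from a collapsed Gibbs sampler.
n_dk = [2, 0, 1]       # topic counts in the current document
n_kw = [[4, 0, 1, 0, 0],   # per-topic word counts
        [0, 3, 0, 0, 0],
        [1, 0, 0, 2, 0]]
n_k = [sum(row) for row in n_kw]

def conditional(w):
    """Unnormalized p(z = k | rest) for word id w, for each topic k."""
    return [(n_dk[k] + alpha) * (n_kw[k][w] + beta) / (n_k[k] + V * beta)
            for k in range(K)]

weights = conditional(0)           # score word id 0 under each topic
total = sum(weights)
probs = [w / total for w in weights]
print(max(range(K), key=probs.__getitem__))  # most probable topic: 0
```

Naively this costs O(K) per token; the sparse decomposition exploits the fact that n_dk and n_kw are mostly zero, so only a few topics contribute beyond the smoothing term.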
same-paper 5 0.89089471 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
Author: Sindhu Raghavan ; Raymond Mooney ; Hyeonseo Ku
Abstract: Most information extraction (IE) systems identify facts that are explicitly stated in text. However, in natural language, some facts are implicit, and identifying them requires "reading between the lines". Human readers naturally use common sense knowledge to infer such implicit information from the explicitly stated facts. We propose an approach that uses Bayesian Logic Programs (BLPs), a statistical relational model combining first-order logic and Bayesian networks, to infer additional implicit information from extracted facts. It involves learning uncertain commonsense knowledge (in the form of probabilistic first-order rules) from natural language text by mining a large corpus of automatically extracted facts. These rules are then used to derive additional facts from extracted information using BLP inference. Experimental evaluation on a benchmark data set for machine reading demonstrates the efficacy of our approach.
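The inference step described here can be illustrated with a minimal propositional sketch: forward-chain probabilistic rules over extracted facts and combine multiple derivations of the same conclusion with noisy-or. The facts, rules, and weights below are invented for illustration; real BLPs ground first-order rules into a Bayesian network rather than operating on fixed ground strings.

```python
import math

# Explicitly extracted facts (hypothetical).
facts = {"employs(IBM, Smith)", "livesIn(Smith, NY)"}

# Probabilistic rules: (weight, body, head). Weights are made up.
rules = [
    (0.8, ["employs(IBM, Smith)"], "worksIn(Smith, USA)"),
    (0.6, ["livesIn(Smith, NY)"],  "worksIn(Smith, USA)"),
]

# Forward chaining: collect the weight of every rule whose body holds.
derived = {}
for weight, body, head in rules:
    if all(b in facts for b in body):
        derived.setdefault(head, []).append(weight)

# Noisy-or combining: P(head) = 1 - prod(1 - w_i) over rules that fired.
probs = {h: 1 - math.prod(1 - w for w in ws) for h, ws in derived.items()}
print(probs)  # worksIn(Smith, USA) inferred with probability ~0.92
```

Two independent lines of evidence thus yield a higher belief in the implicit fact than either rule alone, which is the behavior noisy-or is chosen for.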
6 0.58628261 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
7 0.54006416 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
8 0.52087772 161 acl-2012-Polarity Consistency Checking for Sentiment Dictionaries
10 0.51201719 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
11 0.51157284 6 acl-2012-A Comprehensive Gold Standard for the Enron Organizational Hierarchy
13 0.5056659 151 acl-2012-Multilingual Subjectivity and Sentiment Analysis
14 0.50078291 88 acl-2012-Exploiting Social Information in Grounded Language Learning via Grammatical Reduction
15 0.49139598 138 acl-2012-LetsMT!: Cloud-Based Platform for Do-It-Yourself Machine Translation
16 0.48750088 187 acl-2012-Subgroup Detection in Ideological Discussions
17 0.48639083 61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons
18 0.48276934 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
19 0.481644 70 acl-2012-Demonstration of IlluMe: Creating Ambient According to Instant Message Logs
20 0.47908095 100 acl-2012-Fine Granular Aspect Analysis using Latent Structural Models