acl acl2012 acl2012-80 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
Reference: text
sentIndex sentText sentNum sentScore
1 i l adlerm@ cs Abstract Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. [sent-7, score-0.394]
2 In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). [sent-8, score-1.151]
3 We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. [sent-9, score-0.833]
4 We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. [sent-10, score-0.468]
5 We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm. [sent-11, score-0.32]
6 Semantic inference applications such as QA and IE crucially rely on entailment rules (Ravichandran . [sent-17, score-0.394]
7 An important type of entailment rule specifies the entailment relation between natural language predicates, e. [sent-21, score-0.694]
8 , the entailment rule ‘X invade Y → X attack Y’ can be helpful in inferring tihnvea adfeor Yem →en Xtio antetadc hypothesis. [sent-23, score-0.429]
9 Textual entailment is inherently a transitive relation , that is, the rules ‘x → y’ and ‘y → z’ imply ttihoen nr ,ul teh a‘xt → t zh’. [sent-26, score-0.524]
10 en (t2 r0u1l0es) as a graph optimization problem, where nodes are predicates and edges represent entailment rules that respect transitivity. [sent-29, score-1.067]
11 Since finding the optimal set of edges respecting transitivity is NP-hard, they employed Integer Linear Programming (ILP) to find the exact solution. [sent-30, score-0.4]
12 Indeed, they showed that applying global transitivity constraints improves rule learning comparing to methods that ignore graph structure. [sent-31, score-0.526]
13 , 2011) introduced a more efficient exact algorithm, which decomposes the graph into connected components and then applies an ILP solver over each component. [sent-34, score-0.453]
14 Despite this progress, finding the exact solution remains NP-hard the authors themselves report they were unable to solve some graphs of rather moderate size and that the coverage of their method is limited. [sent-35, score-0.2]
15 c so2c0ia1t2io Ans fso rc Ciatoiomnp fuotart Cio nmaplu Ltiantgiounisatlic Lsi,n pgaugiestsi1c 1s7–125, In this paper we present a novel method for learning the edges of entailment graphs. [sent-42, score-0.512]
16 To that end, we first (Section 3) conjecture and empirically show that entailment graphs exhibit a “tree-like” property, i. [sent-44, score-0.549]
17 Then, we present in Section 4 our iterative approximation algorithm, where in each iteration a node is removed and re-attached back to the graph in a locally-optimal way. [sent-47, score-0.444]
18 Combining this scheme with our conjecture about the graph structure enables a linear algorithm for node re-attachment. [sent-48, score-0.431]
19 Section 5 shows empirically that this algorithm is by orders of magnitude faster than the state-of-the-art exact algorithm, and that though an optimal solution is not guaranteed, the area under the precision-recall curve drops by merely a point. [sent-49, score-0.199]
20 Second, we exploit this assumption to develop a polynomial approximation algorithm for learning entailment graphs that can scale to much larger graphs than in the past. [sent-51, score-0.762]
21 Finally, we note that learning entailment graphs bears strong similarities to related tasks such as Taxonomy Induction (Snow et al. [sent-52, score-0.504]
22 2 Background Until recently, work on learning entailment rules between predicates considered each rule independently of others and did not exploit global dependencies. [sent-54, score-0.517]
23 (2010; 2011) formulated the problem as the problem of learning global entailment graphs. [sent-61, score-0.347]
24 , ‘X attack Y’) and edges represent entailment rules between them ( ‘X invade Y → X attack Y’). [sent-64, score-0.674]
25 tF rourl every pair no tfh predicates i,j, an →en Xtai altmtaenckt score wij was learned by training a classifier over distributional similarity features. [sent-65, score-0.253]
26 A positive wij indicated that the classifier believes i→ j and a negatdivicea wij ihnadti tchaete cdla tshsaitfi tehre b cellaiesvsiesfie ir →bel jie avneds ai 9 j. [sent-66, score-0.288]
27 Given the graph nodes V (corresponding to the predicates) and the weighting function w : V V → R, they sa)i man dto t hfien dw ethigeh edges nofc a graph VG = (V, E) tthheaty m aaimxim toiz fien dthe th objective P(i,j)∈Ewij =und (eVr, Ethe) constraint that the graph is trPan(sii,tj)i∈veE (i. [sent-67, score-1.226]
28 Let xij be a binary variable indicating the existence of an edge i → j in E. [sent-72, score-0.213]
29 ∀i,j,k∈V xij xjk − xik ∀i,j∈V xij ∈ {0, 1} (1) ≤ 1 The objective function is the sum of weights over the edges of G and the constraint xij + xjk − xik 1 on gtehse binary vdar tihaeb cleosn esntrfaoirnctes x that when−ev xer xij = xjk = 1, then also xik = 1(transitivity). [sent-76, score-1.237]
30 Since ILP is NP-hard, applying an ILP solver directly does not scale well because the number of variables is O(| V |2) and the number of constraints is O( |aVb |l3e)s. [sent-77, score-0.183]
31 Thus, even a graph with ∼80 nodes (predicates) h)a. [sent-78, score-0.385]
32 , 2011), they proposed a method that efficiently decomposes the graph into smaller components and applies an ILP solver on each component separately using a cutting-plane ≤ procedure (Riedel and Clarke, 2006). [sent-81, score-0.41]
33 When the graph does not decompose into sufficiently small components, and the weights generate many violations of transitivity, solving Max-Trans-Graph becomes intractable. [sent-83, score-0.372]
34 To address this problem, we present in this paper a method for approximating the optimal set of edges within each component and show that it is much more efficient and scalable both theoretically and empirically. [sent-84, score-0.305]
35 Given a pair of terms, a small graph is constructed and constraints are imposed on the graph structure. [sent-86, score-0.651]
36 Another approximation method that violates transitivity constraints is LP relaxation (Martins et al. [sent-89, score-0.314]
37 In LP relaxation, the constraint xij ∈ {0, 1} is replaced by 0 ≤ xij ≤ 1, transforming th∈e probliesm re pfrlaomce an yIL 0P ≤ ≤to x a L≤in 1ea,r t Program (LP), w prhoicbhis polynomial. [sent-91, score-0.356]
38 An LP solver is then applied on the problem, and variables xij that are assigned a fractional value are rounded to their nearest integer and so many violations of transitivity easily occur. [sent-92, score-0.481]
39 The solution when applying LP relaxation is not a transitive graph, but nevertheless we show for comparison in Section 5 that our method is much faster. [sent-93, score-0.195]
40 3 Forest-reducible Graphs The entailment relation, described by entailment graphs, is typically from a “semantically-specific” predicate to a more “general” one. [sent-95, score-0.722]
41 Thus, intuitively, the topology of an entailment graph is expected to be “tree-like”. [sent-96, score-0.646]
42 This property of entailment graphs is an interesting topological observation on its own, but also enables the efficient approximation algorithm of Section 4. [sent-98, score-0.639]
43 For a directed edge i → j in a directed acyclic graphs (DAG), we tegrem i t →he njo idne ai a rcehcitledd o afc yncoldice j, and j a parent of i. [sent-99, score-0.488]
44 A directed forest is a DAG 119 (a) Xdisease (b) cefore mpqiudme omn ti cin ino c urinbegin (c)becfroe mpqiudme onmnti cin i noc urinbegin Figure 1: A fragment of an entailment graph (a), its SCC graph (b) and its reduced graph (c). [sent-100, score-1.646]
45 Nodes are predicates with typed variables (see Section 5), which are omitted in (b) and (c) for compactness. [sent-101, score-0.192]
46 The entailment graph in Figure 1a (subgraph from the data set described in Section 5) is clearly not a directed forest it contains a cycle of size two comprising the nodes ‘X common in Y’ and ‘X frequent in Y’, and in addition the node ‘X be epidemic in Y’ has 3 parents. [sent-103, score-1.01]
47 However, we can convert it to a directed forest by applying the following operations. [sent-104, score-0.184]
48 T edheg Se iCnC G graph is always a DAG (Cormen et al. [sent-106, score-0.299]
49 , 2002), and if G is itrsan alswitaivyes tah eDnA thGe SCCorCm graph aisl ,a 2lso00 t2ra),ns anitdive if. [sent-107, score-0.299]
50 TGh ies graph in Figure 1b is the SCC graph of the one in Xcountry invadeX cYopulnatcrey an exY p lYapcela cbe part of Xcountry Figure 2: A fragment of an entailment graph that is not an FRG. [sent-108, score-1.244]
51 Figure 1a, but is still not a directed forest since the node ‘X be epidemic in Y’ has two parents. [sent-109, score-0.278]
52 The transitive closure of a directed graph G is obTtahineed tr by adding an edge f aro dmir encotdede ig rtaop nhod Ge j if there is a path in G from ito j. [sent-110, score-0.596]
53 The transitive irefd tuhecrtieon is o af pGa ihs nob Gtai fnreodm by removing atrlal edges rwedhousceti aonbs eonfc Ge d isoe os bntaotin aefdfe cbty yit rse mtraonvsiintgive a lcl o esdugrees. [sent-111, score-0.295]
54 We thus define the reduced graph Gred = (Vred, Ered) of a directed graph G as the tGransitive reduction o)f o oitfs aS dCiCre graph. [sent-114, score-0.797]
55 aTphhe graph hine Figure 1c is the reduced graph of the one in Figure 1a and is a directed forest. [sent-115, score-0.765]
56 We say a graph is a forest-reducible graph (FRG) if all nodes in its reduced form have no more than one parent. [sent-116, score-0.74]
57 The intuition behind this assumption is that the predicate on the left-hand-side of a unidirectional entailment rule has a more specific meaning than the one on the right-hand-side. [sent-118, score-0.375]
58 Accordingly, the reduced graph in Figure 1c is an FRG. [sent-120, score-0.355]
59 We note that this is not always the case: for example, the entailment graph in Figure 2 is not an FRG, because ‘X annex Y’ entails both ‘Y be part of X’ and ‘X invade Y’, while the latter two do not entail one another. [sent-121, score-0.695]
60 Consequently, a natural variant of the Max-Trans-Graph problem is to restrict the required output graph of the optimiza- tion problem (1) to an FRG. [sent-123, score-0.299]
61 We sampled 7 gold standard entailment graphs from the data set 120 described in Section 5, manually transformed them into FRGs by deleting a minimal number of edges, and measured recall over the set of edges in each graph (precision is naturally 1. [sent-126, score-0.968]
62 95, illustrating that deleting a very small proportion of edges converts an entailment graph into an FRG. [sent-129, score-0.811]
63 An ILP formulation for Max-Trans-Forest is simple a transitive graph is an FRG if all nodes in its reduced graph have no more than one parent. [sent-132, score-0.87]
64 ear constraint to ILP (1): ∀i,j,k∈V xij +xik + (1 − xjk) + (1 − xkj) ≤3 (2) We note that despite the restriction to FRGs, MaxTrans-Forest is an NP-hard problem by a reduction from the X3C problem (Garey and Johnson, 1979). [sent-135, score-0.231]
65 4 Sequential Approximation Algorithms In this section we present Tree-Node-Fix, an efficient approximation algorithm for Max-Trans-Forest, as well as Graph-Node-Fix, an approximation for MaxTrans-Graph. [sent-137, score-0.209]
66 Re-attaching a node v is performed by removing v from the graph and connecting it back with a better set of edges, while maintaining the constraint that it is an FRG. [sent-143, score-0.412]
67 This is done by considering all possible edges from/to the other graph nodes and choosing (a)c v (bd)1 v dc2 … (b’)…cd v … ( cr)1 … v r…2 r…3 Figure 3: (a) Inserting v into a component c ∈ Vred. [sent-144, score-0.584]
68 the optimal subset, while the rest of the graph remains fixed. [sent-148, score-0.343]
69 Formally, let Sv−in = Pi6=v wiv · xiv be the sumP of scores over v’s incomPing edges a xnd Sv−out = Pk6=v wvk · xvk be the sum Pof scores over v’s outgoinPg edges. [sent-149, score-0.295]
70 Clearly, at each re-attachment the value of the objective function cannot decrease, since the optimization algorithm considers the previous graph as one of its candidate solutions. [sent-152, score-0.362]
71 A node tchhailtd dirse nno itn a descendant of c can not become a child of v, since this would create a new path from that node to c and would require by transitivity to add a corresponding directed edge to c (but all graph edges not connecting v are fixed). [sent-167, score-1.005]
72 Thus, this case requires adding in G edges from v to all nodes in c and tqhueiirre ancestors, na Gnd e adlgseos f forro mea vch t new nchodildes so ifn v, adnednoted by d ∈ Vred, we add edges from all nodes in ndo oatendd tbhyei dr d ∈es Vcendants to v. [sent-169, score-0.502]
73 , 2002) and it is easy to verify that transitive reduction in FRGs is also linear (Line 1). [sent-189, score-0.196]
74 As the reduced graph is a forest, this simply means going over all nodes of Vred, and so the entire algorithm is linear. [sent-192, score-0.5]
75 In Section 5 we ran TNF until convergence and the maximal number of iterations over graph nodes was 8. [sent-196, score-0.438]
76 2 Graph-node-fix Next, we show Graph-Node-Fix (GNF), a similar approximation that employs the same re-attachment strategy but does not assume the graph is an FRG. [sent-198, score-0.373]
77 Nevertheless, the ILP in GNF is simpler than (1), since we consider only candidate edges 122 i vv kk i v k i v k Figure 4: Three types of transitivity constraint violations. [sent-200, score-0.355]
78 Figure 4 illustrates the three types of possible transitivity constraint violations when reattaching v. [sent-202, score-0.278]
79 3 Adding local constraints For some pairs of predicates i,j we sometimes have prior knowledge whether ientails j or not. [sent-206, score-0.209]
80 In all algorithms that apply an ILP solver, we add a constraint xij = 1if ientails j or xij = 0 if i does not entail j. [sent-208, score-0.389]
81 Similarly, in TNF we incorporate local constraints by setting wij = ∞ or wij = −∞. [sent-209, score-0.313]
82 The data set contains 10 entailment graphs, where graph nodes are typed predicates. [sent-214, score-0.774]
83 The data set contains 39,012 potential edges, of which 3,427 are annotated as edges (valid entailment rules) and 35,585 are annotated as non-edges. [sent-219, score-0.512]
84 The weighting function for tchlaes graph edges w i s → →de jfi. [sent-222, score-0.464]
85 n Tedh as wij = sij −λ, wonh feroerces is a single parameter controlling graph sparseness: as increases, wij decreases and becomes negative for more pairs of predicates, rendering the graph more sparse. [sent-223, score-0.901]
86 We implemented the following algorithms for λ λ learning graph edges, where in all of them the graph is first decomposed into components according to Berant et al’s method, as explained in Section 2. [sent-226, score-0.598]
87 No-trans Local scores are used without transitivity constraints an edge (i, j) is inserted iff wij > 0. [sent-227, score-0.415]
88 LP-relax Solving Max-Trans-Graph approximately by applying LP-relaxation (see Section 2) on each graph component. [sent-232, score-0.325]
89 Graph-Node-Fix (GNF) Initialization of each component is performed in the following way: if the graph is very sparse, i. [sent-237, score-0.333]
90 λ ≥ C for some constant C (set tho 1s vine our experiments), Cthe fno solving otnhes graph – exactly is not an issue and we use Exact-graph. [sent-239, score-0.333]
91 Tree-Node-Fix (TNF) Initialization is done as in GNF, except that if it generates a graph that is not an FRG, it is corrected by a simple heuristic: for every node in the reduced graph Gred that has more than 1We use the Gurobi optimization package in all experiments. [sent-243, score-0.725]
92 Wmbee erv oaflu naotdee algorithms by comparing the set of gold standard edges with the set of edges learned by each algorithm. [sent-246, score-0.33]
93 Figure 5 compares run-times2 of Exact-graph, GNF, TNF, and LP-relax as −λ increases and the graph TbeNcFo,m aensd d LePns-reer. [sent-251, score-0.299]
94 We observe that both Exact-graph and TNF substantially outperform No-trans and that TNF’s graph quality is only slightly lower than Exact-graph (which is extremely slow). [sent-282, score-0.299]
95 43), illustrating that assuming that entailment graphs are FRGs (Section 3) is reasonable in this data set. [sent-294, score-0.504]
96 To conclude, TNF learns transitive entailment graphs of good quality much faster than Exactgraph. [sent-295, score-0.634]
97 6 Conclusion Learning large and accurate resources of entailment rules is essential in many semantic inference applications. [sent-297, score-0.394]
98 The first contribution of this paper is a novel modeling assumption that entailment graphs are very similar to FRGs, which is analyzed and validated empirically. [sent-299, score-0.504]
99 The main contribution of the paper is an efficient polynomial approximation algorithm for learning entailment rules, which is based on this assumption. [sent-300, score-0.482]
100 We suggest our method as an important step towards scalable acquisition of precise entailment resources. [sent-302, score-0.375]
wordName wordTfidf (topN-words)
[('tnf', 0.423), ('entailment', 0.347), ('graph', 0.299), ('vred', 0.228), ('ilp', 0.185), ('gred', 0.179), ('edges', 0.165), ('graphs', 0.157), ('xij', 0.157), ('transitivity', 0.148), ('gnf', 0.146), ('wij', 0.13), ('transitive', 0.13), ('berant', 0.125), ('predicates', 0.123), ('frg', 0.114), ('scc', 0.114), ('directed', 0.111), ('nodes', 0.086), ('frgs', 0.081), ('dagan', 0.081), ('szpektor', 0.08), ('solver', 0.077), ('approximation', 0.074), ('auc', 0.072), ('ido', 0.072), ('node', 0.071), ('lp', 0.068), ('inserting', 0.065), ('xiv', 0.065), ('xjk', 0.065), ('xvk', 0.065), ('xik', 0.057), ('reduced', 0.056), ('edge', 0.056), ('parent', 0.053), ('constraints', 0.053), ('maximal', 0.053), ('cormen', 0.049), ('epidemic', 0.049), ('garey', 0.049), ('invade', 0.049), ('reattaching', 0.049), ('xdisease', 0.049), ('idan', 0.048), ('forest', 0.047), ('rules', 0.047), ('empirically', 0.045), ('optimal', 0.044), ('exact', 0.043), ('descendant', 0.043), ('sij', 0.043), ('typed', 0.042), ('constraint', 0.042), ('child', 0.041), ('curve', 0.04), ('violations', 0.039), ('relaxation', 0.039), ('schoenmackers', 0.039), ('dag', 0.039), ('children', 0.037), ('objective', 0.036), ('component', 0.034), ('linear', 0.034), ('solving', 0.034), ('efficient', 0.034), ('integer', 0.033), ('attack', 0.033), ('aharon', 0.033), ('cxhild', 0.033), ('cyprus', 0.033), ('ientails', 0.033), ('maxc', 0.033), ('mpqiudme', 0.033), ('tue', 0.033), ('urinbegin', 0.033), ('xcountry', 0.033), ('xwiv', 0.033), ('xwvi', 0.033), ('reduction', 0.032), ('going', 0.032), ('poon', 0.031), ('ontology', 0.031), ('termed', 0.03), ('shinyama', 0.028), ('slowly', 0.028), ('believes', 0.028), ('cin', 0.028), ('xmax', 0.028), ('inserted', 0.028), ('scalable', 0.028), ('predicate', 0.028), ('induction', 0.028), ('algorithm', 0.027), ('jacob', 0.027), ('variables', 0.027), ('hypothesis', 0.026), ('ered', 0.026), ('bob', 0.026), ('applying', 0.026)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999934 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
2 0.22460704 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
Author: Meni Adler ; Jonathan Berant ; Ido Dagan
Abstract: We present a novel text exploration model, which extends the scope of state-of-the-art technologies by moving from standard concept-based exploration to statement-based exploration. The proposed scheme utilizes the textual entailment relation between statements as the basis of the exploration process. A user of our system can explore the result space of a query by drilling down/up from one statement to another, according to entailment relations specified by an entailment graph and an optional concept taxonomy. As a prominent use case, we apply our exploration system and illustrate its benefit on the health-care domain. To the best of our knowledge this is the first implementation of an exploration system at the statement level that is based on the textual entailment relation. 1
3 0.18557996 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
Author: Naomi Zeichner ; Jonathan Berant ; Ido Dagan
Abstract: The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. However, evaluating such resources has turned out to be a non-trivial task, slowing progress in the field. In this paper, we suggest a framework for evaluating inference-rule resources. Our framework simplifies a previously proposed “instance-based evaluation” method that involved substantial annotator training, making it suitable for crowdsourcing. We show that our method produces a large amount of annotations with high inter-annotator agreement for a low cost at a short period of time, without requiring training expert annotators.
4 0.15490645 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Author: Yashar Mehdad ; Matteo Negri ; Marcello Federico
Abstract: We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual entailment system, we report promising results on different datasets.
5 0.13302507 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
Author: Asher Stern ; Ido Dagan
Abstract: This paper introduces BIUTEE1 , an opensource system for recognizing textual entailment. Its main advantages are its ability to utilize various types of knowledge resources, and its extensibility by which new knowledge resources and inference components can be easily integrated. These abilities make BIUTEE an appealing RTE system for two research communities: (1) researchers of end applications, that can benefit from generic textual inference, and (2) RTE researchers, who can integrate their novel algorithms and knowledge resources into our system, saving the time and effort of developing a complete RTE system from scratch. Notable assistance for these re- searchers is provided by a visual tracing tool, by which researchers can refine and “debug” their knowledge resources and inference components.
6 0.12382383 78 acl-2012-Efficient Search for Transformation-based Inference
7 0.092643715 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions
8 0.088868968 51 acl-2012-Collective Generation of Natural Image Descriptions
9 0.088522188 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
11 0.081204161 106 acl-2012-Head-driven Transition-based Parsing with Top-down Prediction
12 0.07790906 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
13 0.076509148 42 acl-2012-Bootstrapping via Graph Propagation
14 0.071395412 184 acl-2012-String Re-writing Kernel
15 0.070337214 64 acl-2012-Crosslingual Induction of Semantic Roles
16 0.070282482 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
17 0.069232672 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
18 0.066390462 191 acl-2012-Temporally Anchored Relation Extraction
19 0.060466766 4 acl-2012-A Comparative Study of Target Dependency Structures for Statistical Machine Translation
20 0.060407173 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
topicId topicWeight
[(0, -0.165), (1, 0.045), (2, -0.077), (3, 0.06), (4, -0.007), (5, 0.04), (6, -0.039), (7, 0.255), (8, 0.037), (9, 0.012), (10, -0.084), (11, 0.245), (12, 0.066), (13, -0.215), (14, 0.021), (15, -0.033), (16, 0.044), (17, 0.003), (18, 0.094), (19, 0.054), (20, -0.048), (21, -0.047), (22, -0.021), (23, -0.012), (24, 0.052), (25, 0.04), (26, -0.061), (27, -0.058), (28, -0.059), (29, -0.078), (30, -0.127), (31, -0.002), (32, 0.159), (33, 0.02), (34, -0.072), (35, -0.195), (36, 0.033), (37, -0.012), (38, -0.027), (39, 0.07), (40, 0.072), (41, 0.015), (42, 0.158), (43, 0.104), (44, 0.119), (45, -0.056), (46, 0.054), (47, 0.026), (48, -0.011), (49, -0.109)]
simIndex simValue paperId paperTitle
same-paper 1 0.97025633 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
2 0.84575987 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
Author: Meni Adler ; Jonathan Berant ; Ido Dagan
Abstract: We present a novel text exploration model, which extends the scope of state-of-the-art technologies by moving from standard concept-based exploration to statement-based exploration. The proposed scheme utilizes the textual entailment relation between statements as the basis of the exploration process. A user of our system can explore the result space of a query by drilling down/up from one statement to another, according to entailment relations specified by an entailment graph and an optional concept taxonomy. As a prominent use case, we apply our exploration system and illustrate its benefit on the health-care domain. To the best of our knowledge this is the first implementation of an exploration system at the statement level that is based on the textual entailment relation. 1
3 0.67301166 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Author: Yashar Mehdad ; Matteo Negri ; Marcello Federico
Abstract: We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual entailment system, we report promising results on different datasets.
4 0.64999413 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
Author: Naomi Zeichner ; Jonathan Berant ; Ido Dagan
Abstract: The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. However, evaluating such resources has turned out to be a non-trivial task, slowing progress in the field. In this paper, we suggest a framework for evaluating inference-rule resources. Our framework simplifies a previously proposed “instance-based evaluation” method that involved substantial annotator training, making it suitable for crowdsourcing. We show that our method produces a large amount of annotations with high inter-annotator agreement for a low cost at a short period of time, without requiring training expert annotators.
5 0.50604945 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions
Author: Elena Cabrio ; Serena Villata
Abstract: Blogs and forums are widely adopted by online communities to debate about various issues. However, a user that wants to cut in on a debate may experience some difficulties in extracting the current accepted positions, and can be discouraged from interacting through these applications. In our paper, we combine textual entailment with argumentation theory to automatically extract the arguments from debates and to evaluate their acceptability.
6 0.49009207 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
7 0.47022086 42 acl-2012-Bootstrapping via Graph Propagation
8 0.38509664 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
9 0.31729177 184 acl-2012-String Re-writing Kernel
10 0.31164783 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
11 0.30975521 78 acl-2012-Efficient Search for Transformation-based Inference
12 0.27199513 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
13 0.27055585 12 acl-2012-A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
14 0.2587316 194 acl-2012-Text Segmentation by Language Using Minimum Description Length
15 0.24606995 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
17 0.24128188 34 acl-2012-Automatically Learning Measures of Child Language Development
18 0.23990759 131 acl-2012-Learning Translation Consensus with Structured Label Propagation
19 0.23414107 51 acl-2012-Collective Generation of Natural Image Descriptions
20 0.23069674 191 acl-2012-Temporally Anchored Relation Extraction
topicId topicWeight
[(25, 0.017), (26, 0.021), (28, 0.027), (30, 0.069), (37, 0.049), (39, 0.058), (49, 0.013), (74, 0.024), (82, 0.041), (84, 0.021), (85, 0.051), (87, 0.271), (90, 0.075), (92, 0.09), (94, 0.048), (99, 0.044)]
simIndex simValue paperId paperTitle
same-paper 1 0.76498854 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
Author: Jonathan Berant ; Ido Dagan ; Meni Adler ; Jacob Goldberger
Abstract: Learning entailment rules is fundamental in many semantic-inference applications and has been an active field of research in recent years. In this paper we address the problem of learning transitive graphs that describe entailment rules between predicates (termed entailment graphs). We first identify that entailment graphs exhibit a “tree-like” property and are very similar to a novel type of graph termed forest-reducible graph. We utilize this property to develop an iterative efficient approximation algorithm for learning the graph edges, where each iteration takes linear time. We compare our approximation algorithm to a recently-proposed state-of-the-art exact algorithm and show that it is more efficient and scalable both theoretically and empirically, while its output quality is close to that given by the optimal solution of the exact algorithm.
2 0.68338478 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
Author: Gerard de Melo ; Gerhard Weikum
Abstract: We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, information integration and taxonomy induction methods have been used to build UWN based on WordNet and extend it with millions of named entities from Wikipedia. We additionally introduce extensions to cover lexical relationships, frame-semantic knowledge, and language data. An online interface provides human access to the data, while a software API enables applications to look up over 16 million words and names.
3 0.48374918 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
Author: Asher Stern ; Ido Dagan
Abstract: This paper introduces BIUTEE1 , an opensource system for recognizing textual entailment. Its main advantages are its ability to utilize various types of knowledge resources, and its extensibility by which new knowledge resources and inference components can be easily integrated. These abilities make BIUTEE an appealing RTE system for two research communities: (1) researchers of end applications, that can benefit from generic textual inference, and (2) RTE researchers, who can integrate their novel algorithms and knowledge resources into our system, saving the time and effort of developing a complete RTE system from scratch. Notable assistance for these re- searchers is provided by a visual tracing tool, by which researchers can refine and “debug” their knowledge resources and inference components.
4 0.48148355 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
Author: Bevan Jones ; Mark Johnson ; Sharon Goldwater
Abstract: Many semantic parsing models use tree transformations to map between natural language and meaning representation. However, while tree transformations are central to several state-of-the-art approaches, little use has been made of the rich literature on tree automata. This paper makes the connection concrete with a tree transducer based semantic parsing model and suggests that other models can be interpreted in a similar framework, increasing the generality of their contributions. In particular, this paper further introduces a variational Bayesian inference algorithm that is applicable to a wide class of tree transducers, producing state-of-the-art semantic parsing results while remaining applicable to any domain employing probabilistic tree transducers.
5 0.47914472 31 acl-2012-Authorship Attribution with Author-aware Topic Models
Author: Yanir Seroussi ; Fabian Bohnert ; Ingrid Zukerman
Abstract: Authorship attribution deals with identifying the authors of anonymous texts. Building on our earlier finding that the Latent Dirichlet Allocation (LDA) topic model can be used to improve authorship attribution accuracy, we show that employing a previously-suggested Author-Topic (AT) model outperforms LDA when applied to scenarios with many authors. In addition, we define a model that combines LDA and AT by representing authors and documents over two disjoint topic sets, and show that our model outperforms LDA, AT and support vector machines on datasets with many authors.
6 0.47707841 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars
7 0.47019354 38 acl-2012-Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing
8 0.46826404 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
9 0.46549639 205 acl-2012-Tweet Recommendation with Graph Co-Ranking
10 0.46495077 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
11 0.4635804 139 acl-2012-MIX Is Not a Tree-Adjoining Language
12 0.46003664 152 acl-2012-Multilingual WSD with Just a Few Lines of Code: the BabelNet API
13 0.45957655 75 acl-2012-Discriminative Strategies to Integrate Multiword Expression Recognition and Parsing
14 0.45851028 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
15 0.45816666 191 acl-2012-Temporally Anchored Relation Extraction
16 0.45724854 146 acl-2012-Modeling Topic Dependencies in Hierarchical Text Categorization
17 0.45136485 6 acl-2012-A Comprehensive Gold Standard for the Enron Organizational Hierarchy
18 0.45095894 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation
20 0.44983962 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle