Source: pdf
Author: Asher Stern ; Ido Dagan
Abstract: This paper introduces BIUTEE, an open-source system for recognizing textual entailment. Its main advantages are its ability to utilize various types of knowledge resources, and its extensibility by which new knowledge resources and inference components can be easily integrated. These abilities make BIUTEE an appealing RTE system for two research communities: (1) researchers of end applications, who can benefit from generic textual inference, and (2) RTE researchers, who can integrate their novel algorithms and knowledge resources into our system, saving the time and effort of developing a complete RTE system from scratch. Notable assistance for these researchers is provided by a visual tracing tool, by which researchers can refine and “debug” their knowledge resources and inference components.
Reference: text
sentIndex sentText sentNum sentScore
1 This paper introduces BIUTEE, an open-source system for recognizing textual entailment. [sent-2, score-0.224]
2 Its main advantages are its ability to utilize various types of knowledge resources, and its extensibility by which new knowledge resources and inference components can be easily integrated. [sent-3, score-0.352]
3 Notable assistance for these researchers is provided by a visual tracing tool, by which researchers can refine and “debug” their knowledge resources and inference components. [sent-5, score-0.52]
4 1 Introduction Recognizing Textual Entailment (RTE) is the task of identifying, given two text fragments, whether one of them can be inferred from the other (Dagan et al. [sent-6, score-0.033]
5 For example, in Information Extraction (IE), a system may be given a template with variables (e. [sent-9, score-0.05]
6 In Summarization, a good summary should be inferred from the [sent-12, score-0.033]
7 il/~nlp/downloads/biutee Ido Dagan, Computer Science Department, Bar-Ilan University, Ramat-Gan 52900, Israel dagan @ c s . [sent-16, score-0.178]
8 , (Clark and Harrison, 2010; MacKinlay and Baldwin, 2009)), to complex linguistically-motivated methods, which incorporate extensive linguistic analysis (syntactic parsing, coreference resolution, semantic role labelling, etc. [sent-25, score-0.091]
9 ) and a rich inventory of linguistic and world-knowledge resources (e. [sent-26, score-0.05]
10 Thus, flexible and extensible publicly available RTE systems are expected to significantly facilitate research in this field. [sent-32, score-0.072]
11 More concretely, two major research communities would benefit from a publicly available RTE system: 1. [sent-33, score-0.058]
12 Higher-level application developers, who would use an RTE system to solve inference tasks in their application. [sent-34, score-0.137]
13 RTE systems for this type of researchers should be adaptable to the application-specific data: they should be configurable, trainable, and extensible with inference knowledge that captures application-specific phenomena. [sent-35, score-0.243]
14 Researchers in the RTE community, who would not need to build a complete RTE system for their research. [sent-37, score-0.093]
15 [Footer: © 2012 Association for Computational Linguistics, pages 73–78] their novel research components into an existing open-source system. [sent-40, score-0.073]
16 Such research efforts might include developing knowledge resources, developing inference components for specific phenomena such as temporal inference, or extending RTE to different languages. [sent-41, score-0.264]
17 A flexible and extensible RTE system is expected to encourage researchers to create and share their textual-inference components. [sent-42, score-0.17]
18 A good example from another research area is the Moses system for Statistical Machine Translation (SMT) (Koehn et al. [sent-43, score-0.05]
19 , 2007), which provides the core SMT components while being extended with new research components by a large scientific community. [sent-44, score-0.172]
20 Moreover, these systems are restricted in the types of knowledge resources which they can utilize, and in the scope of their inference algorithms. [sent-46, score-0.173]
21 For example, EDITS (Kouylekov and Negri, 2010) is a distance-based RTE system, which can exploit only lexical knowledge resources. [sent-47, score-0.061]
22 NutCracker (Bos and Markert, 2005) is a system based on logical representation and automatic theorem proving, but utilizes only WordNet (Fellbaum, 1998) as a lexical knowledge resource. [sent-48, score-0.138]
23 Our system provides state-of-the-art linguistic analysis tools and exploits various types of manually built and automatically acquired knowledge resources, including lexical, lexical-syntactic and syntactic rewrite rules. [sent-50, score-0.166]
24 Furthermore, the system components, including preprocessing utilities, knowledge resources, and even the steps of the inference algorithm, are modular, and can be replaced or extended easily with new components. [sent-51, score-0.202]
25 Extensibility and flexibility are also supported by a plug-in mechanism, by which new inference components can be integrated without changing existing code. [sent-52, score-0.16]
26 Notable support for researchers is provided by a visual tracing tool, Tracer, which visualizes every step of the inference process as shown in Figures 2 [sent-53, score-0.481]
27 We will use this tool to illustrate various inference components in the demonstration session. [sent-61, score-0.247]
28 1 Inference algorithm In this section we provide a high level description of the inference components. [sent-63, score-0.116]
29 Further details of the algorithmic components appear in references provided throughout this section. [sent-64, score-0.073]
30 BIUTEE follows the transformation-based paradigm, which recognizes textual entailment by converting the text into the hypothesis via a sequence of transformations. [sent-65, score-0.422]
31 Such a sequence is often referred to as a proof, and is performed, in our system, over the syntactic representation of the text - the text’s parse tree(s). [sent-66, score-0.066]
32 A transformation modifies a given parse tree, resulting in a generation of a new parse tree, which can be further modified by subsequent transformations. [sent-67, score-0.168]
33 This text-hypothesis pair requires two major transformations: (1) substituting “him” by “Charles G. [sent-77, score-0.049]
34 Taylor” via a coreference substitution to an earlier mention in the text, and (2) inferring that if “X accept Y” then “X was offered Y”. [sent-78, score-0.198]
35 Given a T-H pair, the system finds a proof which generates H from T, and estimates the proof validity. [sent-80, score-0.756]
36 The system returns a score which indicates how likely it is that the obtained proof is valid, i. [sent-81, score-0.403]
37 , the transformations along the proof preserve entailment from the meaning of T. [sent-83, score-0.942]
38 The main type of transformations is application of entailment-rules (Bar-Haim et al. [sent-84, score-0.353]
39 An entailment rule is composed of two sub-trees, termed left-hand-side and right-hand-side, and is applied on a parse-tree fragment that matches its left-hand-side, by substituting the left-hand-side with the right-hand-side. [sent-86, score-0.335]
40 The simplest type of rules is lexical rules, like car → vehicle. [sent-88, score-0.061]
41 More complicated rules capture the entailment relation between predicate-argument structures, like X accept Y → X was offered Y. [sent-89, score-0.309]
42 Entailment rules can also encode syntactic phenomena like the semantic equivalence of active and passive structures (X Verb[active] Y → Y is Verb[passive] by X). [sent-90, score-0.106]
43 The complete formalism of entailment rules, adopted by our system, is described in (Bar-Haim et al. [sent-93, score-0.279]
44 Coreference relations are utilized via coreference-substitution transformations: one mention of an entity is replaced by another mention of the same entity, based on coreference relations. [sent-95, score-0.184]
45 In the above example the system could apply such a transformation to substitute “him” with “Charles G. [sent-96, score-0.138]
46 Since applications of entailment rules and coreference substitutions are still, in most cases, insufficient for transforming T into H, our system allows on-the-fly transformations. [sent-98, score-0.444]
47 These transformations include insertions of missing nodes, flipping parts-of-speech, moving sub-trees, etc. [sent-99, score-0.381]
48 (see (Stern and Dagan, 2011) for a complete list of these transformations). [sent-100, score-0.043]
49 Since these transformations are not justified by given knowledge resources, we use linguistically-motivated features to estimate their validity. [sent-101, score-0.389]
50 For example, for on-the-fly lexical insertions we consider as features the named-entity annotation of the inserted word, and its probability estimation according to a unigram language model, which yields lower costs for more frequent words. [sent-102, score-0.081]
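As a rough illustration of the two insertion features mentioned above, the sketch below builds a tiny unigram model and derives a feature that shrinks as the inserted word becomes more frequent. The exact feature definitions used by BIUTEE are not given here, so the negative log-probability form, the add-one smoothing, and the names `unigram_model` and `insertion_features` are assumptions.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Return a function word -> add-one-smoothed unigram probability."""
    counts = Counter(tokens)
    total = len(tokens)
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def insertion_features(word, is_named_entity, prob):
    """Hypothetical features for inserting `word` on the fly."""
    return {
        # Indicator for the named-entity annotation of the inserted word.
        "insert_named_entity": 1.0 if is_named_entity else 0.0,
        # Frequent words get a small value, rare words a large one.
        "insert_neg_log_prob": -math.log(prob(word)),
    }


prob = unigram_model("the cat sat on the mat near the door".split())
frequent = insertion_features("the", False, prob)
rare = insertion_features("Taylor", True, prob)
```

With a positive learned weight on `insert_neg_log_prob`, inserting a frequent word costs less than inserting a rare one, which is the behavior the sentence above describes.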
51 Given a (T,H) pair, the system applies a search algorithm (Stern et al. [sent-103, score-0.118]
52 For each proof step $o_i$ the system calculates a cost $c(o_i)$. [sent-108, score-0.677]
53 This cost is defined as follows: the system uses a weight vector $w$, which is learned in the training phase. [sent-109, score-0.143]
54 In addition, each transformation $o_i$ is represented by a feature vector $f(o_i)$ which characterizes the transformation. [sent-110, score-0.242]
55 The proof cost is defined as the sum of the costs of the transformations from which it is composed, i.e., [sent-112, score-0.799]
56 $$c(O) = \sum_{i=1}^{n} c(o_i) = \sum_{i=1}^{n} w \cdot f(o_i) = w \cdot \sum_{i=1}^{n} f(o_i) \qquad (1)$$ If the proof cost is below a threshold $b$, then the system concludes that T entails H. [sent-114, score-0.446]
57 The complete description of the cost model, as well as the method for learning the parameters w and b is described in (Stern and Dagan, 2011). [sent-115, score-0.165]
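The cost model of Equation (1) is a linear function of the feature vectors, so it can be written in a few lines. The following is a minimal sketch, not the system's code: the function names and the toy weights and features are invented for illustration, and only the structure (a dot product per step, a sum over steps, a comparison with the threshold b) comes from the text.

```python
def transformation_cost(w, f):
    """Cost of one proof step: the dot product w . f(o_i)."""
    return sum(w_k * f_k for w_k, f_k in zip(w, f))


def proof_cost(w, feature_vectors):
    """Cost of a proof: the sum of the costs of its transformations."""
    return sum(transformation_cost(w, f) for f in feature_vectors)


def entails(w, b, feature_vectors):
    """Decide entailment: the proof cost must be below the threshold b."""
    return proof_cost(w, feature_vectors) < b


# Toy values (assumptions): two features, a proof with two steps.
w = [0.5, 2.0]
steps = [[1.0, 0.0], [0.0, 1.0]]

total = proof_cost(w, steps)

# The last equality of Equation (1): w . sum_i f(o_i) gives the same value.
summed_features = [sum(column) for column in zip(*steps)]
total_via_sum = transformation_cost(w, summed_features)
```

Because the cost is linear, a proof can be represented by the single summed feature vector, which is what makes it convenient to learn w and b with a standard linear classifier.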
58 2 System flow The BIUTEE system flow (Figure 1) starts with pre- processing of the text and the hypothesis. [sent-117, score-0.104]
59 , 2005) and ArkRef coreference resolver (Haghighi and Klein, 2009), as well as utilities for sentencesplitting and numerical-normalizations. [sent-119, score-0.176]
60 In addition, BIUTEE supports integration of users’ own utilities by simply implementing the appropriate interfaces. [sent-120, score-0.119]
61 Entailment recognition begins with a global processing phase in which inference related computations that are not part of the proof are performed. [sent-121, score-0.508]
62 Next, the system constructs a proof which is a sequence of transformations that transform the text into the hypothesis. [sent-123, score-0.756]
63 Finding such a proof is a sequential process, conducted by the search algorithm. [sent-124, score-0.388]
64 In each step of the proof construction the system examines all possible transformations that can be applied, generates new trees by applying selected transformations, and calculates their costs by constructing appropriate feature-vectors for them. [sent-125, score-0.811]
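The step-by-step construction described above can be pictured as a best-first search over trees. The sketch below uses plain uniform-cost search as a stand-in; it is not the search algorithm used by BIUTEE (that one is described in the cited Stern et al. work). States are simplified to tuples of tokens instead of parse trees, and `lowest_cost_proof`, `toy_expand` and the toy rules and costs are all hypothetical.

```python
import heapq
from itertools import count


def lowest_cost_proof(text, hypothesis, expand, max_expansions=10_000):
    """Uniform-cost search from `text` to `hypothesis`.

    `expand(state)` yields (step_name, new_state, step_cost) triples with
    non-negative costs. Returns (total_cost, [step_name, ...]) or None.
    """
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(0.0, next(tie), text, [])]
    best = {text: 0.0}
    expansions = 0
    while frontier and expansions < max_expansions:
        cost, _, state, proof = heapq.heappop(frontier)
        if state == hypothesis:
            return cost, proof
        if cost > best[state]:
            continue  # stale queue entry; a cheaper path was found later
        expansions += 1
        for name, new_state, step_cost in expand(state):
            new_cost = cost + step_cost
            if new_cost < best.get(new_state, float("inf")):
                best[new_state] = new_cost
                heapq.heappush(
                    frontier, (new_cost, next(tie), new_state, proof + [name])
                )
    return None


# Toy transformations (assumptions): token substitutions with fixed costs.
TOY_RULES = [
    ("coreference: him -> Taylor", "him", "Taylor", 0.25),
    ("lexical rule: car -> vehicle", "car", "vehicle", 0.5),
]


def toy_expand(state):
    """Enumerate every applicable toy transformation of `state`."""
    for name, old, new, step_cost in TOY_RULES:
        for i, token in enumerate(state):
            if token == old:
                yield name, state[:i] + (new,) + state[i + 1:], step_cost


result = lowest_cost_proof(
    ("him", "bought", "car"), ("Taylor", "bought", "vehicle"), toy_expand
)
```

Uniform-cost search returns a lowest-cost proof only when step costs are non-negative and the state space is small enough to explore. A real system would need a more aggressive strategy, which is consistent with the text describing the proof found as only approximately lowest-cost.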
65 New types of transformations can be added to BIUTEE by a plug-in mechanism, without the need to change the code. [sent-126, score-0.353]
66 For example, imagine that a researcher applies BIUTEE on the medical domain. [sent-127, score-0.103]
67 There might be some well-known domain knowledge and rules that every medical person knows. [sent-128, score-0.102]
68 A plug-in is a piece of code which implements a few interfaces that detect which transformations can be applied, apply them, and construct appropriate feature-vectors for each applied transformation. [sent-130, score-0.353]
69 In addition, a plug-in can perform computations for the global processing phase. [sent-131, score-0.036]
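To make the plug-in responsibilities listed above concrete (detect applicable transformations, apply them, build feature vectors, and optionally take part in global processing), here is a hedged sketch of what such an interface could look like. BIUTEE's real plug-in interfaces are not shown in this text, so every class and method name below is hypothetical, and trees are again simplified to tuples of tokens.

```python
from abc import ABC, abstractmethod


class TransformationPlugin(ABC):
    """Hypothetical plug-in contract; not BIUTEE's actual interface."""

    def global_processing(self, text, hypothesis):
        """Optional hook for computations done once per (T, H) pair."""

    @abstractmethod
    def find_applicable(self, tree):
        """Return specifications of transformations that match `tree`."""

    @abstractmethod
    def apply(self, tree, spec):
        """Return the new tree produced by applying `spec` to `tree`."""

    @abstractmethod
    def feature_vector(self, tree, spec):
        """Return the features that characterize this transformation."""


class DomainSynonymPlugin(TransformationPlugin):
    """Toy medical-domain plug-in: replaces a term with a known synonym."""

    def __init__(self, synonyms):
        self.synonyms = synonyms

    def find_applicable(self, tree):
        return [
            (i, self.synonyms[token])
            for i, token in enumerate(tree)
            if token in self.synonyms
        ]

    def apply(self, tree, spec):
        position, replacement = spec
        return tree[:position] + (replacement,) + tree[position + 1:]

    def feature_vector(self, tree, spec):
        return {"domain_synonym": 1.0}


plugin = DomainSynonymPlugin({"MI": "heart_attack"})
tree = ("patient", "had", "MI")
specs = plugin.find_applicable(tree)
new_tree = plugin.apply(tree, specs[0])
features = plugin.feature_vector(tree, specs[0])
```

Keeping detection, application, and feature construction as separate methods lets a search procedure enumerate candidate transformations cheaply and apply only the ones it selects.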
70 Eventually, the search algorithm finds an (approximately) lowest-cost proof. [sent-132, score-0.128]
71 This cost is normalized as a score between 0 and 1, and returned as output. [sent-133, score-0.093]
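The normalization formula is not given in this text. One plausible choice, assumed here only for illustration, is a logistic function of the margin between the threshold b and the proof cost, which fits naturally with a logistic-regression learner. The function name `entailment_score` is hypothetical.

```python
import math


def entailment_score(cost, b):
    """Map a proof cost to (0, 1); costs below the threshold b score above 0.5.

    Assumed form: sigmoid(b - cost). This is an illustration, not the
    system's documented normalization.
    """
    return 1.0 / (1.0 + math.exp(cost - b))


at_threshold = entailment_score(2.0, 2.0)
cheap_proof = entailment_score(0.5, 2.0)
expensive_proof = entailment_score(6.0, 2.0)
```

Under this assumption the decision rule "cost below b" is equivalent to "score above 0.5", so the score and the binary decision never disagree.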
72 Training the cost model parameters w and b (see subsection 2. [sent-134, score-0.093]
73 Median and Best indicate the median score and the highest score of all submissions, respectively. [sent-142, score-0.068]
74 We use a Logistic-Regression learning algorithm, but, similar to other components, alternative learning-algorithms can be integrated easily by implementing an appropriate interface. [sent-144, score-0.034]
75 , 2010) is presented in Table 1: BIUTEE is better than the median of all submitted results, and in RTE-6 it outperforms all other systems. [sent-148, score-0.068]
76 In particular, they do not show all the potential transformations that could have been applied, but were rejected by the search algorithm. [sent-150, score-0.453]
77 However, such information is crucial for researchers, who need to observe the usage and the potential impact of each component of the system. [sent-151, score-0.058]
78 We address this need by providing an interactive visual tracing tool, Tracer, which presents detailed information on each proof step, including potential steps that were not included in the final proof. [sent-152, score-0.646]
79 In the demo session, we will use the visual tracing tool to illustrate all of BIUTEE’s components. [sent-153, score-0.355]
80 1 Modes Tracer provides two modes for tracing proof construction: automatic mode and manual mode. [sent-155, score-0.703]
81 In automatic mode, shown in Figure 2, the tool presents the complete process of inference, as conducted by the system’s search: the parse trees, the proof steps, the cost of each step and the final score. [sent-156, score-0.616]
82 For each transformation the tool presents the parse tree before and after applying the transformation, highlighting the impact of this transformation. [sent-157, score-0.282]
83 In manual mode, the user can invoke specific transformations proactively, including transformations rejected by the search algorithm for the eventual proof. [sent-158, score-0.848]
84 As shown in Figure 3, the tool provides a list of transformations that match the given parse-tree, from which the user chooses and applies a single transformation at each step. [sent-159, score-0.626]
85 Similar to automatic mode, their impact on the parse tree is shown visually. [sent-160, score-0.107]
86 2 Use cases Developers of knowledge resources, as well as other types of transformations, can be aided by Tracer as follows. [sent-162, score-0.036]
87 Applying an entailment rule is a process of first matching the rule’s left-hand-side to the text parse-tree (or to any tree along the proof), and then substituting it by the rule’s right-hand-side. [sent-163, score-0.369]
88 are a large screen and In- Figure 2: Entailment Rule application visualized in tracing tool. [sent-165, score-0.192]
89 The rule description is the first transformation (printed in bold) of the proof, shown in the lower pane. [sent-167, score-0.167]
90 It is followed by transformations 2 and 3, which are syntactic rewrite rules. [sent-168, score-0.407]
91 rule, the user can provide a text for which it is supposed to match, examine the list of potential transformations that can be performed on the text’s parse tree, as in Figure 3, and verify that the examined rule has been matched as expected. [sent-169, score-0.507]
92 Next, the user can apply the rule, visually examine its impact on the parse-tree, as in Figure 2, and validate that it operates as intended with no side-effects. [sent-170, score-0.072]
93 The complete inference process depends on the parameters learned in the training phase, as well as on the search algorithm, which looks for the lowest-cost proof from T to H. [sent-171, score-0.518]
94 Researchers investigating these algorithmic components can be assisted by the tracing tool as well. [sent-172, score-0.352]
95 For a given (T,H) pair, the automatic mode provides the complete proof found by the system. [sent-173, score-0.493]
96 Then, in the manual mode the researcher can try to construct alternative proofs. [sent-174, score-0.139]
97 If a proof with lower cost can be constructed manually, it implies a limitation of the search algorithm. [sent-175, score-0.518]
98 The user can manually choose and apply each of these transformations, and observe their impact on the parse-tree. [sent-180, score-0.072]
99 An inference model for semantic entailment in natural language. [sent-215, score-0.323]
100 Simple coreference resolution with rich syntactic and semantic features. [sent-231, score-0.117]
wordName wordTfidf (topN-words)
[('biutee', 0.397), ('transformations', 0.353), ('proof', 0.353), ('rte', 0.313), ('entailment', 0.236), ('tracing', 0.192), ('dagan', 0.178), ('stern', 0.171), ('oi', 0.154), ('tracer', 0.134), ('ido', 0.104), ('cost', 0.093), ('coreference', 0.091), ('transformation', 0.088), ('inference', 0.087), ('tool', 0.087), ('utilities', 0.085), ('researchers', 0.079), ('visual', 0.076), ('components', 0.073), ('mode', 0.071), ('asher', 0.07), ('extensibility', 0.07), ('textual', 0.069), ('bentivogli', 0.068), ('median', 0.068), ('recognizing', 0.058), ('asylum', 0.054), ('mackinlay', 0.054), ('pascal', 0.051), ('resources', 0.05), ('system', 0.05), ('rule', 0.05), ('israel', 0.049), ('substituting', 0.049), ('bos', 0.047), ('opensource', 0.047), ('visualizes', 0.047), ('braz', 0.043), ('recognising', 0.043), ('salvo', 0.043), ('complete', 0.043), ('taylor', 0.042), ('extensible', 0.041), ('charles', 0.041), ('ive', 0.04), ('kouylekov', 0.04), ('luisa', 0.04), ('rejected', 0.04), ('researcher', 0.04), ('parse', 0.04), ('clark', 0.04), ('xn', 0.039), ('user', 0.039), ('offered', 0.038), ('developers', 0.038), ('hoa', 0.038), ('danilo', 0.038), ('mechanism', 0.037), ('limitation', 0.037), ('accept', 0.037), ('rules', 0.036), ('modular', 0.036), ('computations', 0.036), ('goldberg', 0.036), ('knowledge', 0.036), ('search', 0.035), ('implementing', 0.034), ('dang', 0.034), ('tree', 0.034), ('developing', 0.034), ('applies', 0.033), ('inferred', 0.033), ('modes', 0.033), ('impact', 0.033), ('mention', 0.032), ('phase', 0.032), ('publicly', 0.031), ('insufficient', 0.031), ('fellbaum', 0.031), ('medical', 0.03), ('replaced', 0.029), ('hypothesis', 0.029), ('description', 0.029), ('insertions', 0.028), ('costs', 0.028), ('rewrite', 0.028), ('manual', 0.028), ('community', 0.028), ('communities', 0.027), ('flow', 0.027), ('calculates', 0.027), ('notable', 0.027), ('logical', 0.027), ('syntactic', 0.026), ('provides', 0.026), ('alexandra', 0.025), ('finkel', 0.025), ('lexical', 
0.025), ('potential', 0.025)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000001 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
2 0.53826064 78 acl-2012-Efficient Search for Transformation-based Inference
Author: Asher Stern ; Roni Stern ; Ido Dagan ; Ariel Felner
Abstract: This paper addresses the search problem in textual inference, where systems need to infer one piece of text from another. A prominent approach to this task is to attempt to transform one text into the other through a sequence of inference-preserving transformations, a.k.a. a proof, while estimating the proof’s validity. This raises a search challenge of finding the best possible proof. We explore this challenge through a comprehensive investigation of prominent search algorithms and propose two novel algorithmic components specifically designed for textual inference: a gradient-style evaluation function, and a local-lookahead node expansion method. Evaluations, using the open-source system, BIUTEE, show the contribution of these ideas to search efficiency and proof quality.
3 0.18733501 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
Author: Naomi Zeichner ; Jonathan Berant ; Ido Dagan
Abstract: The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. However, evaluating such resources has turned out to be a non-trivial task, slowing progress in the field. In this paper, we suggest a framework for evaluating inference-rule resources. Our framework simplifies a previously proposed “instance-based evaluation” method that involved substantial annotator training, making it suitable for crowdsourcing. We show that our method produces a large amount of annotations with high inter-annotator agreement for a low cost at a short period of time, without requiring training expert annotators.
4 0.15687513 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Author: Yashar Mehdad ; Matteo Negri ; Marcello Federico
Abstract: We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual entailment system, we report promising results on different datasets.
5 0.13566287 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
Author: Meni Adler ; Jonathan Berant ; Ido Dagan
Abstract: We present a novel text exploration model, which extends the scope of state-of-the-art technologies by moving from standard concept-based exploration to statement-based exploration. The proposed scheme utilizes the textual entailment relation between statements as the basis of the exploration process. A user of our system can explore the result space of a query by drilling down/up from one statement to another, according to entailment relations specified by an entailment graph and an optional concept taxonomy. As a prominent use case, we apply our exploration system and illustrate its benefit on the health-care domain. To the best of our knowledge this is the first implementation of an exploration system at the statement level that is based on the textual entailment relation.
6 0.13302507 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
7 0.097553469 43 acl-2012-Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench
8 0.089919709 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions
9 0.089422293 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
10 0.084936477 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
11 0.071083769 184 acl-2012-String Re-writing Kernel
12 0.065860584 137 acl-2012-Lemmatisation as a Tagging Task
13 0.060539745 76 acl-2012-Distributional Semantics in Technicolor
14 0.057244346 127 acl-2012-Large-Scale Syntactic Language Modeling with Treelets
15 0.054747634 155 acl-2012-NiuTrans: An Open Source Toolkit for Phrase-based and Syntax-based Machine Translation
16 0.054204721 58 acl-2012-Coreference Semantics from Web Features
17 0.054075476 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
18 0.047745418 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
19 0.046410929 44 acl-2012-CSNIPER - Annotation-by-query for Non-canonical Constructions in Large Corpora
20 0.044352725 139 acl-2012-MIX Is Not a Tree-Adjoining Language
topicId topicWeight
[(0, -0.164), (1, 0.036), (2, -0.071), (3, 0.046), (4, -0.007), (5, 0.141), (6, -0.029), (7, 0.356), (8, 0.06), (9, 0.046), (10, -0.228), (11, 0.296), (12, 0.024), (13, -0.294), (14, 0.165), (15, -0.112), (16, 0.058), (17, 0.04), (18, 0.017), (19, -0.082), (20, -0.149), (21, -0.04), (22, 0.044), (23, 0.009), (24, 0.09), (25, -0.032), (26, -0.022), (27, 0.068), (28, 0.06), (29, 0.063), (30, 0.078), (31, -0.014), (32, -0.208), (33, -0.102), (34, 0.081), (35, 0.164), (36, 0.066), (37, -0.022), (38, 0.063), (39, -0.045), (40, -0.125), (41, 0.041), (42, -0.132), (43, 0.005), (44, -0.035), (45, 0.04), (46, -0.105), (47, -0.043), (48, -0.034), (49, 0.003)]
simIndex simValue paperId paperTitle
same-paper 1 0.97007585 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
2 0.91676319 78 acl-2012-Efficient Search for Transformation-based Inference
Author: Asher Stern ; Roni Stern ; Ido Dagan ; Ariel Felner
Abstract: This paper addresses the search problem in textual inference, where systems need to infer one piece of text from another. A prominent approach to this task is to attempt to transform one text into the other through a sequence of inference-preserving transformations, a.k.a. a proof, while estimating the proof’s validity. This raises a search challenge of finding the best possible proof. We explore this challenge through a comprehensive investigation of prominent search algorithms and propose two novel algorithmic components specifically designed for textual inference: a gradient-style evaluation function, and a local-lookahead node expansion method. Evaluations, using the open-source system, BIUTEE, show the contribution of these ideas to search efficiency and proof quality.
3 0.51323289 65 acl-2012-Crowdsourcing Inference-Rule Evaluation
Author: Naomi Zeichner ; Jonathan Berant ; Ido Dagan
Abstract: The importance of inference rules to semantic applications has long been recognized and extensive work has been carried out to automatically acquire inference-rule resources. However, evaluating such resources has turned out to be a non-trivial task, slowing progress in the field. In this paper, we suggest a framework for evaluating inference-rule resources. Our framework simplifies a previously proposed “instance-based evaluation” method that involved substantial annotator training, making it suitable for crowdsourcing. We show that our method produces a large amount of annotations with high inter-annotator agreement for a low cost at a short period of time, without requiring training expert annotators.
4 0.43042222 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Author: Yashar Mehdad ; Matteo Negri ; Marcello Federico
Abstract: We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual entailment system, we report promising results on different datasets.
5 0.40819296 82 acl-2012-Entailment-based Text Exploration with Application to the Health-care Domain
Author: Meni Adler ; Jonathan Berant ; Ido Dagan
Abstract: We present a novel text exploration model, which extends the scope of state-of-the-art technologies by moving from standard concept-based exploration to statement-based exploration. The proposed scheme utilizes the textual entailment relation between statements as the basis of the exploration process. A user of our system can explore the result space of a query by drilling down/up from one statement to another, according to entailment relations specified by an entailment graph and an optional concept taxonomy. As a prominent use case, we apply our exploration system and illustrate its benefit on the health-care domain. To the best of our knowledge this is the first implementation of an exploration system at the statement level that is based on the textual entailment relation.
6 0.37570021 43 acl-2012-Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench
7 0.35728648 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
8 0.33625662 181 acl-2012-Spectral Learning of Latent-Variable PCFGs
9 0.30558613 137 acl-2012-Lemmatisation as a Tagging Task
10 0.26745462 77 acl-2012-Ecological Evaluation of Persuasive Messages Using Google AdWords
11 0.26531804 53 acl-2012-Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions
12 0.2616764 133 acl-2012-Learning to "Read Between the Lines" using Bayesian Logic Programs
13 0.2306758 184 acl-2012-String Re-writing Kernel
14 0.22921634 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
15 0.22325112 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
16 0.21989678 113 acl-2012-INPROwidth.3emiSS: A Component for Just-In-Time Incremental Speech Synthesis
17 0.21238492 129 acl-2012-Learning High-Level Planning from Text
18 0.19363177 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
19 0.18933618 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
20 0.18364696 112 acl-2012-Humor as Circuits in Semantic Networks
topicId topicWeight
[(25, 0.032), (26, 0.034), (28, 0.033), (30, 0.036), (37, 0.022), (39, 0.061), (49, 0.243), (57, 0.019), (74, 0.027), (82, 0.018), (84, 0.026), (85, 0.049), (90, 0.079), (92, 0.175), (94, 0.011), (99, 0.057)]
simIndex simValue paperId paperTitle
same-paper 1 0.83499676 36 acl-2012-BIUTEE: A Modular Open-Source System for Recognizing Textual Entailment
2 0.75015235 106 acl-2012-Head-driven Transition-based Parsing with Top-down Prediction
Author: Katsuhiko Hayashi ; Taro Watanabe ; Masayuki Asahara ; Yuji Matsumoto
Abstract: This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis. This algorithm handles global structures, such as clause and coordination, better than shift-reduce or other bottom-up algorithms. Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms.
3 0.69020504 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
Author: Christian Chiarcos
Abstract: This paper describes a novel approach towards the empirical approximation of discourse relations between different utterances in texts. Following the idea that every pair of events comes with preferences regarding the range and frequency of discourse relations connecting both parts, the paper investigates whether these preferences are manifested in the distribution of relation words (that serve to signal these relations). Experiments on two large-scale English web corpora show that significant correlations between pairs of adjacent events and relation words exist, that they are reproducible on different data sets, and for three relation words, that their distribution corresponds to theory-based assumptions.
1 Motivation
Texts are not merely accumulations of isolated utterances, but the arrangement of utterances conveys meaning; human text understanding can thus be described as a process to recover the global structure of texts and the relations linking its different parts (Vallduví 1992; Gernsbacher et al. 2004). To capture these aspects of meaning in NLP, it is necessary to develop operationalizable theories, and, within a supervised approach, large amounts of annotated training data. To facilitate manual annotation, weakly supervised or unsupervised techniques can be applied as a preprocessing step for semi-manual annotation, and this is part of the motivation of the approach described here. Discourse relations involve different aspects of meaning. This may include factual knowledge about the connected discourse segments (a ‘subject-matter’ relation, e.g., if one utterance represents the cause for another, Mann and Thompson 1988, p.257), argumentative purposes (a ‘presentational’ relation, e.g., one utterance motivates the reader to accept a claim formulated in another utterance, ibid., p.257), or relations between entities mentioned in the connected discourse segments (anaphoric relations, Webber et al. 2003).
Discourse relations can be indicated explicitly by optional cues, e.g., adverbials (e.g., however), conjunctions (e.g., but), or complex phrases (e.g., in contrast to what Peter said a minute ago). Here, these cues are referred to as relation words. Assuming that relation words are associated with specific discourse relations (Knott and Dale 1994; Prasad et al. 2008), the distribution of relation words found between two (types of) events can yield insights into the range of discourse relations possible at this occasion and their respective likelihood. For this purpose, this paper proposes a background knowledge base (BKB) that hosts pairs of events (here heuristically represented by verbs) along with distributional profiles for relation words. The primary data structure of the BKB is a triple where one event (type) is connected with a particular relation word to another event (type). Triples are further augmented with a frequency score (expressing the likelihood of the triple to be observed), a significance score (see below), and a correlation score (indicating whether a pair of events has a positive or negative correlation with a particular relation word).
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 213–217, Jeju, Republic of Korea, 8-14 July 2012. © 2012 Association for Computational Linguistics
Triples can be easily acquired from automatically parsed corpora. While the relation word is usually part of the utterance that represents the source of the relation, determining the appropriate target (antecedent) of the relation may be difficult to achieve.
As a heuristic, an adjacency preference is adopted, i.e., the target is identified with the main event of the preceding utterance. [Footnote 1: Relations between non-adjacent utterances are constrained by the structure of discourse (Webber 1991), and thus less likely than relations between adjacent utterances.] The BKB can be constructed from a sufficiently large corpus as follows:
• identify event types and relation words
• for every utterance
– create a candidate triple consisting of the event type of the utterance, the relation word, and the event type of the preceding utterance
– add the candidate triple to the BKB; if it is found in the BKB, increase its score by (or initialize it with) 1
• perform a pruning on all candidate triples, calculate significance and correlation scores
Pruning uses statistical significance tests to evaluate whether the relative frequency of a relation word for a pair of events is significantly higher or lower than the relative frequency of the relation word in the entire corpus. Assuming that incorrect candidate triples (i.e., where the factual target of the relation was non-adjacent) are equally distributed, they should be filtered out by the significance tests. The goal of this paper is to evaluate the validity of this approach.
2 Experimental Setup
By generalizing over multiple occurrences of the same events (or, more precisely, event types), one can identify preferences of event pairs for one or several relation words. These preferences capture context-invariant characteristics of pairs of events and are thus considered to reflect a semantic predisposition for a particular discourse relation. Formally, an event is the semantic representation of the meaning conveyed in the utterance. We assume that the same event can reoccur in different contexts; we are thus studying relations between types of events.
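The construction steps described above (candidate triples counted over adjacent utterances) can be illustrated with a minimal sketch. It assumes each utterance has already been reduced to an event type plus an optional sentence-initial relation word; the function name `build_candidate_triples` and the toy input are hypothetical and not taken from the paper.

```python
from collections import Counter

def build_candidate_triples(utterances):
    """Count (source event, relation word, target event) triples.

    `utterances` is a list of (event_type, relation_word_or_None) pairs in
    document order. Following the adjacency heuristic, the target event is
    the event of the immediately preceding utterance.
    """
    counts = Counter()
    for previous, current in zip(utterances, utterances[1:]):
        source_event, relation_word = current
        target_event, _ = previous
        if relation_word is None:
            continue  # no explicit relation word, so no candidate triple
        # Counter initializes a missing triple with 0, so this either
        # initializes the score with 1 or increases it by 1.
        counts[(source_event, relation_word, target_event)] += 1
    return counts

# Hypothetical toy input: (main verb lemma, sentence-initial relation word)
toy = [("rain", None), ("cancel", "then"), ("complain", "but"),
       ("rain", None), ("cancel", "then")]
bkb = build_candidate_triples(toy)
```

The pruning step (significance and correlation scores) is deliberately left out of this sketch; it operates on the resulting counts.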
For the experiment described here, events are heuristically identified with the main predicates of a sentence, i.e., non-auxiliary, non-causative, non-modal verbal lexemes that serve as heads of main clauses. The primary data structure of the approach described here is a triple consisting of a source event, a relation word and a target (antecedent) event. These triples are harvested from large syntactically annotated corpora. For intersentential relations, the target is identified with the event of the immediately preceding main clause. These extraction preferences are heuristic approximations, and thus, an additional pruning step is necessary. For this purpose, statistical significance tests are adopted (χ² for triples of frequent events and relation words, t-test for rare events and/or relation words) that compare the relative frequency of a relation word given a pair of events with the relative frequency of the relation word in the entire corpus. All results with p ≥ .05 are excluded, i.e., only triples are preserved for which the observed positive or negative correlation between a pair of events and a relation word is not due to chance with at least 95% probability. Assuming an even distribution of incorrect target events, this should rule these out. Additionally, it also serves as a means of evaluation. Using statistical significance tests as pruning criterion entails that all triples eventually confirmed are statistically significant (see footnote 2). This setup requires immense amounts of data: We are dealing with several thousand events (theoretically, the total number of verbs of a language). The chance probability for two events to occur in adjacent position is thus far below 10^-6, and it decreases further if the likelihood of a relation word is taken into consideration. All things being equal, we thus need millions of sentences to create the BKB. Here, two large-scale corpora of English are employed, PukWaC and Wackypedia EN (Baroni et al. 2009).
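The χ² pruning criterion described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: it uses a plain Pearson χ² statistic on a 2×2 contingency table (event pair vs. rest of corpus, relation word vs. any other relation word) without continuity correction, and the 5% critical value for one degree of freedom (3.841). The function names and the toy counts are hypothetical, and the t-test branch for rare events is not covered.

```python
CRITICAL_VALUE_P05_DF1 = 3.841  # chi-square critical value, alpha = .05, 1 d.f.

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator

def prune(pair_with_word, pair_total, corpus_with_word, corpus_total):
    """Return (correlation sign, statistic) if significant, else None.

    pair_with_word:   occurrences of the event pair with the relation word
    pair_total:       occurrences of the event pair with any relation word
    corpus_with_word: occurrences of the relation word in the whole corpus
    corpus_total:     all counted occurrences in the whole corpus
    """
    a = pair_with_word
    b = pair_total - pair_with_word
    c = corpus_with_word - pair_with_word
    d = (corpus_total - pair_total) - c
    statistic = chi_square_2x2(a, b, c, d)
    if statistic < CRITICAL_VALUE_P05_DF1:
        return None  # p >= .05: the triple is excluded
    sign = 1 if a * d > b * c else -1  # +1: positive correlation
    return sign, statistic

kept = prune(30, 100, 1000, 100000)    # pair prefers the word (30% vs. ~1%)
dropped = prune(1, 100, 1000, 100000)  # pair matches the corpus rate (1%)
```

The sign returned alongside the statistic corresponds to the correlation score mentioned in the text: positive if the relation word is over-represented for the event pair, negative if it is under-represented.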
PukWaC is a 2G-token web corpus of British English crawled from the .uk domain (Ferraresi et al. 2008), and parsed with MaltParser (Nivre et al. 2006). [Footnote 2: Subsequent studies may employ less rigid pruning criteria. For the purpose of the current paper, however, the statistical significance of all extracted triples serves as a criterion to evaluate methodological validity.] It is distributed in 5 parts; only PukWaC-1 to PukWaC-4 were considered here, constituting 82.2% (72.5M sentences) of the entire corpus; PukWaC-5 is left untouched for forthcoming evaluation experiments. Wackypedia EN is a 0.8G-token dump of the English Wikipedia, annotated with the same tools. It is distributed in 4 different files; the last portion was left untouched for forthcoming evaluation experiments. The portion analyzed here comprises 33.2M sentences, 75.9% of the corpus. The extraction of events in these corpora uses simple patterns that combine dependency information and part-of-speech tags to retrieve the main verbs and store their lemmata as event types. The target (antecedent) event was identified with the last main event of the preceding sentence. As relation words, only sentence-initial children of the source event that were annotated as adverbial modifiers, verb modifiers or conjunctions were considered.
3 Evaluation
To evaluate the validity of the approach, three fundamental questions need to be addressed: significance (are there significant correlations between pairs of events and relation words?), reproducibility (can these correlations be confirmed on independent data sets?), and interpretability (can these correlations be interpreted in terms of theoretically-defined discourse relations?).
3.1 Significance and Reproducibility
Significance tests are part of the pruning stage of the algorithm. Therefore, the number of triples eventually retrieved confirms the existence of statistically significant correlations between pairs of events and relation words. The left column of Tab.
1 shows the number of triples obtained from PukWaC subcorpora of different size. For reproducibility, compare the triples identified with Wackypedia EN and PukWaC subcorpora of different size: Table 1 shows the number of triples found in both Wackypedia EN and PukWaC, and the agreement between both resources. For two triples involving the same events (event types) and the same relation word, agreement means that the relation word shows either positive or negative correlation in both corpora; disagreement means positive correlation in one corpus and negative correlation in the other.
[Table 1: Agreement with respect to positive or negative correlation of event pairs and relation words between Wackypedia EN and PukWaC subcorpora of different size]
[Table 2: Agreement between but (B), however (H) and then (T) on PukWaC]
Table 1 confirms that results obtained on one resource can be reproduced on another. This indicates that triples indeed capture context-invariant, and hence, semantic, characteristics of the relation between events. The data also indicates that reproducibility increases with the size of corpora from which a BKB is built.
3.2 Interpretability
Any theory of discourse relations would predict that relation words with similar function should have similar distributions, whereas one would expect different distributions for functionally unrelated relation words. These expectations are tested here for three of the most frequent relation words found in the corpora, i.e., but, then and however. But and however can be grouped together under a generalized notion of contrast (Knott and Dale 1994; Prasad et al. 2008); then, on the other hand, indicates a temporal and/or causal relation.
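The agreement measure defined for the reproducibility comparison (same correlation sign for a triple found in both resources) can be sketched as below. The representation of a pruned BKB as a dictionary from triples to a sign of +1 or -1, the function name `sign_agreement`, and the toy entries are assumptions made for illustration only.

```python
def sign_agreement(bkb_a, bkb_b):
    """Compare two pruned BKBs mapping triples to a correlation sign (+1/-1).

    Returns (number of shared triples, share of them with identical sign).
    The share is None when the two resources have no triple in common.
    """
    shared = bkb_a.keys() & bkb_b.keys()
    if not shared:
        return 0, None
    agreeing = sum(1 for triple in shared if bkb_a[triple] == bkb_b[triple])
    return len(shared), agreeing / len(shared)

# Hypothetical toy entries, not data from the paper
corpus_a = {("cancel", "then", "rain"): 1,
            ("complain", "but", "cancel"): 1,
            ("sleep", "however", "eat"): -1}
corpus_b = {("cancel", "then", "rain"): 1,
            ("complain", "but", "cancel"): -1,
            ("run", "then", "walk"): 1}
n_shared, agreement = sign_agreement(corpus_a, corpus_b)
```

The same comparison applies to two relation words within one corpus if the relation word is dropped from the key, which is how a but/however/then comparison could be set up.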
Table 2 confirms the expectation that event pairs that are correlated with but tend to show the same correlation with however, but not with then.
4 Discussion and Outlook
This paper described a novel approach towards the unsupervised acquisition of discourse relations, with encouraging preliminary results: Large collections of parsed text are used to assess distributional profiles of relation words that indicate discourse relations that are possible between specific types of events; on this basis, a background knowledge base (BKB) was created that can be used to predict an appropriate discourse marker to connect two utterances with no overt relation word. This information can be used, for example, to facilitate the semi-automated annotation of discourse relations, by pointing out the ‘default’ relation word for a given pair of events. Similarly, Zhou et al. (2010) used a language model to predict discourse markers for implicitly realized discourse relations. As opposed to this shallow, n-gram-based approach, here, the internal structure of utterances is exploited: based on semantic considerations, syntactic patterns have been devised that extract triples of event pairs and relation words. The resulting BKB provides a distributional approximation of the discourse relations that can hold between two specific event types. Both approaches exploit complementary sources of knowledge, and may be combined with each other to achieve a more precise prediction of implicit discourse connectives. The validity of the approach was evaluated with respect to three evaluation criteria: The extracted associations between relation words and event pairs could be shown to be statistically significant, and to be reproducible on other corpora; for three highly frequent relation words, theoretical predictions about their relative distribution could be confirmed, indicating their interpretability in terms of presupposed taxonomies of discourse relations.
Another prospective field of application can be seen in NLP applications, where selection preferences for relation words may serve as a cheap replacement for full-fledged discourse parsing. In the Natural Language Understanding domain, the BKB may help to disambiguate or to identify discourse relations between different events; in the context of Machine Translation, it may represent a factor guiding the insertion of relation words, a task that has been found to be problematic for languages that differ in their inventory and usage of discourse markers, e.g., German and English (Stede and Schmitz 2000). The approach is language-independent (except for the syntactic extraction patterns), and it does not require manually annotated data. It would thus be easy to create background knowledge bases with relation words for other languages or specific domains given a sufficient amount of textual data. Related research includes, for example, the unsupervised recognition of causal and temporal relationships, as required, for example, for the recognition of textual entailment. Riaz and Girju (2010) exploit distributional information about pairs of utterances. Unlike the approach described here, they are not restricted to adjacent utterances, and do not rely on explicit and recurrent relation words. Their approach can thus be applied to comparably small data sets. However, they are restricted to a specific type of relations whereas here the entire bandwidth of discourse relations that are explicitly realized in a language is covered. Prospectively, both approaches could be combined to compensate for their respective weaknesses. Similar observations can be made with respect to Chambers and Jurafsky (2009) and Kasch and Oates (2010), who also study a single discourse relation (narration), and are thus more limited in scope than the approach described here.
However, as their approach extends beyond pairs of events to complex event chains, it seems that both approaches provide complementary types of information and their results could also be combined in a fruitful way to achieve a more detailed assessment of discourse relations. The goal of this paper was to evaluate the methodological validity of the approach. It thus represents the basis for further experiments, e.g., with respect to the enrichment of the BKB with information provided by Riaz and Girju (2010), Chambers and Jurafsky (2009) and Kasch and Oates (2010). Other directions of subsequent research may include addressing more elaborate models of events, and the investigation of the relationship between relation words and taxonomies of discourse relations.
Acknowledgments
This work was supported by a fellowship within the Postdoc program of the German Academic Exchange Service (DAAD). Initial experiments were conducted at the Collaborative Research Center (SFB) 632 “Information Structure” at the University of Potsdam, Germany. I would also like to thank three anonymous reviewers for valuable comments and feedback, as well as Manfred Stede and Ed Hovy whose work on discourse relations on the one hand and proposition stores on the other hand have been the main inspiration for this paper.
References
M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta. The wacky wide web: a collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation, 43(3):209–226, 2009.
N. Chambers and D. Jurafsky. Unsupervised learning of narrative schemas and their participants. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2, pages 602–610. Association for Computational Linguistics, 2009.
A. Ferraresi, E. Zanchetta, M. Baroni, and S. Bernardini.
Introducing and evaluating ukWaC, a very large web-derived corpus of English. In Proceedings of the 4th Web as Corpus Workshop (WAC-4): Can we beat Google?, pages 47–54, 2008.
Morton Ann Gernsbacher, Rachel R. W. Robertson, Paola Palladino, and Necia K. Werner. Managing mental representations during narrative comprehension. Discourse Processes, 37(2):145–164, 2004.
N. Kasch and T. Oates. Mining script-like structures from the web. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, pages 34–42. Association for Computational Linguistics, 2010.
A. Knott and R. Dale. Using linguistic phenomena to motivate a set of coherence relations. Discourse Processes, 18(1):35–62, 1994.
J. van Kuppevelt and R. Smith, editors. Current Directions in Discourse and Dialogue. Kluwer, Dordrecht, 2003.
William C. Mann and Sandra A. Thompson. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243–281, 1988.
J. Nivre, J. Hall, and J. Nilsson. MaltParser: A data-driven parser-generator for dependency parsing. In Proc. of LREC, pages 2216–2219. Citeseer, 2006.
R. Prasad, N. Dinesh, A. Lee, E. Miltsakaki, L. Robaldo, A. Joshi, and B. Webber. The Penn Discourse Treebank 2.0. In Proc. 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, 2008.
M. Riaz and R. Girju. Another look at causality: Discovering scenario-specific contingency relationships with no supervision. In Semantic Computing (ICSC), 2010 IEEE Fourth International Conference on, pages 361–368. IEEE, 2010.
M. Stede and B. Schmitz. Discourse particles and discourse functions. Machine Translation, 15(1):125–147, 2000.
Enric Vallduví. The Informational Component. Garland, New York, 1992.
Bonnie L. Webber. Structure and ostension in the interpretation of discourse deixis. Natural Language and Cognitive Processes, 2(6):107–135, 1991.
Bonnie L. Webber, Matthew Stone, Aravind K.
Joshi, and Alistair Knott. Anaphora and discourse structure. Computational Linguistics, 29(4):545–587, 2003.
Z.-M. Zhou, Y. Xu, Z.-Y. Niu, M. Lan, J. Su, and C.L. Tan. Predicting discourse connectives for implicit discourse relation recognition. In COLING 2010, pages 1507–1514, Beijing, China, August 2010.
4 0.67439353 78 acl-2012-Efficient Search for Transformation-based Inference
Author: Asher Stern ; Roni Stern ; Ido Dagan ; Ariel Felner
Abstract: This paper addresses the search problem in textual inference, where systems need to infer one piece of text from another. A prominent approach to this task attempts to transform one text into the other through a sequence of inference-preserving transformations, a.k.a. a proof, while estimating the proof’s validity. This raises a search challenge of finding the best possible proof. We explore this challenge through a comprehensive investigation of prominent search algorithms and propose two novel algorithmic components specifically designed for textual inference: a gradient-style evaluation function, and a local-lookahead node expansion method. Evaluations, using the open-source system BIUTEE, show the contribution of these ideas to search efficiency and proof quality.
5 0.64873546 205 acl-2012-Tweet Recommendation with Graph Co-Ranking
Author: Rui Yan ; Mirella Lapata ; Xiaoming Li
Abstract: Twitter enables users to send and read text-based posts of up to 140 characters, known as tweets. As one of the most popular micro-blogging services, Twitter attracts millions of users, producing millions of tweets daily. Shared information through this service spreads faster than would have been possible with traditional sources; however, the proliferation of user-generated content poses challenges to browsing and finding valuable information. In this paper we propose a graph-theoretic model for tweet recommendation that presents users with items they may have an interest in. Our model ranks tweets and their authors simultaneously using several networks: the social network connecting the users, the network connecting the tweets, and a third network that ties the two together. Tweet and author entities are ranked following a co-ranking algorithm based on the intuition that there is a mutually reinforcing relationship between tweets and their authors that could be reflected in the rankings. We show that this framework can be parametrized to take into account user preferences, the popularity of tweets and their authors, and diversity. Experimental evaluation on a large dataset shows that our model outperforms competitive approaches by a large margin.
6 0.64668393 154 acl-2012-Native Language Detection with Tree Substitution Grammars
7 0.64452982 86 acl-2012-Exploiting Latent Information to Predict Diffusions of Novel Topics on Social Networks
8 0.63416952 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
9 0.59324914 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars
10 0.58411783 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
11 0.57997096 31 acl-2012-Authorship Attribution with Author-aware Topic Models
12 0.57144731 38 acl-2012-Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing
13 0.56046402 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
14 0.55141765 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
15 0.54192132 167 acl-2012-QuickView: NLP-based Tweet Search
16 0.53575689 139 acl-2012-MIX Is Not a Tree-Adjoining Language
17 0.53158218 22 acl-2012-A Topic Similarity Model for Hierarchical Phrase-based Translation
18 0.53117114 98 acl-2012-Finding Bursty Topics from Microblogs
20 0.52413237 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale