emnlp emnlp2013 emnlp2013-76 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jun-Ping Ng ; Min-Yen Kan ; Ziheng Lin ; Wei Feng ; Bin Chen ; Jian Su ; Chew Lim Tan
Abstract: In this paper we classify the temporal relations between pairs of events on an article-wide basis. This is in contrast to much of the existing literature which focuses on just event pairs which are found within the same or adjacent sentences. To achieve this, we leverage on discourse analysis as we believe that it provides more useful semantic information than typical lexico-syntactic features. We propose the use of several discourse analysis frameworks, including 1) Rhetorical Structure Theory (RST), 2) PDTB-styled discourse relations, and 3) topical text segmentation. We explain how features derived from these frameworks can be effectively used with support vector machines (SVM) paired with convolution kernels. Experiments show that our proposal is effective in improving on the state-of-the-art significantly by as much as 16% in terms of F1, even if we only adopt less-than-perfect automatic discourse analyzers and parsers. Making use of more accurate discourse analysis can further boost gains to 35%.
Reference: text
sentIndex sentText sentNum sentScore
1 In this paper we classify the temporal relations between pairs of events on an article-wide basis. [sent-4, score-0.705]
2 To achieve this, we leverage on discourse analysis as we believe that it provides more useful semantic information than typical lexico-syntactic features. [sent-6, score-0.509]
3 We propose the use of several discourse analysis frameworks, including 1) Rhetorical Structure Theory (RST), 2) PDTB-styled discourse relations, and 3) topical text segmentation. [sent-7, score-0.994]
4 1 Introduction A good amount of research has been invested in understanding temporal relationships within text. [sent-11, score-0.54]
5 Particular areas of interest include determining the relationship between an event mention and a time expression (timex), as well as determining the relationship between two event mentions. [sent-12, score-0.47]
6 The latter, which we refer to as event-event (E-E) temporal classification, is the focus of this work. [sent-13, score-0.495]
7 Being able to infer these temporal relationships allows us to build up a better understanding of the text in question, and can aid several natural language understanding tasks such as information extraction and text summarization. [sent-19, score-0.587]
8 For example, we can build up a temporal characterization of an article by constructing a temporal graph denoting the relationships between all events within an article (Verhagen et al. [sent-20, score-1.237]
9 The temporal graph can also be used in text summarization, where temporal order can be used to improve sentence ordering and thereby the eventual generated summary (Barzilay et al. [sent-24, score-0.987]
10 Given the importance and value of temporal relations, the community has organized shared tasks (footnote 1: From article AFP ENG 20030304.) [sent-26, score-0.51]
11 In the task definitions, E-E temporal classification involves determining the relationship between events found within the same sentence, or in adjacent sentences. [sent-35, score-0.732]
12 For brevity we will refer to this loosely as intra-sentence E-E temporal classification in the rest of this paper. [sent-36, score-0.495]
13 In particular, one deficiency is that it does not allow us to construct the complete temporal graph we seek. [sent-40, score-0.486]
14 As illustrated in Figure 1, being able to perform only intra-sentence E-E temporal classification may result in a forest of disconnected temporal graphs. [sent-41, score-0.957]
15 A sentence s3 separates events C and D; as such, an intra-sentence E-E classification system will not be able to determine the temporal relationship between them. [sent-42, score-0.675]
16 While we can determine the relationship between A and C in the figure with the use of temporal transitivity rules (Setzer et al. [sent-43, score-0.527]
17 Figure 1: A disconnected temporal graph of events within an article. [sent-45, score-0.632]
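The limitation behind Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: the BEFORE labels attached to events A to E are hypothetical, and `before_closure` is a name introduced here. It applies a single transitivity rule (if X is BEFORE Y and Y is BEFORE Z, then X is BEFORE Z) and shows that the rule recovers the A-C relation but can never link events that lie in disconnected parts of the graph.

```python
from itertools import product


def before_closure(pairs):
    """Return the transitive closure of a set of (earlier, later) BEFORE pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# Hypothetical intra-sentence labels: A, B, C come from adjacent sentences,
# D and E from a later sentence, with nothing linking the two groups.
intra_sentence = {("A", "B"), ("B", "C"), ("D", "E")}
closure = before_closure(intra_sentence)

print(("A", "C") in closure)  # inferred through B
print(("C", "D") in closure)  # no chain of known relations connects C and D
```

Under these assumptions the first check succeeds and the second fails, which is the gap that article-wide classification is meant to close.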
18 In this work, we seek to overcome this limitation, and study what can enable effective article-wide E-E temporal classification. [sent-47, score-0.462]
19 That is, we want to be able to determine the temporal relationship between two events located anywhere within an article. [sent-48, score-0.673]
20 We suggest making use of semantically motivated features derived from discourse analysis instead, and show that these discourse features are superior. [sent-50, score-0.946]
21 While we are just focusing on E-E temporal classification, our work can complement other approaches such as the joint inference approach proposed by Do et al. [sent-51, score-0.462]
22 (2009) which builds on top of event-timex (E-T) and E-E temporal classification systems. [sent-53, score-0.495]
23 2 Related Work Many researchers have worked on the E-E temporal classification problem, especially as part of the TempEval series of evaluation workshops. [sent-55, score-0.495]
24 By leveraging on the transitivity properties of temporal relationships (Setzer et al. [sent-61, score-0.509]
25 , 2003), they found that MLNs are useful in inferring new temporal relationships from known ones. [sent-62, score-0.535]
26 Recognizing that the temporal relationships between event pairs and time expressions are related, Yoshikawa et al. [sent-63, score-0.653]
27 To the best of our knowledge, the only piece of work to have gone beyond sentence boundaries and tackled the problem of article-wide E-E temporal classification is by Do et al. [sent-66, score-0.518]
28 Making use of integer linear programming (ILP), they built a joint inference model which is capable of classifying temporal relationships between any event pair within a given document. [sent-68, score-0.684]
29 They also showed that event co-reference information can be useful in determining these temporal relationships. [sent-69, score-0.658]
30 However they did not make use of features directed specifically at determining the temporal relationships of event pairs across different sentences. [sent-70, score-0.679]
31 Underlying these disparate data-driven methods for similar temporal processing tasks, the reviewed works all adopted a similar set of surface features including vocabulary features, part-of-speech tags, constituent grammar parses, governing grammar nodes and verb tenses, among others. [sent-72, score-0.52]
32 We argue that these features are not sufficiently discriminative of temporal relationships because they do not explain how sentences are combined together, and thus are unable to properly differentiate between the different temporal classifications. [sent-73, score-0.971]
33 Webster and WordNet) as well as discourse relations help temporal classification. [sent-77, score-1.047]
34 We expand on their study to assess the utility of adopting additional discourse frameworks as alternative and complementary views. [sent-80, score-0.549]
35 Given only syntactic features, we may be drawn to conclude that they share similar temporal relationships. [sent-86, score-0.462]
36 Clearly, syntax alone is not going to be useful to help us arrive at the correct temporal relations. [sent-88, score-0.488]
37 Given a E-E pair which crosses sentence boundaries, how can we determine the temporal relationship between them? [sent-90, score-0.527]
38 They suggested instead that discourse relations hold the key to interpreting such temporal relationships. [sent-92, score-1.047]
39 Building on their observations, we believe that discourse analysis is integral to any solution for the problem of article-wide E-E temporal classification. [sent-93, score-0.945]
40 In RST, a piece of text is split into a sequence of non-overlapping text fragments known as elementary discourse units (EDUs). [sent-97, score-0.646]
41 A discourse tree can be composed by viewing each EDU as a leaf node. [sent-102, score-0.502]
42 Nodes in the discourse tree are linked to one another via the discourse relations that hold between the EDUs. [sent-103, score-1.087]
43 RST discourse relations capture the semantic relation between two EDUs, and these often offer a clue to the temporal relationship between events in the two EDUs too. [sent-104, score-1.227]
44 The RST discourse structure for the second line of text is shown on the left of Figure 2. [sent-107, score-0.521]
45 The RST discourse relation in this case is very useful in helping us determine the relationship between the two events. [sent-110, score-0.548]
46 Another widely adopted discourse relation annotation is the PDTB framework (Prasad et al. [sent-112, score-0.487]
47 Figure 2: RST and PDTB discourse structures for the second line of text in Example 2. [sent-118, score-0.521]
48 The structure on the left is the RST discourse structure, while the structure on the right is for PDTB. [sent-119, score-0.507]
49 framework, the discourse relations in PDTB build on the work on D-LTAG by Webber (2004), a lexicon-grounded approach to discourse analysis. [sent-120, score-1.042]
50 At this point we want to note the differences between the use of the RST framework and PDTB-styled discourse relations in the context of our work. [sent-125, score-0.585]
51 The theoretical underpinnings behind these two discourse analysis frameworks are very different, and we believe that they can be complementary to each other. [sent-126, score-0.483]
52 However this constraint is not found in PDTB-styled relations, where a text fragment can participate in one discourse relation, and a subsequence of it participate in another. [sent-129, score-0.555]
53 Second, with PDTB-styled relations not every sentence needs to be in a relation with another as the PDTB framework does not aim to build a global discourse tree that covers all sentence pairs. [sent-132, score-0.63]
54 The RST framework does not suffer from this limitation, however, as we can build up a discourse tree connecting all the text within a given article. [sent-134, score-0.572]
55 The transition between segments can represent possible topic shifts which can provide useful information about temporal relationships. [sent-139, score-0.564]
56 The segment boundary signals to us a possible temporal shift and can help us to infer that the bombing event took place BEFORE the deaths and injuries had occurred. [sent-142, score-0.73]
57 while another explosion hit a bus terminal at the 4 Methodology Having motivated the use of discourse analysis for our problem, we now proceed to explain how we can make use of it for temporal classification. [sent-147, score-0.919]
58 and PDTB discourse relations are commonly represented as graphs, and we can also view the output of text segmentation as a graph with individual text segments forming vertices, and the transitions between them forming edges. [sent-153, score-0.818]
59 Convolution kernels had also previously been shown to work well for the related problem of E-T temporal classification (Ng and Kan, 2012), where the features adopted are similarly structural in nature. [sent-157, score-0.551]
60 We now describe our use of the discourse analysis frameworks to generate appropriate representations for input to the convolution kernel. [sent-158, score-0.608]
61 Recall that the RST framework provides us with a discourse tree for an entire input article. [sent-160, score-0.502]
62 In our work, we first make use of the parser by Feng and Hirst (2012) to obtain a discourse tree representation of our input. [sent-162, score-0.502]
63 We illustrate this procedure using the example discourse tree illustrated in Figure 3. [sent-164, score-0.502]
64 EDUs including EDU1 to EDU3 form the vertices while discourse relations r1 and r2 between the EDUs form the edges. [sent-165, score-0.609]
65 We trace the short- Figure 4: A possible PDTB-styled discourse annotation where the circles represent events we are interested in. [sent-168, score-0.599]
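One way to picture a path through an RST discourse tree is sketched below in Python. It rests on several assumptions that are not in the source: the tree shape is hypothetical (the layout of Figure 3 is not recoverable here), relations are modelled as internal nodes rather than edge labels for simplicity, and `tree_path` is a helper name introduced for illustration. The sketch walks up from one EDU to the lowest common ancestor and down to the other EDU.

```python
def tree_path(parent, a, b):
    """Return the nodes on the unique path between nodes a and b in a tree.

    parent maps each non-root node to its parent node.
    """
    def ancestors(node):
        chain = [node]
        while node in parent:
            node = parent[node]
            chain.append(node)
        return chain

    up_a, up_b = ancestors(a), ancestors(b)
    seen_b = set(up_b)
    # First ancestor of a that is also an ancestor of b.
    lca = next(n for n in up_a if n in seen_b)
    path_up = up_a[:up_a.index(lca) + 1]
    path_down = up_b[:up_b.index(lca)]
    return path_up + path_down[::-1]


# Hypothetical tree: r1 spans EDU1 and the subtree r2; r2 spans EDU2 and EDU3.
parent = {"EDU1": "r1", "r2": "r1", "EDU2": "r2", "EDU3": "r2"}

print(tree_path(parent, "EDU1", "EDU3"))
print(tree_path(parent, "EDU2", "EDU3"))
```

The resulting node sequence is the kind of structure that could be passed to a tree or sequence kernel; how the authors actually encode it is not shown in this excerpt.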
66 (2013) to obtain the discourse relations over an input article. [sent-174, score-0.585]
67 Similar to how we work with the RST discourse framework, for a given E-E pair, we retrieve the relevant text fragments and use the shortest path linking the two events as a feature structure for our convolution kernel classifier. [sent-175, score-0.906]
68 The parentheses delimit text fragments, t1 to t4, which have been identified as arguments participating in discourse relations, r1 to r3. [sent-178, score-0.584]
69 We model each PDTB discourse annotation as a graph and employ Dijkstra’s shortest path algorithm. [sent-183, score-0.547]
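The graph view of a PDTB-styled annotation can be sketched as follows. This is a minimal illustration under stated assumptions: the text fragments t1 to t4 and relations r1 to r3 are named in the source, but which fragments each relation connects is hypothetical here, every edge is given unit weight, and `shortest_relation_path` is a name introduced for this example. The function runs Dijkstra's algorithm with a binary heap and returns the relation labels along a shortest path.

```python
import heapq


def shortest_relation_path(edges, source, target):
    """Dijkstra over an undirected graph of text fragments.

    edges: iterable of (fragment_u, fragment_v, relation_label, weight),
           with non-negative weights.
    Returns the relation labels on a shortest path, or None if unreachable.
    """
    graph = {}
    for u, v, label, weight in edges:
        graph.setdefault(u, []).append((v, label, weight))
        graph.setdefault(v, []).append((u, label, weight))

    queue = [(0, source, [])]
    settled = set()
    while queue:
        dist, node, labels = heapq.heappop(queue)
        if node == target:
            return labels
        if node in settled:
            continue
        settled.add(node)
        for neighbour, label, weight in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (dist + weight, neighbour, labels + [label]))
    return None


# Hypothetical connectivity: event A sits in t1 and event B in t4.
edges = [
    ("t1", "t2", "r1", 1),
    ("t3", "t4", "r2", 1),
    ("t2", "t3", "r3", 1),
]
path = shortest_relation_path(edges, "t1", "t4")
print(path)
```

With this connectivity the path from t1 to t4 passes through r1, r3 and r2 in that order; a fragment that takes part in no relation would yield `None`, which mirrors the source's remark that not every sentence participates in a PDTB-styled relation.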
70 We perform the same temporal saturation step as described in Do et al. [sent-208, score-0.493]
71 A breakdown of the number of instances for each temporal class is shown in Table 1. [sent-210, score-0.519]
72 We believe this to be an enhancement as it ensures that all inferred temporal relationships are generated. [sent-214, score-0.535]
73 Figure 7 shows the number of instances for each temporal class broken down by the number of sentences (i. [sent-217, score-0.494]
74 Class | AFTER | BEFORE | OVERLAP
# E-E pairs | 3,588 (45%) | 3,589 (45%) | 815 (10%)
Table 1: Number of E-E pairs in data set attributable to each temporal class. [sent-222, score-0.462]
75 Figure 7: Breakdown of number of E-E pairs for each temporal class based on sentence gap. [sent-224, score-0.462]
76 This baseline system, and the subsequent systems we will describe, comprise three separate one-vs-all classifiers, one for each of the temporal classes. [sent-241, score-0.462]
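How three one-vs-all classifiers might be combined into a single decision can be sketched as below. The combination rule is an assumption, since this excerpt does not state it: the sketch picks the class with the highest signed decision value and leaves the pair unassigned when no classifier returns a positive value. The function name and the example scores are hypothetical.

```python
def combine_one_vs_all(scores):
    """Pick a temporal class from per-class decision values.

    scores: dict mapping a class name to the signed decision value of its
            one-vs-all classifier (positive means "belongs to this class").
    Returns the winning class, or None if no classifier claims the pair.
    """
    label, best = max(scores.items(), key=lambda item: item[1])
    return label if best > 0 else None


# Hypothetical decision values for two E-E pairs.
print(combine_one_vs_all({"BEFORE": 0.8, "AFTER": -0.3, "OVERLAP": 0.1}))
print(combine_one_vs_all({"BEFORE": -0.4, "AFTER": -0.2, "OVERLAP": -0.9}))
```

Allowing a `None` outcome is one way a system could end up with event pairs that are not assigned to any temporal class, which lowers recall.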
77 In Row 3, RST denotes the RST discourse feature, PDTB denotes the PDTB-styled discourse features, and TOPICSEG denotes the text segmentation feature. [sent-249, score-1.008]
78 To get a better idea of the performance we can obtain if oracular versions of our features are available, we also show the results obtained if hand-annotated RST discourse structures, text segments, as well as event co-reference information were used. [sent-260, score-0.694]
79 Annotations for the RST discourse structures and text segments were performed by the first author (RST annotations were made following the annotation guidelines given by Carlson and Marcu (2001)). [sent-261, score-0.597]
80 These oracular results further confirm the importance of non-local discourse analysis for temporal processing. [sent-266, score-0.973]
81 We performed ablation tests to assess the efficacy of the discourse features used in our earlier experiments. [sent-268, score-0.504]
82 From the ablation tests, we also observe that the RST discourse feature contributes the most to overall system performance while the PDTB discourse feature contributes the least. [sent-272, score-0.961]
83 However, we should not conclude prematurely that the former is more useful than the latter, as the results are obtained using parses from automatic systems, and are not reflective of the full utility of ground truth discourse annotations. [sent-273, score-0.483]
84 The ablation test results showed us that discourse relations (in particular RST dis- Figure 8: Proportion of occurrence in temporal classes for every RST and PDTB relation. [sent-275, score-1.119]
85 We have also motivated our work earlier with the intuition that certain relations such as the RST “Result” and the PDTB “Cause” relations provide very useful temporal cues. [sent-282, score-0.744]
86 We now offer an introspection into the use of these discourse relations. [sent-283, score-0.488]
87 Figure 8 illustrates the relative proportion of temporal classes in which each RST and PDTB relation appears. [sent-284, score-0.487]
88 If the relations are randomly distributed, we should expect their distribution to follow that of the temporal classes as shown in Table 1. [sent-285, score-0.615]
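The comparison against Table 1 can be made concrete with a short Python sketch. The class counts are taken from Table 1; the per-relation counts are hypothetical and only illustrate the calculation, and the function names are introduced here. The idea is that a relation whose class proportions sit close to the Table 1 prior carries little temporal signal, while a large deviation suggests the relation helps separate the classes.

```python
# Class counts from Table 1 of the source.
TABLE_1_COUNTS = {"AFTER": 3588, "BEFORE": 3589, "OVERLAP": 815}


def proportions(counts):
    """Normalize raw counts into proportions that sum to 1."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def deviation_from_prior(relation_counts, prior_counts=TABLE_1_COUNTS):
    """Per-class difference between a relation's distribution and the prior."""
    prior = proportions(prior_counts)
    observed = proportions(relation_counts)
    return {label: observed.get(label, 0.0) - prior[label] for label in prior}


# Hypothetical counts for one discourse relation (not from the paper).
hypothetical_relation = {"AFTER": 10, "BEFORE": 70, "OVERLAP": 20}
skew = deviation_from_prior(hypothetical_relation)

for label, delta in skew.items():
    print(f"{label}: {delta:+.3f}")
```

In this made-up case the relation is strongly skewed towards BEFORE relative to the prior, which is the kind of pattern Figure 8 is described as revealing.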
89 These relations are likely useful in disambiguating between the different temporal classes. [sent-288, score-0.616]
90 Table 4: Subset of top RST discourse fragments on support vectors identified by linearizing kernel function. [sent-299, score-0.624]
91 Table 4 shows a subset of the top RST discourse fragments identified for the BEFORE and OVERLAP one-vs-all classifiers. [sent-300, score-0.569]
92 Its corresponding discourse structure is illustrated in the top half of Figure 9. [sent-310, score-0.506]
93 ”, and its discourse structure is shown in the bottom half of Figure 9. [sent-317, score-0.506]
94 Figure 9: RST discourse structures for sentences A (top half) and B (bottom half) in Example 4. [sent-327, score-0.482]
95 From the ablation test results, text segmentation is the next most important feature after the RST discourse feature. [sent-329, score-0.598]
96 The high number of event pairs which are not assigned to any temporal class explains the lower recall scores obtained by our system, as observed in Table 2. [sent-348, score-0.606]
97 Figure 10: Accuracy of the classifier for each temporal class, plotted against the sentence gap of each E-E pair. [sent-362, score-0.489]
98 7 Conclusion We believe that discourse features play an important role in the temporal ordering of events in text. [sent-363, score-1.06]
99 We have proposed the use of different discourse analysis frameworks and shown that they are effective for classifying the temporal relationships of article-wide E-E pairs. [sent-364, score-1.025]
100 In future work, we would like to explore how to better exploit the various discourse analysis frameworks for temporal classification. [sent-367, score-0.978]
wordName wordTfidf (topN-words)
[('rst', 0.464), ('temporal', 0.462), ('discourse', 0.457), ('pdtb', 0.24), ('event', 0.144), ('relations', 0.128), ('events', 0.115), ('edus', 0.114), ('verhagen', 0.103), ('convolution', 0.092), ('fragments', 0.088), ('overlap', 0.079), ('segments', 0.076), ('relationship', 0.065), ('segment', 0.062), ('fragment', 0.059), ('frameworks', 0.059), ('singapore', 0.055), ('segmentation', 0.055), ('uzzaman', 0.054), ('oracular', 0.054), ('row', 0.05), ('ace', 0.049), ('article', 0.048), ('relationships', 0.047), ('ablation', 0.047), ('airport', 0.046), ('kazantseva', 0.046), ('switched', 0.046), ('wounded', 0.046), ('said', 0.046), ('tree', 0.045), ('topical', 0.041), ('yoshikawa', 0.041), ('setzer', 0.04), ('killed', 0.04), ('semeval', 0.04), ('text', 0.039), ('marc', 0.038), ('bethard', 0.037), ('participating', 0.037), ('shortest', 0.035), ('lascarides', 0.034), ('classification', 0.033), ('adopting', 0.033), ('making', 0.032), ('instances', 0.032), ('james', 0.031), ('path', 0.031), ('davao', 0.031), ('deaths', 0.031), ('hypotactic', 0.031), ('injuries', 0.031), ('introspection', 0.031), ('kolya', 0.031), ('linearizing', 0.031), ('milosevic', 0.031), ('quang', 0.031), ('saturation', 0.031), ('skorochod', 0.031), ('wielded', 0.031), ('rhetorical', 0.031), ('happened', 0.031), ('within', 0.031), ('adopted', 0.03), ('prasad', 0.029), ('horizontal', 0.028), ('surface', 0.028), ('medical', 0.027), ('parentheses', 0.027), ('circles', 0.027), ('gap', 0.027), ('zoning', 0.027), ('szpakowicz', 0.027), ('naushad', 0.027), ('pighin', 0.027), ('vapnik', 0.027), ('swept', 0.027), ('xuan', 0.027), ('dijkstra', 0.027), ('welch', 0.027), ('useful', 0.026), ('determining', 0.026), ('believe', 0.026), ('kernels', 0.026), ('structure', 0.025), ('structures', 0.025), ('classes', 0.025), ('tempeval', 0.025), ('eng', 0.025), ('verbocean', 0.025), ('afp', 0.025), ('pitch', 0.025), ('identified', 0.024), ('kernel', 0.024), ('kan', 0.024), ('vertices', 0.024), ('graph', 0.024), ('half', 
0.024), ('piece', 0.023)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000011 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification
2 0.26927918 41 emnlp-2013-Building Event Threads out of Multiple News Articles
Author: Xavier Tannier ; Veronique Moriceau
Abstract: We present an approach for building multi-document event threads from a large corpus of newswire articles. An event thread is basically a succession of events belonging to the same story. It helps the reader to contextualize the information contained in a single article, by navigating backward or forward in the thread from this article. A specific effort is also made on the detection of reactions to a particular event. In order to build these event threads, we use a cascade of classifiers and other modules, taking advantage of the redundancy of information in the newswire corpus. We also share interesting comments concerning our manual annotation procedure for building a training and testing set.
3 0.23241854 118 emnlp-2013-Learning Biological Processes with Global Constraints
Author: Aju Thalappillil Scaria ; Jonathan Berant ; Mengqiu Wang ; Peter Clark ; Justin Lewis ; Brittany Harding ; Christopher D. Manning
Abstract: Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How?” and “Why?” questions. In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. We represent processes by graphs whose edges describe a set of temporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e.g., the graphs are connected). Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint inference over the set of extracted relations. On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement comparing to baselines that disregard process structure.
4 0.2130478 74 emnlp-2013-Event-Based Time Label Propagation for Automatic Dating of News Articles
Author: Tao Ge ; Baobao Chang ; Sujian Li ; Zhifang Sui
Abstract: Since many applications such as timeline summaries and temporal IR involving temporal analysis rely on document timestamps, the task of automatic dating of documents has been increasingly important. Instead of using feature-based methods as conventional models, our method attempts to date documents at the year level by exploiting relative temporal relations between documents and events, which are very effective for dating documents. Based on this intuition, we proposed an event-based time label propagation model called confidence boosting in which time label information can be propagated between documents and events on a bipartite graph. The experiments show that our event-based propagation model can predict document timestamps with high accuracy and the model combined with a MaxEnt classifier outperforms the state-of-the-art method for this task especially when the size of the training set is small.
5 0.19488087 93 emnlp-2013-Harvesting Parallel News Streams to Generate Paraphrases of Event Relations
Author: Congle Zhang ; Daniel S. Weld
Abstract: The distributional hypothesis, which states that words that occur in similar contexts tend to have similar meanings, has inspired several Web mining algorithms for paraphrasing semantically equivalent phrases. Unfortunately, these methods have several drawbacks, such as confusing synonyms with antonyms and causes with effects. This paper introduces three Temporal Correspondence Heuristics, that characterize regularities in parallel news streams, and shows how they may be used to generate high precision paraphrases for event relations. We encode the heuristics in a probabilistic graphical model to create the NEWSSPIKE algorithm for mining news streams. We present experiments demonstrating that NEWSSPIKE significantly outperforms several competitive baselines. In order to spur further research, we provide a large annotated corpus of timestamped news articles as well as the paraphrases produced by NEWSSPIKE.
6 0.1893838 174 emnlp-2013-Single-Document Summarization as a Tree Knapsack Problem
7 0.18088681 192 emnlp-2013-Unsupervised Induction of Contingent Event Pairs from Film Scenes
8 0.17078359 152 emnlp-2013-Predicting the Presence of Discourse Connectives
9 0.1385375 106 emnlp-2013-Inducing Document Plans for Concept-to-Text Generation
10 0.13389772 16 emnlp-2013-A Unified Model for Topics, Events and Users on Twitter
11 0.11862718 63 emnlp-2013-Discourse Level Explanatory Relation Extraction from Product Reviews Using First-Order Logic
12 0.11073832 147 emnlp-2013-Optimized Event Storyline Generation based on Mixture-Event-Aspect Model
13 0.0832 124 emnlp-2013-Leveraging Lexical Cohesion and Disruption for Topic Segmentation
14 0.080303915 18 emnlp-2013-A temporal model of text periodicities using Gaussian Processes
15 0.067252062 17 emnlp-2013-A Walk-Based Semantically Enriched Tree Kernel Over Distributed Word Representations
16 0.066769585 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction
17 0.062522799 67 emnlp-2013-Easy Victories and Uphill Battles in Coreference Resolution
19 0.062114313 75 emnlp-2013-Event Schema Induction with a Probabilistic Entity-Driven Model
20 0.058298938 160 emnlp-2013-Relational Inference for Wikification
topicId topicWeight
[(0, -0.218), (1, 0.18), (2, -0.033), (3, 0.311), (4, -0.028), (5, -0.19), (6, -0.168), (7, -0.031), (8, -0.159), (9, 0.09), (10, 0.078), (11, 0.029), (12, 0.027), (13, 0.061), (14, 0.055), (15, 0.115), (16, -0.114), (17, 0.075), (18, 0.011), (19, 0.253), (20, 0.052), (21, -0.116), (22, -0.097), (23, -0.078), (24, 0.001), (25, 0.033), (26, 0.035), (27, -0.145), (28, 0.018), (29, -0.043), (30, -0.044), (31, -0.016), (32, 0.012), (33, 0.049), (34, -0.053), (35, -0.092), (36, -0.075), (37, -0.033), (38, 0.047), (39, 0.006), (40, 0.069), (41, 0.031), (42, -0.084), (43, 0.018), (44, -0.038), (45, 0.078), (46, -0.046), (47, 0.05), (48, -0.041), (49, 0.155)]
simIndex simValue paperId paperTitle
same-paper 1 0.97455603 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification
2 0.76421857 41 emnlp-2013-Building Event Threads out of Multiple News Articles
3 0.72097892 74 emnlp-2013-Event-Based Time Label Propagation for Automatic Dating of News Articles
4 0.64908719 118 emnlp-2013-Learning Biological Processes with Global Constraints
5 0.62047493 152 emnlp-2013-Predicting the Presence of Discourse Connectives
Author: Gary Patterson ; Andrew Kehler
Abstract: We present a classification model that predicts the presence or omission of a lexical connective between two clauses, based upon linguistic features of the clauses and the type of discourse relation holding between them. The model is trained on a set of high frequency relations extracted from the Penn Discourse Treebank and achieves an accuracy of 86.6%. Analysis of the results reveals that the most informative features relate to the discourse dependencies between sequences of coherence relations in the text. We also present results of an experiment that provides insight into the nature and difficulty of the task.
6 0.59018815 174 emnlp-2013-Single-Document Summarization as a Tree Knapsack Problem
7 0.58247304 192 emnlp-2013-Unsupervised Induction of Contingent Event Pairs from Film Scenes
8 0.57809162 93 emnlp-2013-Harvesting Parallel News Streams to Generate Paraphrases of Event Relations
9 0.56432867 106 emnlp-2013-Inducing Document Plans for Concept-to-Text Generation
10 0.44191837 63 emnlp-2013-Discourse Level Explanatory Relation Extraction from Product Reviews Using First-Order Logic
11 0.32263643 147 emnlp-2013-Optimized Event Storyline Generation based on Mixture-Event-Aspect Model
12 0.32100853 16 emnlp-2013-A Unified Model for Topics, Events and Users on Twitter
13 0.30971378 18 emnlp-2013-A temporal model of text periodicities using Gaussian Processes
14 0.30701694 68 emnlp-2013-Effectiveness and Efficiency of Open Relation Extraction
16 0.26025507 43 emnlp-2013-Cascading Collective Classification for Bridging Anaphora Recognition using a Rich Linguistic Feature Set
17 0.25276324 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge
18 0.25070748 14 emnlp-2013-A Synchronous Context Free Grammar for Time Normalization
19 0.24896424 124 emnlp-2013-Leveraging Lexical Cohesion and Disruption for Topic Segmentation
20 0.2443132 160 emnlp-2013-Relational Inference for Wikification
topicId topicWeight
[(3, 0.041), (9, 0.013), (18, 0.035), (22, 0.107), (30, 0.061), (50, 0.018), (51, 0.153), (66, 0.301), (71, 0.053), (75, 0.027), (77, 0.018), (96, 0.043)]
simIndex simValue paperId paperTitle
1 0.96821457 109 emnlp-2013-Is Twitter A Better Corpus for Measuring Sentiment Similarity?
Author: Shi Feng ; Le Zhang ; Binyang Li ; Daling Wang ; Ge Yu ; Kam-Fai Wong
Abstract: Extensive experiments have validated the effectiveness of the corpus-based method for classifying the word’s sentiment polarity. However, no work is done for comparing different corpora in the polarity classification task. Nowadays, Twitter has aggregated huge amount of data that are full of people’s sentiments. In this paper, we empirically evaluate the performance of different corpora in sentiment similarity measurement, which is the fundamental task for word polarity classification. Experiment results show that the Twitter data can achieve a much better performance than the Google, Web1T and Wikipedia based methods.
2 0.90937573 186 emnlp-2013-Translating into Morphologically Rich Languages with Synthetic Phrases
Author: Victor Chahuneau ; Eva Schlinger ; Noah A. Smith ; Chris Dyer
Abstract: Translation into morphologically rich languages is an important but recalcitrant problem in MT. We present a simple and effective approach that deals with the problem in two phases. First, a discriminative model is learned to predict inflections of target words from rich source-side annotations. Then, this model is used to create additional sentence-specific word- and phrase-level translations that are added to a standard translation model as “synthetic” phrases. Our approach relies on morphological analysis of the target language, but we show that an unsupervised Bayesian model of morphology can successfully be used in place of a supervised analyzer. We report significant improvements in translation quality when translating from English to Russian, Hebrew and Swahili.
3 0.89362985 201 emnlp-2013-What is Hidden among Translation Rules
Author: Libin Shen ; Bowen Zhou
Abstract: Most of the machine translation systems rely on a large set of translation rules. These rules are treated as discrete and independent events. In this short paper, we propose a novel method to model rules as observed generation output of a compact hidden model, which leads to better generalization capability. We present a preliminary generative model to test this idea. Experimental results show about one point improvement on TER-BLEU over a strong baseline in Chinese-to-English translation.
same-paper 4 0.8494857 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification
Author: Jun-Ping Ng ; Min-Yen Kan ; Ziheng Lin ; Wei Feng ; Bin Chen ; Jian Su ; Chew Lim Tan
Abstract: In this paper we classify the temporal relations between pairs of events on an article-wide basis. This is in contrast to much of the existing literature which focuses on just event pairs which are found within the same or adjacent sentences. To achieve this, we leverage on discourse analysis as we believe that it provides more useful semantic information than typical lexico-syntactic features. We propose the use of several discourse analysis frameworks, including 1) Rhetorical Structure Theory (RST), 2) PDTB-styled discourse relations, and 3) topical text segmentation. We explain how features derived from these frameworks can be effectively used with support vector machines (SVM) paired with convolution kernels. Experiments show that our proposal is effective in improving on the state-of-the-art significantly by as much as 16% in terms of F1, even if we only adopt less-than-perfect automatic discourse analyzers and parsers. Making use of more accurate discourse analysis can further boost gains to 35%.
5 0.71300483 143 emnlp-2013-Open Domain Targeted Sentiment
Author: Margaret Mitchell ; Jacqui Aguilar ; Theresa Wilson ; Benjamin Van Durme
Abstract: We propose a novel approach to sentiment analysis for a low resource setting. The intuition behind this work is that sentiment expressed towards an entity, targeted sentiment, may be viewed as a span of sentiment expressed across the entity. This representation allows us to model sentiment detection as a sequence tagging problem, jointly discovering people and organizations along with whether there is sentiment directed towards them. We compare performance in both Spanish and English on microblog data, using only a sentiment lexicon as an external resource. By leveraging linguistically-informed features within conditional random fields (CRFs) trained to minimize empirical risk, our best models in Spanish significantly outperform a strong baseline, and reach around 90% accuracy on the combined task of named entity recognition and sentiment prediction. Our models in English, trained on a much smaller dataset, are not yet statistically significant against their baselines.
6 0.70591325 81 emnlp-2013-Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media
7 0.68675435 47 emnlp-2013-Collective Opinion Target Extraction in Chinese Microblogs
8 0.67603922 99 emnlp-2013-Implicit Feature Detection via a Constrained Topic Model and SVM
9 0.67469019 77 emnlp-2013-Exploiting Domain Knowledge in Aspect Extraction
10 0.67434287 83 emnlp-2013-Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
11 0.67150688 30 emnlp-2013-Automatic Extraction of Morphological Lexicons from Morphologically Annotated Corpora
12 0.6592418 181 emnlp-2013-The Effects of Syntactic Features in Automatic Prediction of Morphology
13 0.65541261 8 emnlp-2013-A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability
14 0.65505821 38 emnlp-2013-Bilingual Word Embeddings for Phrase-Based Machine Translation
15 0.65343851 40 emnlp-2013-Breaking Out of Local Optima with Count Transforms and Model Recombination: A Study in Grammar Induction
16 0.6533004 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks
17 0.6518907 107 emnlp-2013-Interactive Machine Translation using Hierarchical Translation Models
18 0.64991379 114 emnlp-2013-Joint Learning and Inference for Grammatical Error Correction
19 0.64945352 138 emnlp-2013-Naive Bayes Word Sense Induction
20 0.64524448 63 emnlp-2013-Discourse Level Explanatory Relation Extraction from Product Reviews Using First-Order Logic