emnlp emnlp2013 emnlp2013-118 knowledge-graph by maker-knowledge-mining

118 emnlp-2013-Learning Biological Processes with Global Constraints


Source: pdf

Author: Aju Thalappillil Scaria ; Jonathan Berant ; Mengqiu Wang ; Peter Clark ; Justin Lewis ; Brittany Harding ; Christopher D. Manning

Abstract: Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How?” and “Why?” questions. In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. We represent processes by graphs whose edges describe a set of temporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e.g., the graphs are connected). Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint inference over the set of extracted relations. On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement compared to baselines that disregard process structure.

Reference: text


Summary: the most important sentences generated by tfidf model

sentIndex sentText sentNum sentScore

1 Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. [sent-2, score-0.568]

2 Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How? [sent-3, score-0.411]

3 In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. [sent-6, score-1.024]

4 We represent processes by graphs whose edges describe a set of temporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e. [sent-7, score-0.688]

5 Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint inference over the set of extracted relations. [sent-10, score-0.339]

6 On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement compared to baselines that disregard process structure. [sent-11, score-0.61]

7 1 Introduction A process is defined as a series of inter-related events that involve multiple entities and lead to an end result. [sent-12, score-0.466]

8 Product manufacturing, economical developments, and various phenomena in life and social sciences can all be viewed as types of processes. [sent-13, score-0.036]

9 Processes are complicated objects; consider for example the biological process of ATP synthesis described in Figure 1. [sent-14, score-0.472]

10 Additionally, it describes relations between events and entities, and the relationship between events (e. [sent-16, score-0.762]

11 , the second occurrence of the event ‘enter’, causes the event ‘changing ’). [sent-18, score-0.983]

12 Automatically extracting the structure of processes from text is crucial for applications that require reasoning, such as non-factoid QA. [sent-19, score-0.218]

13 For instance, answering a question on ATP synthesis, such as “How do H+ ions contribute to the production of ATP? [sent-20, score-0.058]

14 ” requires a structure that links H+ ions (Figure 1, sentence 1) to ATP (Figure 1, sentence 4) through a sequence of intermediate events. [sent-21, score-0.058]

15 , 2011), which further supports the importance of process extraction. [sent-24, score-0.152]

16 Process extraction is related to two recent lines of work in Information Extraction: event extraction and timeline construction. [sent-25, score-0.669]

17 Traditional event extraction focuses on identifying a closed set of events within a single sentence. [sent-26, score-0.801]

18 In practice, events are currently almost always extracted from a single sentence. [sent-30, score-0.28]

19 Process extraction, on the other hand, is centered around discovering relations between events that span multiple sentences. [sent-31, score-0.482]

20 The set of possible event types in process extraction is also much larger. [sent-32, score-0.673]

21 Timeline construction involves identifying temporal relations between events (Do et al. [sent-33, score-0.627]

22 , 2012; McClosky and Manning, 2012; D’Souza and Ng, 2013), and is thus related to process extraction as both focus on event-event relations spanning multiple sentences. [sent-34, score-0.405]

23 However, events in processes are tightly coupled in ways that go beyond simple temporal ordering, and these dependencies are central for the process extraction task. [sent-35, score-0.962]

24 Hence, capturing process structure requires modeling a larger set of relations that includes temporal, causal and co-reference relations. [sent-36, score-0.399]

25 In this paper, we formally define the task of process extraction and present automatic extraction methods. [sent-37, score-0.338]

26 Our approach handles an open set of event types and works over multiple sentences, extracting a rich set of event-event relations. [sent-38, score-0.428]

27 Figure 1: Partial annotation of the ATP synthesis process. [sent-41, score-0.127]

28 Most of the semantic roles have been removed for simplicity. [sent-42, score-0.057]

29 we characterize a set of global properties of process structure that can be utilized during process extraction. [sent-43, score-0.482]

30 For example, all events in a process are somehow connected to one another. [sent-44, score-0.432]

31 Also, processes usually exhibit a “chain-like” structure reflecting process progression over time. [sent-45, score-0.407]

32 We show that incorporating such global properties into our model and performing joint inference over the extracted relations significantly improves the quality of process structures predicted. [sent-46, score-0.435]

33 We conduct experiments on a novel dataset of process descriptions from the textbook “Biology” (Campbell and Reece, 2005) that were annotated by trained biologists. [sent-47, score-0.232]

34 We define process extraction and characterize processes’ structural properties. [sent-50, score-0.412]

35 We model global structural properties in processes and demonstrate significant improvement in extraction accuracy. [sent-52, score-0.546]

36 We publicly release a novel data set of 148 fully annotated biological process descriptions. [sent-54, score-0.345]

37 2 Process Definition and Dataset: We define a process description as a paragraph or sequence of tokens x = {x1, . . . [sent-61, score-0.193]

38 , x|x| } that describes a series of events related by temporal and/or causal relations. [sent-64, score-0.63]

39 For example, in ATP synthesis (Figure 1), the event of rotor spinning causes the event where an internal rod spins. [sent-65, score-1.262]

40 We model the events within a process and their relations by a directed graph P = (V, E), where the nodes V = {1, . . . [sent-66, score-0.64]

41 , |V | } represent event mentions, and labeled edges E correspond to event-event relations. [sent-69, score-0.108]

42 An event mention v ∈ V is defined by a trigger tv, which is a span of words xi, xi+1, . . . [sent-70, score-0.645]

43 , xj ; and by a set of argument mentions Av, where each argument mention av ∈ Av is also a span of words labeled by a semantic role taken from a set L. [sent-73, score-0.421]

44 For example, in the last event mention of ATP synthesis, tv = produce, and one of the argument mentions is av = (ATP, RESULT). [sent-74, score-0.781]

45 A labeled edge (u, v, r) in the graph describes a relation r ∈ R between the event mentions u and v. [sent-75, score-0.66]

46 The task of process extraction is to extract the graph P from the text x. [sent-76, score-0.2]
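
To make the representation above concrete, here is a minimal sketch in Python of the process graph P = (V, E): each event mention carries a trigger span and role-labeled argument mentions, and labeled edges hold the event-event relations. The class and field names are our own illustration, not the authors' code, and the token offsets in the example are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A span is a (start, end) token offset into the paragraph x.
Span = Tuple[int, int]

@dataclass
class EventMention:
    """An event mention v: a trigger span t_v plus role-labeled argument mentions A_v."""
    trigger: Span
    arguments: List[Tuple[Span, str]] = field(default_factory=list)  # (argument span, semantic role from L)

@dataclass
class ProcessGraph:
    """The process graph P = (V, E): nodes are event mentions, edges carry relations from R."""
    events: List[EventMention]
    edges: Dict[Tuple[int, int], str] = field(default_factory=dict)  # (u, v) -> relation label

    def add_relation(self, u: int, v: int, relation: str) -> None:
        self.edges[(u, v)] = relation

# Illustrative only (token offsets are invented): the last event of ATP synthesis,
# trigger "produce", with (ATP, RESULT) as one of its argument mentions.
produce = EventMention(trigger=(57, 57), arguments=[((59, 59), "RESULT")])
```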

47 1 A natural way to break down process extraction into sub-parts is to first perform semantic role labeling (SRL), that is, identify triggers and predict argument mentions with their semantic role, and then extract event-event relations between pairs of event mentions. [sent-77, score-1.078]

48 In this paper, we focus on the second step, where given a set of event triggers T, we find all event-event relations, where a trigger represents the entire event. [sent-78, score-0.613]

49 For completeness, we now describe the semantic roles L used in our dataset, and then the set of event-event relations R. Argument mentions are also related by coreference relations, but we neglect that since it is not central in this paper. [sent-79, score-0.277]

50 The set L contains the standard semantic roles. [sent-81, score-0.033]

51 Two additional semantic roles were employed that are relevant for biological text: RESULT corresponds to an entity that is the result of an event, and RAW-MATERIAL describes an entity that is used or consumed during an event. [sent-83, score-0.292]

52 For example, the last event ‘produce’ in Figure 1 has ‘ATP’ as the RESULT and ‘ADP’ as the RAW-MATERIAL. [sent-84, score-0.428]

53 The event-event relation set R contains the following (assuming a labeled edge (u, v, r)): [sent-85, score-0.142]

54 PREV denotes that u is an event immediately before v. [sent-86, score-0.483]

55 Thus, the edges (u, v, PREV) and (v, w, PREV) preclude the edge (u, w, PREV). [sent-87, score-0.154]

56 ”, there is no edge (strikes, reaches, PREV) due to the intervening event ‘passed’. [sent-97, score-0.507]

57 COTEMP denotes that events u and v overlap in time (e. [sent-99, score-0.335]

58 , the first two event mentions flowing and enter in Figure 1). [sent-101, score-0.617]

59 For instance, in “During DNA replication, DNA polymerases proofread each nucleotide. [sent-104, score-0.083]

60 ” there is an edge (DNA replication, proofread, SUPER). [sent-107, score-0.079]

61 , the relation between changing and spins in sentence 2 of Figure 1). [sent-111, score-0.187]

62 ENABLES denotes that event u creates preconditions that allow event v to take place. [sent-113, score-0.911]

63 , allowing them to spread into nearby tissues” has the edge (lose, spread, ENABLES). [sent-120, score-0.13]

64 An intuitive way to think about the difference between Causes and Enables is the following: if u causes v this means that if u happens, then v happens. [sent-121, score-0.127]

65 If u enables v, then if u does not happen, then v does not happen. [sent-122, score-0.073]

66 SAME denotes that u and v both refer to the same event (spins and Spinning in Figure 1). [sent-124, score-0.483]
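
The relation inventory defined in the sentences above can be written down as a small label set; the sketch below is purely illustrative Python. NONE is only added later for classification (see sentence 92 below), and the directed relations presumably also carry inverse readings, which would account for the total of 11 relations mentioned there.

```python
from enum import Enum

class Relation(Enum):
    # Directed relations; when labeling ordered pairs these presumably also
    # have inverse readings, which is how an 11-label set would arise.
    PREV = "prev"        # u is immediately before v
    SUPER = "super"      # u temporally contains v (e.g., DNA replication SUPER proofread)
    CAUSES = "causes"    # if u happens, then v happens
    ENABLES = "enables"  # u creates preconditions that allow v
    # Undirected relations.
    COTEMP = "cotemp"    # u and v overlap in time
    SAME = "same"        # u and v refer to the same event
    # Added for classification: the pair is unrelated.
    NONE = "none"
```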

67 Early work on temporal logic (Allen, 1983) contained more temporal relations than are used in our relation set. (Table 1: dataset statistics.) [sent-125, score-0.567]

68 We chose a relation set R that captures the essential [sent-130, score-0.063]

69 aspects of temporal relations between events in a process, while keeping the annotation as simple as possible. [sent-131, score-0.529]

70 For instance, we include the SUPER relation that appears in temporal annotations such as the Timebank corpus (Pustejovsky et al. [sent-132, score-0.25]

71 , 2003) and Allen’s work, but in practice was not considered by many temporal ordering systems (Chambers and Jurafsky, 2008; Yoshikawa et al. [sent-133, score-0.228]

72 Importantly, our relation set also includes the relations CAUSES and ENABLES, which are fundamental to modeling processes and go beyond simple temporal ordering. [sent-136, score-0.628]

73 in a temporal ordering task to modify probabilities provided by pairwise classifiers prior to joint inference. [sent-140, score-0.291]

74 In this paper, we simply treat SAME as another event-event relation, which allows us to easily perform joint inference and employ structural constraints that combine both coreference and temporal relations simultaneously. [sent-141, score-0.524]

75 3) We annotated 148 process descriptions based on the aforementioned definitions. [sent-143, score-0.199]

76 Structural properties of processes: Coherent processes exhibit many structural properties. [sent-145, score-0.652]

77 For example, two argument mentions related to the same event cannot overlap, a constraint that has been used in the past in SRL (Toutanova et al. [sent-146, score-0.586]

78 In this paper we focus on three main structural properties of the graph P. [sent-148, score-0.227]

79 First, in a coherent process, all events mentioned [sent-149, score-0.034]

80 are related to one another, and hence the graph P must be connected. [sent-150, score-0.048]

81 Second, processes tend to have a “chain-like” structure [sent-151, score-0.251]

82 where one event follows another, and thus we expect node degrees in P to be small. (Table 2: node degree distribution.) [sent-152, score-0.428]

83 Indeed, 90% of event mentions have degree ≤ 2, as demonstrated by the Gold column of Table 2. [sent-156, score-0.517]

84 Last, if we consider relations between all possible triples of events in a process, clearly some configurations are impossible, while others are common (illustrated in Figure 2). [sent-157, score-0.493]
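
One impossible configuration follows directly from the PREV definition earlier (PREV means immediately before, so two chained PREV edges preclude a direct PREV edge between the endpoints). A tiny, purely illustrative check in Python:

```python
def violates_prev_chain(edges):
    """edges: a set of (u, v, relation) triples.
    PREV means 'immediately before', so (u, v, PREV) and (v, w, PREV)
    preclude a direct (u, w, PREV) edge."""
    prev = {(u, v) for (u, v, r) in edges if r == "PREV"}
    return any((u, w) in prev
               for (u, v) in prev
               for (v2, w) in prev
               if v == v2 and u != w)

assert violates_prev_chain({(0, 1, "PREV"), (1, 2, "PREV"), (0, 2, "PREV")})
assert not violates_prev_chain({(0, 1, "PREV"), (1, 2, "PREV")})
```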

85 3, we show that modeling these properties using a joint inference framework improves the quality of process extraction significantly. [sent-159, score-0.312]

86 3 Joint Model for Process Extraction: Given a paragraph x and a trigger set T, we wish to extract all event-event relations E. [sent-160, score-0.211]

87 (2012), our model consists of a local pairwise classifier and global constraints. [sent-162, score-0.159]

88 We first introduce a classifier that is based on features from previous work. [sent-163, score-0.04]

89 Next, we describe novel features specific for process extraction. [sent-164, score-0.152]

90 Last, we incorporate global constraints into our model using an ILP formulation. [sent-165, score-0.056]
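
As a rough sketch of how an ILP over pairwise relation variables might be set up, here is an illustration using the PuLP library; the variable names, the simplified label set, and the specific constraints below are our assumptions, not the authors' formulation. It encodes one binary variable per (pair, relation), requires each pair to take exactly one label, forces every event to touch at least one real relation (a weak proxy for connectedness), and bounds node degrees to reflect the chain-like structure.

```python
from itertools import combinations
from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum

RELATIONS = ["PREV", "COTEMP", "SUPER", "CAUSES", "ENABLES", "SAME", "NONE"]  # simplified label set

def decode(scores, n_events, max_degree=2):
    """scores[(u, v)][r]: local classifier score for relation r on the event pair (u, v), u < v."""
    prob = LpProblem("process_extraction", LpMaximize)
    pairs = list(combinations(range(n_events), 2))
    x = {(u, v, r): LpVariable(f"x_{u}_{v}_{r}", cat=LpBinary)
         for (u, v) in pairs for r in RELATIONS}

    # Objective: total score of the chosen relation assignment.
    prob += lpSum(scores[(u, v)][r] * x[(u, v, r)] for (u, v) in pairs for r in RELATIONS)

    # Each pair of event mentions receives exactly one label (possibly NONE).
    for (u, v) in pairs:
        prob += lpSum(x[(u, v, r)] for r in RELATIONS) == 1

    for e in range(n_events):
        touching = [(u, v) for (u, v) in pairs if e in (u, v)]
        if not touching:
            continue
        real = lpSum(x[(u, v, r)] for (u, v) in touching for r in RELATIONS if r != "NONE")
        prob += real >= 1           # weak proxy for connectedness: no isolated event
        prob += real <= max_degree  # chain-like structure: bound each event's degree

    prob.solve()
    return {(u, v): r for (u, v, r) in x if x[(u, v, r)].value() > 0.5}
```

Constraints over triples of events (the configurations illustrated in Figure 2) could be added in the same style, one linear inequality per forbidden configuration.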

91 3.1 Local pairwise classifier: The local pairwise classifier predicts relations between all event mention pairs. [sent-167, score-0.838]

92 After adding NONE to indicate no relation, and including the undirected relations COTEMP and SAME, R contains 11 relations. [sent-169, score-0.16]

93 The classifier is hence a function f : T × T → R. [sent-170, score-0.073]

94 Let n be the number of triggers in a process, and ti be the i-th trigger in its description. [sent-174, score-0.283]
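
A minimal sketch of what such a local pairwise classifier over trigger pairs could look like (scikit-learn logistic regression on toy features; the features, the tiny training set, and the labels below are illustrative placeholders drawn from the Figure 1 examples, not the paper's feature set). The per-relation probabilities it produces are the scores a joint-inference step would consume.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

def pair_features(ti, tj, sent_dist):
    """Toy features for a trigger pair; the paper's actual feature set is much richer."""
    return {
        "trigger_i=" + ti: 1.0,
        "trigger_j=" + tj: 1.0,
        "same_sentence": float(sent_dist == 0),
        "sentence_distance": float(sent_dist),
    }

# Tiny, made-up training set; the labels follow the Figure 1 examples quoted above.
train = [
    ("enter", "changing", 0, "CAUSES"),
    ("spins", "spinning", 0, "SAME"),
    ("flowing", "enter", 0, "COTEMP"),
    ("strikes", "passed", 0, "PREV"),
]

vec = DictVectorizer()
X = vec.fit_transform([pair_features(a, b, d) for a, b, d, _ in train])
y = [label for *_, label in train]

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Per-relation probabilities for a new trigger pair.
probs = clf.predict_proba(vec.transform([pair_features("spinning", "spins", 0)]))
```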


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('event', 0.428), ('atp', 0.333), ('events', 0.28), ('processes', 0.218), ('prev', 0.208), ('biological', 0.193), ('temporal', 0.187), ('relations', 0.16), ('process', 0.152), ('causes', 0.127), ('synthesis', 0.127), ('dna', 0.125), ('structural', 0.112), ('av', 0.106), ('trigger', 0.098), ('extraction', 0.093), ('tj', 0.089), ('triggers', 0.087), ('causal', 0.087), ('mentions', 0.087), ('cotemp', 0.083), ('proofread', 0.083), ('spinning', 0.083), ('spins', 0.083), ('strikes', 0.083), ('allen', 0.083), ('super', 0.083), ('edge', 0.079), ('enables', 0.073), ('argument', 0.071), ('properties', 0.067), ('enter', 0.066), ('replication', 0.066), ('coreference', 0.065), ('pairwise', 0.063), ('relation', 0.063), ('ti', 0.062), ('srl', 0.058), ('ions', 0.058), ('roles', 0.057), ('global', 0.056), ('timeline', 0.055), ('denotes', 0.055), ('characterize', 0.055), ('graphs', 0.055), ('fd', 0.053), ('tv', 0.053), ('spread', 0.051), ('passed', 0.051), ('lose', 0.049), ('graph', 0.048), ('descriptions', 0.047), ('seattle', 0.046), ('mention', 0.044), ('describes', 0.042), ('span', 0.042), ('ordering', 0.041), ('paragraph', 0.041), ('changing', 0.041), ('reaches', 0.041), ('classifier', 0.04), ('edges', 0.039), ('exhibit', 0.037), ('dre', 0.036), ('bionlp', 0.036), ('wal', 0.036), ('timebank', 0.036), ('rotor', 0.036), ('ala', 0.036), ('oftemporal', 0.036), ('ofrfo', 0.036), ('neglect', 0.036), ('preclude', 0.036), ('infor', 0.036), ('campbell', 0.036), ('attachments', 0.036), ('bme', 0.036), ('flowing', 0.036), ('harding', 0.036), ('mcclosky', 0.036), ('ntot', 0.036), ('ole', 0.036), ('pha', 0.036), ('sti', 0.036), ('tnhte', 0.036), ('phenomena', 0.036), ('coherent', 0.034), ('series', 0.034), ('eisn', 0.033), ('rod', 0.033), ('hto', 0.033), ('rds', 0.033), ('tse', 0.033), ('cno', 0.033), ('ots', 0.033), ('sla', 0.033), ('textbook', 0.033), ('wre', 0.033), ('central', 0.032), ('kim', 0.032)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000004 118 emnlp-2013-Learning Biological Processes with Global Constraints

Author: Aju Thalappillil Scaria ; Jonathan Berant ; Mengqiu Wang ; Peter Clark ; Justin Lewis ; Brittany Harding ; Christopher D. Manning

Abstract: Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How? ” and “Why? ” questions. In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. We represent processes by graphs whose edges describe a set oftemporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e.g., the graphs are connected). Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint in- ference over the set of extracted relations. On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement comparing to baselines that disregard process structure.

2 0.31957272 16 emnlp-2013-A Unified Model for Topics, Events and Users on Twitter

Author: Qiming Diao ; Jing Jiang

Abstract: With the rapid growth of social media, Twitter has become one of the most widely adopted platforms for people to post short and instant message. On the one hand, people tweets about their daily lives, and on the other hand, when major events happen, people also follow and tweet about them. Moreover, people’s posting behaviors on events are often closely tied to their personal interests. In this paper, we try to model topics, events and users on Twitter in a unified way. We propose a model which combines an LDA-like topic model and the Recurrent Chinese Restaurant Process to capture topics and events. We further propose a duration-based regularization component to find bursty events. We also propose to use event-topic affinity vectors to model the association between events and topics. Our experiments shows that our model can accurately identify meaningful events and the event-topic affinity vectors are effective for event recommendation and grouping events by topics.

3 0.27631807 192 emnlp-2013-Unsupervised Induction of Contingent Event Pairs from Film Scenes

Author: Zhichao Hu ; Elahe Rahimtoroghi ; Larissa Munishkina ; Reid Swanson ; Marilyn A. Walker

Abstract: Human engagement in narrative is partially driven by reasoning about discourse relations between narrative events, and the expectations about what is likely to happen next that results from such reasoning. Researchers in NLP have tackled modeling such expectations from a range of perspectives, including treating it as the inference of the CONTINGENT discourse relation, or as a type of common-sense causal reasoning. Our approach is to model likelihood between events by drawing on several of these lines of previous work. We implement and evaluate different unsupervised methods for learning event pairs that are likely to be CONTINGENT on one another. We refine event pairs that we learn from a corpus of film scene descriptions utilizing web search counts, and evaluate our results by collecting human judgments of contingency. Our results indicate that the use of web search counts increases the average accuracy of our best method to 85.64% over a baseline of 50%, as compared to an average accuracy of 75.15% without web search.

4 0.26279843 41 emnlp-2013-Building Event Threads out of Multiple News Articles

Author: Xavier Tannier ; Veronique Moriceau

Abstract: We present an approach for building multidocument event threads from a large corpus of newswire articles. An event thread is basically a succession of events belonging to the same story. It helps the reader to contextualize the information contained in a single article, by navigating backward or forward in the thread from this article. A specific effort is also made on the detection of reactions to a particular event. In order to build these event threads, we use a cascade of classifiers and other modules, taking advantage of the redundancy of information in the newswire corpus. We also share interesting comments concerning our manual annotation procedure for building a training and testing set1.

5 0.23241854 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification

Author: Jun-Ping Ng ; Min-Yen Kan ; Ziheng Lin ; Wei Feng ; Bin Chen ; Jian Su ; Chew Lim Tan

Abstract: In this paper we classify the temporal relations between pairs of events on an article-wide basis. This is in contrast to much of the existing literature which focuses on just event pairs which are found within the same or adjacent sentences. To achieve this, we leverage on discourse analysis as we believe that it provides more useful semantic information than typical lexico-syntactic features. We propose the use of several discourse analysis frameworks, including 1) Rhetorical Structure Theory (RST), 2) PDTB-styled discourse relations, and 3) topical text segmentation. We explain how features derived from these frameworks can be effectively used with support vector machines (SVM) paired with convolution kernels. Experiments show that our proposal is effective in improving on the state-of-the-art significantly by as much as 16% in terms of F1, even if we only adopt less-than-perfect automatic discourse analyzers and parsers. Making use of more accurate discourse analysis can further boost gains to 35%.

6 0.22386843 74 emnlp-2013-Event-Based Time Label Propagation for Automatic Dating of News Articles

7 0.19353172 147 emnlp-2013-Optimized Event Storyline Generation based on Mixture-Event-Aspect Model

8 0.18306121 93 emnlp-2013-Harvesting Parallel News Streams to Generate Paraphrases of Event Relations

9 0.14798017 75 emnlp-2013-Event Schema Induction with a Probabilistic Entity-Driven Model

10 0.11268673 68 emnlp-2013-Effectiveness and Efficiency of Open Relation Extraction

11 0.11156759 90 emnlp-2013-Generating Coherent Event Schemas at Scale

12 0.10306189 67 emnlp-2013-Easy Victories and Uphill Battles in Coreference Resolution

13 0.094918989 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution

14 0.088522248 73 emnlp-2013-Error-Driven Analysis of Challenges in Coreference Resolution

15 0.087166719 160 emnlp-2013-Relational Inference for Wikification

16 0.085041307 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction

17 0.081658944 152 emnlp-2013-Predicting the Presence of Discourse Connectives

18 0.072198033 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge

19 0.069598615 62 emnlp-2013-Detection of Product Comparisons - How Far Does an Out-of-the-Box Semantic Role Labeling System Take You?

20 0.062098611 112 emnlp-2013-Joint Coreference Resolution and Named-Entity Linking with Multi-Pass Sieves


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, -0.215), (1, 0.264), (2, 0.02), (3, 0.36), (4, 0.04), (5, -0.245), (6, -0.28), (7, -0.003), (8, -0.128), (9, 0.07), (10, 0.064), (11, 0.019), (12, 0.032), (13, -0.047), (14, 0.009), (15, -0.094), (16, -0.029), (17, 0.031), (18, -0.046), (19, 0.027), (20, 0.024), (21, -0.06), (22, -0.001), (23, 0.008), (24, 0.05), (25, 0.044), (26, 0.005), (27, 0.009), (28, -0.049), (29, 0.042), (30, -0.027), (31, -0.112), (32, -0.009), (33, -0.036), (34, -0.0), (35, 0.083), (36, 0.068), (37, 0.018), (38, -0.008), (39, 0.01), (40, -0.066), (41, 0.031), (42, 0.005), (43, -0.047), (44, 0.021), (45, -0.061), (46, -0.006), (47, -0.02), (48, 0.014), (49, -0.026)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.98649138 118 emnlp-2013-Learning Biological Processes with Global Constraints

Author: Aju Thalappillil Scaria ; Jonathan Berant ; Mengqiu Wang ; Peter Clark ; Justin Lewis ; Brittany Harding ; Christopher D. Manning

Abstract: Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How? ” and “Why? ” questions. In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. We represent processes by graphs whose edges describe a set oftemporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e.g., the graphs are connected). Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint in- ference over the set of extracted relations. On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement comparing to baselines that disregard process structure.

2 0.87904525 192 emnlp-2013-Unsupervised Induction of Contingent Event Pairs from Film Scenes

Author: Zhichao Hu ; Elahe Rahimtoroghi ; Larissa Munishkina ; Reid Swanson ; Marilyn A. Walker

Abstract: Human engagement in narrative is partially driven by reasoning about discourse relations between narrative events, and the expectations about what is likely to happen next that results from such reasoning. Researchers in NLP have tackled modeling such expectations from a range of perspectives, including treating it as the inference of the CONTINGENT discourse relation, or as a type of common-sense causal reasoning. Our approach is to model likelihood between events by drawing on several of these lines of previous work. We implement and evaluate different unsupervised methods for learning event pairs that are likely to be CONTINGENT on one another. We refine event pairs that we learn from a corpus of film scene descriptions utilizing web search counts, and evaluate our results by collecting human judgments ofcontingency. Our results indicate that the use of web search counts increases the av- , erage accuracy of our best method to 85.64% over a baseline of 50%, as compared to an average accuracy of 75. 15% without web search.

3 0.80957502 41 emnlp-2013-Building Event Threads out of Multiple News Articles

Author: Xavier Tannier ; Veronique Moriceau

Abstract: We present an approach for building multidocument event threads from a large corpus of newswire articles. An event thread is basically a succession of events belonging to the same story. It helps the reader to contextualize the information contained in a single article, by navigating backward or forward in the thread from this article. A specific effort is also made on the detection of reactions to a particular event. In order to build these event threads, we use a cascade of classifiers and other modules, taking advantage of the redundancy of information in the newswire corpus. We also share interesting comments concerning our manual annotation procedure for building a training and testing set1.

4 0.78835177 74 emnlp-2013-Event-Based Time Label Propagation for Automatic Dating of News Articles

Author: Tao Ge ; Baobao Chang ; Sujian Li ; Zhifang Sui

Abstract: Since many applications such as timeline summaries and temporal IR involving temporal analysis rely on document timestamps, the task of automatic dating of documents has been increasingly important. Instead of using feature-based methods as conventional models, our method attempts to date documents in a year level by exploiting relative temporal relations between documents and events, which are very effective for dating documents. Based on this intuition, we proposed an event-based time label propagation model called confidence boosting in which time label information can be propagated between documents and events on a bipartite graph. The experiments show that our event-based propagation model can predict document timestamps in high accuracy and the model combined with a MaxEnt classifier outperforms the state-of-the-art method for this task especially when the size of the training set is small.

5 0.72462982 16 emnlp-2013-A Unified Model for Topics, Events and Users on Twitter

Author: Qiming Diao ; Jing Jiang

Abstract: With the rapid growth of social media, Twitter has become one of the most widely adopted platforms for people to post short and instant message. On the one hand, people tweets about their daily lives, and on the other hand, when major events happen, people also follow and tweet about them. Moreover, people’s posting behaviors on events are often closely tied to their personal interests. In this paper, we try to model topics, events and users on Twitter in a unified way. We propose a model which combines an LDA-like topic model and the Recurrent Chinese Restaurant Process to capture topics and events. We further propose a duration-based regularization component to find bursty events. We also propose to use event-topic affinity vectors to model the asso- . ciation between events and topics. Our experiments shows that our model can accurately identify meaningful events and the event-topic affinity vectors are effective for event recommendation and grouping events by topics.

6 0.64899862 76 emnlp-2013-Exploiting Discourse Analysis for Article-Wide Temporal Classification

7 0.61908656 93 emnlp-2013-Harvesting Parallel News Streams to Generate Paraphrases of Event Relations

8 0.58592874 147 emnlp-2013-Optimized Event Storyline Generation based on Mixture-Event-Aspect Model

9 0.45037112 75 emnlp-2013-Event Schema Induction with a Probabilistic Entity-Driven Model

10 0.38153645 68 emnlp-2013-Effectiveness and Efficiency of Open Relation Extraction

11 0.34838465 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction

12 0.34636229 90 emnlp-2013-Generating Coherent Event Schemas at Scale

13 0.33844525 152 emnlp-2013-Predicting the Presence of Discourse Connectives

14 0.31437829 183 emnlp-2013-The VerbCorner Project: Toward an Empirically-Based Semantic Decomposition of Verbs

15 0.29386008 160 emnlp-2013-Relational Inference for Wikification

16 0.25769117 182 emnlp-2013-The Topology of Semantic Knowledge

17 0.24416265 18 emnlp-2013-A temporal model of text periodicities using Gaussian Processes

18 0.23818268 131 emnlp-2013-Mining New Business Opportunities: Identifying Trend related Products by Leveraging Commercial Intents from Microblogs

19 0.23110905 112 emnlp-2013-Joint Coreference Resolution and Named-Entity Linking with Multi-Pass Sieves

20 0.22667243 1 emnlp-2013-A Constrained Latent Variable Model for Coreference Resolution


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(3, 0.028), (9, 0.025), (18, 0.028), (22, 0.124), (30, 0.059), (47, 0.344), (50, 0.017), (51, 0.132), (66, 0.018), (71, 0.013), (75, 0.068), (77, 0.014), (90, 0.018), (96, 0.029)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.76999199 118 emnlp-2013-Learning Biological Processes with Global Constraints

Author: Aju Thalappillil Scaria ; Jonathan Berant ; Mengqiu Wang ; Peter Clark ; Justin Lewis ; Brittany Harding ; Christopher D. Manning

Abstract: Biological processes are complex phenomena involving a series of events that are related to one another through various relationships. Systems that can understand and reason over biological processes would dramatically improve the performance of semantic applications involving inference such as question answering (QA) – specifically “How? ” and “Why? ” questions. In this paper, we present the task of process extraction, in which events within a process and the relations between the events are automatically extracted from text. We represent processes by graphs whose edges describe a set oftemporal, causal and co-reference event-event relations, and characterize the structural properties of these graphs (e.g., the graphs are connected). Then, we present a method for extracting relations between the events, which exploits these structural properties by performing joint in- ference over the set of extracted relations. On a novel dataset containing 148 descriptions of biological processes (released with this paper), we show significant improvement comparing to baselines that disregard process structure.

2 0.71112722 9 emnlp-2013-A Log-Linear Model for Unsupervised Text Normalization

Author: Yi Yang ; Jacob Eisenstein

Abstract: We present a unified unsupervised statistical model for text normalization. The relationship between standard and non-standard tokens is characterized by a log-linear model, permitting arbitrary features. The weights of these features are trained in a maximumlikelihood framework, employing a novel sequential Monte Carlo training algorithm to overcome the large label space, which would be impractical for traditional dynamic programming solutions. This model is implemented in a normalization system called UNLOL, which achieves the best known results on two normalization datasets, outperforming more complex systems. We use the output of UNLOL to automatically normalize a large corpus of social media text, revealing a set of coherent orthographic styles that underlie online language variation.

3 0.66089565 204 emnlp-2013-Word Level Language Identification in Online Multilingual Communication

Author: Dong Nguyen ; A. Seza Dogruoz

Abstract: Multilingual speakers switch between languages in online and spoken communication. Analyses of large scale multilingual data require automatic language identification at the word level. For our experiments with multilingual online discussions, we first tag the language of individual words using language models and dictionaries. Secondly, we incorporate context to improve the performance. We achieve an accuracy of 98%. Besides word level accuracy, we use two new metrics to evaluate this task.

4 0.49414924 41 emnlp-2013-Building Event Threads out of Multiple News Articles

Author: Xavier Tannier ; Veronique Moriceau

Abstract: We present an approach for building multidocument event threads from a large corpus of newswire articles. An event thread is basically a succession of events belonging to the same story. It helps the reader to contextualize the information contained in a single article, by navigating backward or forward in the thread from this article. A specific effort is also made on the detection of reactions to a particular event. In order to build these event threads, we use a cascade of classifiers and other modules, taking advantage of the redundancy of information in the newswire corpus. We also share interesting comments concerning our manual annotation procedure for building a training and testing set1.

5 0.48707548 77 emnlp-2013-Exploiting Domain Knowledge in Aspect Extraction

Author: Zhiyuan Chen ; Arjun Mukherjee ; Bing Liu ; Meichun Hsu ; Malu Castellanos ; Riddhiman Ghosh

Abstract: Aspect extraction is one of the key tasks in sentiment analysis. In recent years, statistical models have been used for the task. However, such models without any domain knowledge often produce aspects that are not interpretable in applications. To tackle the issue, some knowledge-based topic models have been proposed, which allow the user to input some prior domain knowledge to generate coherent aspects. However, existing knowledge-based topic models have several major shortcomings, e.g., little work has been done to incorporate the cannot-link type of knowledge or to automatically adjust the number of topics based on domain knowledge. This paper proposes a more advanced topic model, called MC-LDA (LDA with m-set and c-set), to address these problems, which is based on an Extended generalized Pólya urn (E-GPU) model (which is also proposed in this paper). Experiments on real-life product reviews from a variety of domains show that MCLDA outperforms the existing state-of-the-art models markedly.

6 0.48395759 136 emnlp-2013-Multi-Domain Adaptation for SMT Using Multi-Task Learning

7 0.47953326 74 emnlp-2013-Event-Based Time Label Propagation for Automatic Dating of News Articles

8 0.47645751 179 emnlp-2013-Summarizing Complex Events: a Cross-Modal Solution of Storylines Extraction and Reconstruction

9 0.47295538 48 emnlp-2013-Collective Personal Profile Summarization with Social Networks

10 0.47268361 25 emnlp-2013-Appropriately Incorporating Statistical Significance in PMI

11 0.46224368 46 emnlp-2013-Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes

12 0.46193874 65 emnlp-2013-Document Summarization via Guided Sentence Compression

13 0.46140915 29 emnlp-2013-Automatic Domain Partitioning for Multi-Domain Learning

14 0.45981282 152 emnlp-2013-Predicting the Presence of Discourse Connectives

15 0.45935774 124 emnlp-2013-Leveraging Lexical Cohesion and Disruption for Topic Segmentation

16 0.45699754 194 emnlp-2013-Unsupervised Relation Extraction with General Domain Knowledge

17 0.45684251 56 emnlp-2013-Deep Learning for Chinese Word Segmentation and POS Tagging

18 0.45678502 21 emnlp-2013-An Empirical Study Of Semi-Supervised Chinese Word Segmentation Using Co-Training

19 0.45617643 80 emnlp-2013-Exploiting Zero Pronouns to Improve Chinese Coreference Resolution

20 0.45589542 117 emnlp-2013-Latent Anaphora Resolution for Cross-Lingual Pronoun Prediction