acl acl2012 acl2012-85 knowledge-graph by maker-knowledge-mining
Title: Event Linking: Grounding Event Reference in a News Archive
Source: pdf
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
Reference: text
1 Introduction

Interpreting news requires identifying its constituent events.
Information extraction (IE) makes this feasible by considering only events of a specified type, such as personnel succession or arrest (Grishman and Sundheim, 1996; LDC, 2005), an approach not extensible to novel events, or the same event types in sub-domains (e.g. …).
On the other hand, topic detection and tracking (TDT; Allan, 2002) disregards individual event mentions, clustering together articles that share a topic.
Between these fine- and coarse-grained approaches, event identification requires grouping references to the same event.
However, strict coreference is hampered by the complexity of event semantics: poison, murder and die may indicate the same effective event.
The solution is to tag mentions with a canonical identifier for each news-triggering event.
This paper introduces event linking: given a past event reference in context, find the article in a news archive that first reports that the event happened.
The task has an immediate practical application: some online newspapers link past event mentions to relevant news stories, but currently do so with low coverage and consistency; an event linker can add referentially-precise hyperlinks to news.
The event linking task parallels entity linking (NEL; Ji and Grishman, 2011), considering a news archive as a knowledge base (KB) of events, where each article exclusively represents the zero or more events that it first reports.
Coupled with an appropriate event extractor, event linking may be performed for all events mentioned in a document, like the named entity disambiguation task (Bunescu and Paşca, 2006; Cucerzan, 2007).
We have annotated and analysed 150 news and opinion articles, marking references to past, newsworthy events, and linking where possible to canonical articles in a 13-year news archive.
2 The events in a news story

Approaches to news event processing are subsumed within broader notions of topics, scenario templates, or temporal entities, among others.
We illustrate key challenges in processing news events and motivate event linking through the example story in Figure 1.
Salience. Our story highlights carjackings and a police warning as newsworthy, alongside events like feeding, drove and told, which carry less individual weight.
Orthogonally, parts of the story are new events, while others are previously reported events that the reader may be aware of (illustrated in Figure 1).
Online, the two background carjackings and the police warning are hyperlinked to other SMH articles where they were reported.
Association for Computational Linguistics, pages 228–232.

Figure 1: Possible event mentions marked in an article from SMH, segmented into news (N) and background (B) event portions.
IE (Grishman and Sundheim, 1996) selects an event type of which all instances are salient; TDT (Allan, 2002) operates at the document level, which avoids differentiating event mentions; and TimeML (Pustejovsky et al. …).
Critiquing ACE05 event detection for not addressing salience, Ji et al. (2009) harness cross-document frequencies for event ranking.
Similarly, reference to a previously-reported event implies it is newsworthy.
Diversity. IE traditionally targets a selected event type (Grishman and Sundheim, 1996).
ACE05 considers a broader event typology, dividing eight thematic event types (business, justice, etc.) into 33 subtypes such as attack, die and declare bankruptcy (LDC, 2005).
Most subtypes suffer from few annotated instances, while others are impractically broad: sexual abuse, gunfire and the Holocaust each constitute attack instances (is told considered an attack in Figure 1?).1 While ACE05 would mark the various attack events in our story, police warned would be unrecognised.
… (…, 2010; Chambers and Jurafsky, 2011), event types are brittle to particular tasks and domains, such as bio-text mining (e.g. …).
Identity. Event coreference is complicated by partitive (sub-event) and logical (e.g. …) relationships.
When considering the relationship between another carjacking and grabbed, drove or stabbed, ACE05 would apply the policy: “When in doubt, do not mark any coreference” (LDC, 2005).
Bejan and Harabagiu (2008) consider event coreference across documents, marking the “most important events” (Bejan, 2010), albeit within Google News clusters, where multiple articles reporting the same event are likely to use similar language.
Similar challenges apply to identifying event causality and other relations: Bejan and Harabagiu (2008) suggest arcs such as feeding -precedes-> walking -enables-> grabbed, akin to instantiations of FrameNet’s frame relations (Fillmore et al. …).
Explicit reference. By considering events through topical document clusters, TDT avoids some challenges of precise identity.
It prescribes rules of interpretation for which stories pertain to a seminal event.
However, the carjackings in our story are neither preconditions nor consequences of a seminal event and so would not constitute a TDT cluster.
TDT fails to account for these explicit event references.
While … (2009) consider event dependency as directed arcs between documents or paragraphs, they generally retain a broad sense of topic with little attention to explicit reference.
3 The event linking task

Given an explicit reference to a past event, event linking grounds it in a given news archive.
This applies to all events worthy of having been reported, and harnesses explicit reference rather than more general notions of relevance.
Though analogous to NEL, our task differs in the types of expressions that may be linked, and the manner of determining the correct KB node to link to, if any.
3.1 Event-referring expressions

We consider a subset of newsworthy events (things that happen and directly trigger news) as candidate referents.
In TimeML’s event classification (Pustejovsky et al. …).
All references must be explicit, reporting the event as factual and completed or ongoing.
Not all event references meeting these criteria are reasonably LINKABLE to a single article:
MULTIPLE: many distinct events, or an event type, e.g. world wars, demand;
AGGREGATE: emerges from other events over time, e.g. grew 15%, scored 100 goals;
COMPLEX: an event reported over multiple articles in terms of its sub-events, e.g. …
3.2 A news archive as a KB

We define a canonical link target for each event: the earliest article in the archive that reports the given event happened or is happening.
Each archival article implicitly represents zero or more related events, just as Wikipedia entries represent zero or one entity in NEL.
Links target the story as a whole: closely related, co-reported events link to the same article, avoiding a problematically strict approach to event identity.
An archive reports only selected events, so a valid target may not exist (NEL’s NIL).
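The canonical-target convention above (the earliest reporting article, else NIL) can be sketched as a minimal retrieval-style linker. This is an illustrative sketch, not the authors' system: the `Article` fields, the `overlap` scorer and the `threshold` value are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional

@dataclass
class Article:
    doc_id: str
    published: date
    text: str

def link_event(mention_context: str, archive: List[Article],
               score: Callable[[str, Article], float],
               threshold: float = 0.5) -> Optional[str]:
    """Ground an event mention in a news archive: among articles
    scoring above a relevance threshold, pick the EARLIEST one as
    the canonical target; return None for NIL (never reported)."""
    candidates = [a for a in archive if score(mention_context, a) >= threshold]
    if not candidates:
        return None  # NIL: no article in the archive reports the event
    return min(candidates, key=lambda a: a.published).doc_id

# Toy relevance function: word overlap (a real linker would use
# TF-IDF cosine or richer semantic and temporal features).
def overlap(query: str, article: Article) -> float:
    q = set(query.lower().split())
    t = set(article.text.lower().split())
    return len(q & t) / max(len(q), 1)

archive = [
    Article("smh-2009-03-02", date(2009, 3, 2), "police warn of carjacking spree"),
    Article("smh-2009-03-01", date(2009, 3, 1), "carjacking reported in suburb"),
]
print(link_event("another carjacking", archive, overlap, threshold=0.3))
```

Both toy articles clear the threshold, so the linker returns the earlier one, mirroring the earliest-report convention; a mention with no sufficiently similar article maps to NIL.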
4 An annotated corpus

We link to a digital archive of the Sydney Morning Herald: Australian and international news from 1986 to 2009, published daily, Monday to Saturday.2
We annotate a randomly sampled corpus of 150 articles from its 2009 News and Features and Business sections, including news reports, op-eds and letters.
For this whole-document annotation, a single word of each past/ongoing, newsworthy event mention is marked.3
If LINKABLE, the annotator searches the archive by keyword and date, selecting a target, reported here (a self-referential link) or NIL.
An annotation of our example story (Figure 1) would produce five groups of event references (Table 1).
2 The archive may be searched at http://newsstore.
3 We couple marking and linking since annotators must learn to judge newsworthiness relative to the target archive.
Pairwise inter-annotator agreement in Table 2 shows that annotators infrequently select the same words to link, but that reasonable agreement on the link target can be achieved for agreed tokens.4
Adjudicator-annotator agreements are generally much higher than inter-annotator agreements: in many cases, an annotator fails to find a target or selects one that does not first report the event; J accepts most annotations as valid.
In other cases, there may be multiple articles published on the same day that describe the event in question from different angles; agreement increases substantially when relaxed to accept date agreement.
Our adjudicated corpus of 150 documents is summarised in Table 3.
Where a definitive link target is not available, an annotator may erroneously select another candidate: an opinion article describing the event, an article where the event is mentioned as background, or an article anticipating the event.
4 κ ≈ F1 for the binary token task (F1 accounts for the majority class) and for the sparse link targets/date selection.
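The footnote's κ ≈ F1 approximation can be checked numerically: when the positive class is sparse, chance agreement is dominated by the negative class, and Cohen's κ approaches the positive-class F1. The counts below are invented for illustration, not the paper's data.

```python
def kappa_and_f1(tp: int, fp: int, fn: int, tn: int):
    """Cohen's kappa between two binary annotators, and the F1 of one
    annotator scored against the other, from 2x2 contingency counts."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n                    # observed agreement
    a_pos = (tp + fn) / n                 # annotator A's positive rate
    b_pos = (tp + fp) / n                 # annotator B's positive rate
    pe = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)  # chance agreement
    kappa = (po - pe) / (1 - pe)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return kappa, f1

# 1000 tokens, each annotator marks only 50 as event words (sparse).
k, f = kappa_and_f1(tp=40, fp=10, fn=10, tn=940)
print(round(k, 3), round(f, 3))  # kappa and F1 nearly coincide
```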
Table 3: Annotation frequencies: no. of mentions, distinct per document, and document frequency, by category (Any markable: 2136 mentions, 655 distinct, 149 documents; LINKABLE, with subcategories linked, reported here and nil; COMPLEX; MULTIPLE; AGGREGATE).

Can overpayed link to what had been acquired?
Can 10 died be linked to an article where only nine are confirmed dead?
For the application of adding hyperlinks to news, such a link might be beneficial, but it may be better considered an AGGREGATE.
The schema underspecifies definitions of ‘event’ and ‘newsworthiness’, accounting for much of the token-level disagreement, but not directly affecting the task of linking a specified mention to the archive.
Adjectival mentions such as Apple’s new CEO are easy to miss and questionably explicit.
Events are also confused with facts and abstract entities, such as bans, plans, reports and laws.
Unlike many other facts, events can be grounded to a particular time of occurrence, often stated in text.
5 Analysis and discussion

To assess task feasibility, we present bag-of-words (BoW) and oracle results (Figure 2).
Using the whole document as a query5 retrieves 30% of gold targets at rank 10, but only 60% by rank 150.
Term windows around each event mention perform close to our oracle consisting of successful search keywords collected during annotation, with over 80% recall at 150.
No system recalls over 30% of targets at 1-best, suggesting a reranking approach may be required.
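The BoW experiments can be approximated with a plain TF-IDF cosine ranker plus a recall-at-rank metric. The paper used Apache Solr defaults; the hand-rolled weighting below is only a simplified stand-in, and the toy archive is invented.

```python
import math
from collections import Counter
from typing import List

def rank_archive(query_tokens: List[str],
                 archive_docs: List[List[str]]) -> List[int]:
    """Rank archive documents by TF-IDF cosine similarity to a query
    (e.g. a term window around an event mention)."""
    n = len(archive_docs)
    df = Counter(t for d in archive_docs for t in set(d))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # +1 keeps shared terms

    def vec(tokens):
        return {t: c * idf.get(t, 0.0) for t, c in Counter(tokens).items()}

    def cosine(u, v):
        dot = sum(u[t] * v.get(t, 0.0) for t in u)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    q = vec(query_tokens)
    return sorted(range(n), key=lambda i: cosine(q, vec(archive_docs[i])),
                  reverse=True)

def recall_at_k(ranked_lists: List[List[int]], gold: List[int], k: int) -> float:
    """Fraction of mentions whose gold target appears in the top k."""
    return sum(g in r[:k] for r, g in zip(ranked_lists, gold)) / len(gold)

archive = [
    "police warn drivers after carjacking spree".split(),
    "cup final won by home side".split(),
    "market grew strongly this quarter".split(),
]
ranking = rank_archive("another carjacking reported".split(), archive)
print(ranking[0], recall_at_k([ranking], [0], k=1))
```

With the shared term carjacking, the first archive document ranks on top, so recall at rank 1 is 1.0 for this single toy query.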
These constraints may draw on temporal expressions in the source article or external knowledge.
Successful automated linking will therefore require extensive use of semantic and temporal information.
Our corpus also highlights distinctions between explicit event reference and broader relationships.

5 Using Apache Solr defaults: TFIDF-weighted cosine similarity over stemmed and stopped tokens.

Figure 2: Recall for BoW and oracle systems
… (2009) makes the reasonable assumption that news events generally build on others that recently precede them.
We find that the likelihood a linked article occurred fewer than d days ago reduces exponentially with respect to d, yet the rate of decay is surprisingly slow: half of all link targets precede their source by over 3 months.
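The reported shape (exponential decay in the likelihood that a target precedes its source by fewer than d days, with a median gap of over three months) corresponds to an exponential age model. The 90-day half-life below is an assumed parameter consistent with the "over 3 months" figure, not a fit reported in the paper.

```python
def frac_within(d_days: float, half_life_days: float = 90.0) -> float:
    """P(link target precedes its source by <= d days) under an
    exponential age model with the given half-life; 90 days is an
    assumption, meaning half of all targets are over ~3 months old."""
    return 1.0 - 0.5 ** (d_days / half_life_days)

# Cumulative fraction of link targets within a week, a month,
# the half-life itself, and a year.
for d in (7, 30, 90, 365):
    print(d, round(frac_within(d), 3))
```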
The effect of co-reporting rather than coreference is also clear: like {carjacking, grabbed} in our example, mention chains include {return, decide, recontest} and {winner, Cup}, as well as more familiar instances like {acquired, acquisition}.
6 Conclusion

We have introduced event linking, which takes a novel approach to news event reference, associating each newsworthy past event with a canonical article in a news archive.
We demonstrate the task’s feasibility, with reasonable inter-annotator agreement over a 150-document corpus.
The corpus highlights features of the retrieval task and its dependence on temporal knowledge.
As well as using event linking to add referentially precise hyperlinks to a news archive, further characteristics of news will emerge by analysing the graph of event references.
References

Bejan and Harabagiu (2008). A linguistic resource for discovering event structures and resolving event coreference.
Ji et al. (2009). Cross-document event extraction and tracking: Task, evaluation, techniques and challenges.
LDC (2005). ACE (Automatic Content Extraction) English annotation guidelines for events.
Pustejovsky et al. TimeML: Robust specification of event and temporal expressions in text.
wordName wordTfidf (topN-words)
[('event', 0.643), ('events', 0.242), ('archive', 0.193), ('linking', 0.191), ('news', 0.179), ('article', 0.162), ('newsworthy', 0.158), ('link', 0.129), ('story', 0.1), ('tdt', 0.098), ('bejan', 0.092), ('grishman', 0.09), ('carjackings', 0.079), ('grabbed', 0.079), ('linkable', 0.079), ('smh', 0.079), ('mentions', 0.077), ('coreference', 0.077), ('allan', 0.074), ('attack', 0.07), ('reference', 0.063), ('kb', 0.063), ('die', 0.063), ('nel', 0.063), ('police', 0.063), ('explicit', 0.06), ('ji', 0.06), ('sundheim', 0.059), ('salience', 0.059), ('heng', 0.059), ('hyperlinks', 0.059), ('timeml', 0.055), ('articles', 0.054), ('adjudicated', 0.053), ('carjacking', 0.053), ('filatova', 0.053), ('honnibal', 0.053), ('newsworthiness', 0.053), ('yangarber', 0.053), ('ralph', 0.052), ('targets', 0.052), ('annotator', 0.05), ('canonical', 0.049), ('linked', 0.048), ('tracking', 0.047), ('cosmin', 0.046), ('drove', 0.046), ('feeding', 0.046), ('temporal', 0.045), ('marking', 0.045), ('reports', 0.045), ('broader', 0.044), ('highlights', 0.044), ('past', 0.043), ('mention', 0.042), ('nil', 0.042), ('seminal', 0.042), ('adrian', 0.042), ('warning', 0.042), ('ldc', 0.041), ('agreement', 0.039), ('cup', 0.039), ('bionlp', 0.039), ('bow', 0.039), ('agreements', 0.039), ('precede', 0.039), ('harabagiu', 0.037), ('told', 0.037), ('subtypes', 0.037), ('fillmore', 0.037), ('jn', 0.037), ('notions', 0.037), ('annotators', 0.036), ('document', 0.036), ('background', 0.035), ('successful', 0.035), ('oracle', 0.034), ('reporting', 0.034), ('interpreting', 0.034), ('target', 0.033), ('zero', 0.032), ('stories', 0.032), ('entity', 0.032), ('expressions', 0.032), ('reasonable', 0.032), ('sydney', 0.032), ('arcs', 0.031), ('feasibility', 0.031), ('pustejovsky', 0.031), ('bunescu', 0.03), ('chambers', 0.03), ('challenges', 0.03), ('ie', 0.03), ('james', 0.029), ('au', 0.029), ('business', 0.028), ('avoids', 0.028), ('annotation', 0.028), ('aggregate', 0.028), ('reasonably', 0.028)]
simIndex simValue paperId paperTitle
same-paper 1 0.99999934 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
2 0.29233903 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
Author: Wei Lu ; Dan Roth
Abstract: This paper presents a novel sequence labeling model based on the latent-variable semiMarkov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), that allows arbitrary preference to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive to some baseline supervised approaches.
3 0.26063067 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
Author: Oleksandr Kolomiyets ; Steven Bethard ; Marie-Francine Moens
Abstract: We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations like BEFORE, AFTER, OVERLAP and IDENTITY. We annotate a corpus of children’s stories with temporal dependency trees, achieving agreement (Krippendorff’s Alpha) of 0.856 on the event words, 0.822 on the links between events, and of 0.700 on the ordering relation labels. We compare two parsing models for temporal dependency structures, and show that a deterministic non-projective dependency parser outperforms a graph-based maximum spanning tree parser, achieving labeled attachment accuracy of 0.647 and labeled tree edit distance of 0.596. Our analysis of the dependency parser errors gives some insights into future research directions.
4 0.25880858 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation model (i.e., vector space model) cannot model temporal aspects of documents. To address it, we proposed a novel burst-based text representation model, denoted as BurstVSM. BurstVSM corresponds dimensions to bursty features instead of terms, which can capture semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments in a 10-year news archive show that our methods are both effective and efficient.
5 0.23753364 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
Author: Christian Chiarcos
Abstract: This paper describes a novel approach towards the empirical approximation of discourse relations between different utterances in texts. Following the idea that every pair of events comes with preferences regarding the range and frequency of discourse relations connecting both parts, the paper investigates whether these preferences are manifested in the distribution of relation words (that serve to signal these relations). Experiments on two large-scale English web corpora show that significant correlations between pairs of adjacent events and relation words exist, that they are reproducible on different data sets, and for three relation words, that their distribution corresponds to theorybased assumptions. 1 Motivation Texts are not merely accumulations of isolated utterances, but the arrangement of utterances conveys meaning; human text understanding can thus be described as a process to recover the global structure of texts and the relations linking its different parts (Vallduv ı´ 1992; Gernsbacher et al. 2004). To capture these aspects of meaning in NLP, it is necessary to develop operationalizable theories, and, within a supervised approach, large amounts of annotated training data. To facilitate manual annotation, weakly supervised or unsupervised techniques can be applied as preprocessing step for semimanual annotation, and this is part of the motivation of the approach described here. 213 Discourse relations involve different aspects of meaning. This may include factual knowledge about the connected discourse segments (a ‘subjectmatter’ relation, e.g., if one utterance represents the cause for another, Mann and Thompson 1988, p.257), argumentative purposes (a ‘presentational’ relation, e.g., one utterance motivates the reader to accept a claim formulated in another utterance, ibid., p.257), or relations between entities mentioned in the connected discourse segments (anaphoric relations, Webber et al. 2003). 
Discourse relations can be indicated explicitly by optional cues, e.g., adverbials (e.g., however), conjunctions (e.g., but), or complex phrases (e.g., in contrast to what Peter said a minute ago). Here, these cues are referred to as relation words. Assuming that relation words are associated with specific discourse relations (Knott and Dale 1994; Prasad et al. 2008), the distribution of relation words found between two (types of) events can yield insights into the range of discourse relations possible at this occasion and their respective likeliness. For this purpose, this paper proposes a background knowledge base (BKB) that hosts pairs of events (here heuristically represented by verbs) along with distributional profiles for relation words. The primary data structure of the BKB is a triple where one event (type) is connected with a particular relation word to another event (type). Triples are further augmented with a frequency score (expressing the likelihood of the triple to be observed), a significance score (see below), and a correlation score (indicating whether a pair of events has a positive or negative correlation with a particular relation word). ProceedJienjgus, R ofep thueb 5lic0t hof A Knonrueaa,l M 8-e1e4ti Jnugly o f2 t0h1e2 A.s ?c so2c0ia1t2io Ans fsoorc Ciatoiomnp fuotart Cioonmaplu Ltiantgiounisatlic Lsi,n pgaugiestsi2c 1s3–217, Triples can be easily acquired from automatically parsed corpora. While the relation word is usually part of the utterance that represents the source of the relation, determining the appropriate target (antecedent) of the relation may be difficult to achieve. 
As a heuristic, an adjacency preference is adopted, i.e., the target is identified with the main event of the preceding utterance.1 The BKB can be constructed from a sufficiently large corpus as follows: • • identify event types and relation words for every utterance create a candidate triple consisting of the event type of the utterance, the relation word, and the event type of the preceding utterance. add the candidate triple to the BKB, if it found in the BKB, increase its score by (or initialize it with) 1, – – • perform a pruning on all candidate triples, calcpuerlaftoer significance aonnd a lclo crarneldaitdioante scores Pruning uses statistical significance tests to evaluate whether the relative frequency of a relation word for a pair of events is significantly higher or lower than the relative frequency of the relation word in the entire corpus. Assuming that incorrect candidate triples (i.e., where the factual target of the relation was non-adjacent) are equally distributed, they should be filtered out by the significance tests. The goal of this paper is to evaluate the validity of this approach. 2 Experimental Setup By generalizing over multiple occurrences of the same events (or, more precisely, event types), one can identify preferences of event pairs for one or several relation words. These preferences capture context-invariant characteristics of pairs of events and are thus to considered to reflect a semantic predisposition for a particular discourse relation. Formally, an event is the semantic representation of the meaning conveyed in the utterance. We 1Relations between non-adjacent utterances are constrained by the structure of discourse (Webber 1991), and thus less likely than relations between adjacent utterances. 214 assume that the same event can reoccur in different contexts, we are thus studying relations between types of events. 
For the experiment described here, events are heuristically identified with the main predicates of a sentence, i.e., non-auxiliar, noncausative, non-modal verbal lexemes that serve as heads of main clauses. The primary data structure of the approach described here is a triple consisting of a source event, a relation word and a target (antecedent) event. These triples are harvested from large syntactically annotated corpora. For intersentential relations, the target is identified with the event of the immediately preceding main clause. These extraction preferences are heuristic approximations, and thus, an additional pruning step is necessary. For this purpose, statistical significance tests are adopted (χ2 for triples of frequent events and relation words, t-test for rare events and/or relation words) that compare the relative frequency of a rela- tion word given a pair of events with the relative frequency of the relation word in the entire corpus. All results with p ≥ .05 are excluded, i.e., only triples are preserved pfo ≥r w .0h5ic ahr teh eex xocblsuedrevde,d i positive or negative correlation between a pair of events and a relation word is not due to chance with at least 95% probability. Assuming an even distribution of incorrect target events, this should rule these out. Additionally, it also serves as a means of evaluation. Using statistical significance tests as pruning criterion entails that all triples eventually confirmed are statistically significant.2 This setup requires immense amounts of data: We are dealing with several thousand events (theoretically, the total number of verbs of a language). The chance probability for two events to occur in adjacent position is thus far below 10−6, and it decreases further if the likelihood of a relation word is taken into consideration. All things being equal, we thus need millions of sentences to create the BKB. Here, two large-scale corpora of English are employed, PukWaC and Wackypedia EN (Baroni et al. 2009). 
PukWaC is a 2G-token web corpus of British English crawled from the uk domain (Ferraresi et al. 2Subsequent studies may employ less rigid pruning criteria. For the purpose of the current paper, however, the statistical significance of all extracted triples serves as an criterion to evaluate methodological validity. 2008), and parsed with MaltParser (Nivre et al. 2006). It is distributed in 5 parts; Only PukWaC1 to PukWaC-4 were considered here, constituting 82.2% (72.5M sentences) of the entire corpus, PukWaC-5 is left untouched for forthcoming evaluation experiments. Wackypedia EN is a 0.8G-token dump of the English Wikipedia, annotated with the same tools. It is distributed in 4 different files; the last portion was left untouched for forthcoming evaluation experiments. The portion analyzed here comprises 33.2M sentences, 75.9% of the corpus. The extraction of events in these corpora uses simple patterns that combine dependency information and part-of-speech tags to retrieve the main verbs and store their lemmata as event types. The target (antecedent) event was identified with the last main event of the preceding sentence. As relation words, only sentence-initial children of the source event that were annotated as adverbial modifiers, verb modifiers or conjunctions were considered. 3 Evaluation To evaluate the validity of the approach, three fundamental questions need to be addressed: significance (are there significant correlations between pairs of events and relation words ?), reproducibility (can these correlations confirmed on independent data sets ?), and interpretability (can these correlations be interpreted in terms of theoretically-defined discourse relations ?). 3.1 Significance and Reproducibility Significance tests are part of the pruning stage of the algorithm. Therefore, the number of triples eventually retrieved confirms the existence of statistically significant correlations between pairs of events and relation words. The left column of Tab. 
1 shows the number of triples obtained from PukWaC subcorpora of different size. For reproducibility, compare the triples identified with Wackypedia EN and PukWaC subcorpora of different size: Table 1 shows the number of triples found in both Wackypedia EN and PukWaC, and the agreement between both resources. For two triples involving the same events (event types) and the same relation word, agreement means that the relation word shows either positive or negative correlation 215 TasPbe13u7l4n2k98t. We254Mn1a c:CeAs(gurb42)et760cr8m,iop3e61r4l28np0st6uwicho21rm9W,e2673mas048p7c3okenytpdoagi21p8r,o35eE0s29Nit36nvgreipol8796r50s9%.n3509egative correlation of event pairs and relation words between Wackypedia EN and PukWaC subcorpora of different size TBH: thb ouetwnev r17 t1,o27,t0a95P41 ul2kWv6aCs,8.0 Htr5iple1v s, 45.12T35av9sg7.reH7em nv6 ts62(. %.9T2) Table 2: Agreement between but (B), however (H) and then (T) on PukWaC in both corpora, disagreement means positive correlation in one corpus and negative correlation in the other. Table 1 confirms that results obtained on one resource can be reproduced on another. This indicates that triples indeed capture context-invariant, and hence, semantic, characteristics of the relation between events. The data also indicates that reproducibility increases with the size of corpora from which a BKB is built. 3.2 Interpretability Any theory of discourse relations would predict that relation words with similar function should have similar distributions, whereas one would expect different distributions for functionally unrelated relation words. These expectations are tested here for three of the most frequent relation words found in the corpora, i.e., but, then and however. But and however can be grouped together under a generalized notion of contrast (Knott and Dale 1994; Prasad et al. 2008); then, on the other hand, indicates a tem- poral and/or causal relation. 
Table 2 confirms the expectation that event pairs that are correlated with but tend to show the same correlation with however, but not with then.

4 Discussion and Outlook

This paper described a novel approach to the unsupervised acquisition of discourse relations, with encouraging preliminary results: large collections of parsed text are used to assess distributional profiles of relation words that indicate discourse relations possible between specific types of events; on this basis, a background knowledge base (BKB) was created that can be used to predict an appropriate discourse marker to connect two utterances with no overt relation word. This information can be used, for example, to facilitate the semi-automated annotation of discourse relations, by pointing out the ‘default’ relation word for a given pair of events.

Similarly, Zhou et al. (2010) used a language model to predict discourse markers for implicitly realized discourse relations. As opposed to this shallow, n-gram-based approach, here the internal structure of utterances is exploited: based on semantic considerations, syntactic patterns have been devised that extract triples of event pairs and relation words. The resulting BKB provides a distributional approximation of the discourse relations that can hold between two specific event types. Both approaches exploit complementary sources of knowledge, and may be combined with each other to achieve a more precise prediction of implicit discourse connectives.

The validity of the approach was evaluated with respect to three criteria: the extracted associations between relation words and event pairs could be shown to be statistically significant and to be reproducible on other corpora; for three highly frequent relation words, theoretical predictions about their relative distribution could be confirmed, indicating their interpretability in terms of presupposed taxonomies of discourse relations.
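The envisaged use of the BKB — suggesting a ‘default’ relation word for a pair of events with no overt connective — can be sketched as follows. The data layout, helper names, and scores are invented for illustration; the paper only specifies that the BKB associates event-pair/relation-word triples with positive or negative correlations.

```python
from collections import defaultdict

def build_bkb(scored_triples):
    """Index scored (event1, relation_word, event2, score) triples as
    (event1, event2) -> {relation_word: association score}.
    Positive scores stand for positive correlation, negative for negative."""
    bkb = defaultdict(dict)
    for event1, word, event2, score in scored_triples:
        bkb[(event1, event2)][word] = score
    return dict(bkb)

def suggest_marker(bkb, event1, event2):
    """'Default' relation word for a pair of event types: the word with the
    highest positive association, or None if the pair is unknown."""
    candidates = bkb.get((event1, event2), {})
    positive = {w: s for w, s in candidates.items() if s > 0}
    return max(positive, key=positive.get) if positive else None
```

In a semi-automated annotation setting, `suggest_marker(bkb, "fall", "rise")` would propose the connective to insert between two adjacent utterances whose main events are *fall* and *rise*.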
Another prospective field of application lies in NLP systems, where selection preferences for relation words may serve as a cheap replacement for full-fledged discourse parsing. In the Natural Language Understanding domain, the BKB may help to disambiguate or to identify discourse relations between different events; in the context of Machine Translation, it may represent a factor guiding the insertion of relation words, a task that has been found to be problematic for languages that differ in their inventory and usage of discourse markers, e.g., German and English (Stede and Schmitz 2000). The approach is language-independent (except for the syntactic extraction patterns), and it does not require manually annotated data. It would thus be easy to create background knowledge bases with relation words for other languages or specific domains, given a sufficient amount of textual data.

Related research includes, for example, the unsupervised recognition of causal and temporal relationships, as required, for instance, for the recognition of textual entailment. Riaz and Girju (2010) exploit distributional information about pairs of utterances. Unlike the approach described here, they are not restricted to adjacent utterances, and do not rely on explicit and recurrent relation words. Their approach can thus be applied to comparably small data sets. However, they are restricted to a specific type of relation, whereas here the entire bandwidth of discourse relations that are explicitly realized in a language is covered. Prospectively, both approaches could be combined to compensate for their respective weaknesses.

Similar observations can be made with respect to Chambers and Jurafsky (2009) and Kasch and Oates (2010), who also study a single discourse relation (narration), and are thus more limited in scope than the approach described here.
However, as their approach extends beyond pairs of events to complex event chains, it seems that both approaches provide complementary types of information, and their results could also be combined in a fruitful way to achieve a more detailed assessment of discourse relations.

The goal of this paper was to evaluate the methodological validity of the approach. It thus represents the basis for further experiments, e.g., with respect to the enrichment of the BKB with information provided by Riaz and Girju (2010), Chambers and Jurafsky (2009) and Kasch and Oates (2010). Other directions of subsequent research may include more elaborate models of events, and the investigation of the relationship between relation words and taxonomies of discourse relations.

Acknowledgments

This work was supported by a fellowship within the Postdoc program of the German Academic Exchange Service (DAAD). Initial experiments were conducted at the Collaborative Research Center (SFB) 632 “Information Structure” at the University of Potsdam, Germany. I would also like to thank three anonymous reviewers for valuable comments and feedback, as well as Manfred Stede and Ed Hovy, whose work on discourse relations on the one hand and proposition stores on the other has been the main inspiration for this paper.

References

M. Baroni, S. Bernardini, A. Ferraresi, and E. Zanchetta. The WaCky wide web: a collection of very large linguistically processed web-crawled corpora. Language Resources and Evaluation, 43(3):209–226, 2009.

N. Chambers and D. Jurafsky. Unsupervised learning of narrative schemas and their participants. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, pages 602–610. Association for Computational Linguistics, 2009.

A. Ferraresi, E. Zanchetta, M. Baroni, and S. Bernardini.
Introducing and evaluating ukWaC, a very large web-derived corpus of English. In Proceedings of the 4th Web as Corpus Workshop (WAC-4): Can we beat Google?, pages 47–54, 2008.

Morton Ann Gernsbacher, Rachel R. W. Robertson, Paola Palladino, and Necia K. Werner. Managing mental representations during narrative comprehension. Discourse Processes, 37(2):145–164, 2004.

N. Kasch and T. Oates. Mining script-like structures from the web. In Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading, pages 34–42. Association for Computational Linguistics, 2010.

A. Knott and R. Dale. Using linguistic phenomena to motivate a set of coherence relations. Discourse Processes, 18(1):35–62, 1994.

J. van Kuppevelt and R. Smith, editors. Current Directions in Discourse and Dialogue. Kluwer, Dordrecht, 2003.

William C. Mann and Sandra A. Thompson. Rhetorical Structure Theory: Toward a functional theory of text organization. Text, 8(3):243–281, 1988.

J. Nivre, J. Hall, and J. Nilsson. MaltParser: A data-driven parser-generator for dependency parsing. In Proc. of LREC, pages 2216–2219, 2006.

R. Prasad, N. Dinesh, A. Lee, E. Miltsakaki, L. Robaldo, A. Joshi, and B. Webber. The Penn Discourse Treebank 2.0. In Proc. 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, 2008.

M. Riaz and R. Girju. Another look at causality: Discovering scenario-specific contingency relationships with no supervision. In Semantic Computing (ICSC), 2010 IEEE Fourth International Conference on, pages 361–368. IEEE, 2010.

M. Stede and B. Schmitz. Discourse particles and discourse functions. Machine Translation, 15(1):125–147, 2000.

Enric Vallduví. The Informational Component. Garland, New York, 1992.

Bonnie L. Webber. Structure and ostension in the interpretation of discourse deixis. Language and Cognitive Processes, 6(2):107–135, 1991.

Bonnie L. Webber, Matthew Stone, Aravind K.
Joshi, and Alistair Knott. Anaphora and discourse structure. Computational Linguistics, 29(4):545–587, 2003.

Z.-M. Zhou, Y. Xu, Z.-Y. Niu, M. Lan, J. Su, and C. L. Tan. Predicting discourse connectives for implicit discourse relation recognition. In COLING 2010, pages 1507–1514, Beijing, China, August 2010.
6 0.20128137 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
7 0.14510797 135 acl-2012-Learning to Temporally Order Medical Events in Clinical Text
8 0.1191442 191 acl-2012-Temporally Anchored Relation Extraction
9 0.11777196 91 acl-2012-Extracting and modeling durations for habits and events from Twitter
10 0.10347703 101 acl-2012-Fully Abstractive Approach to Guided Summarization
11 0.10283373 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
12 0.094914667 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System
13 0.085150845 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
14 0.083302669 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
15 0.081024371 73 acl-2012-Discriminative Learning for Joint Template Filling
16 0.076379225 150 acl-2012-Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
17 0.072374165 18 acl-2012-A Probabilistic Model for Canonicalizing Named Entity Mentions
18 0.066341631 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
19 0.065856665 50 acl-2012-Collective Classification for Fine-grained Information Status
20 0.063434757 98 acl-2012-Finding Bursty Topics from Microblogs
topicId topicWeight
[(0, -0.178), (1, 0.196), (2, -0.096), (3, 0.258), (4, 0.032), (5, -0.212), (6, -0.0), (7, -0.084), (8, 0.008), (9, -0.128), (10, -0.18), (11, -0.052), (12, 0.006), (13, 0.018), (14, -0.019), (15, -0.022), (16, 0.069), (17, 0.134), (18, -0.18), (19, 0.041), (20, -0.065), (21, 0.149), (22, 0.02), (23, 0.027), (24, -0.065), (25, -0.008), (26, -0.128), (27, 0.079), (28, -0.064), (29, 0.017), (30, 0.2), (31, -0.09), (32, 0.061), (33, -0.004), (34, -0.064), (35, 0.05), (36, -0.0), (37, -0.129), (38, -0.007), (39, -0.093), (40, -0.073), (41, -0.019), (42, 0.038), (43, -0.07), (44, 0.087), (45, 0.139), (46, 0.109), (47, 0.133), (48, 0.126), (49, -0.009)]
simIndex simValue paperId paperTitle
same-paper 1 0.99189448 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
2 0.69780076 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
Author: Wei Lu ; Dan Roth
Abstract: This paper presents a novel sequence labeling model based on the latent-variable semiMarkov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), that allows arbitrary preference to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive to some baseline supervised approaches.
3 0.65549695 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation model (i.e., vector space model) cannot model temporal aspects of documents. To address it, we proposed a novel burst-based text representation model, denoted as BurstVSM. BurstVSM corresponds dimensions to bursty features instead of terms, which can capture semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments in a 10-year news archive show that our methods are both effective and efficient.
4 0.63749349 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
Author: Remy Kessler ; Xavier Tannier ; Caroline Hagege ; Veronique Moriceau ; Andre Bittar
Abstract: We present an approach for detecting salient (important) dates in texts in order to automatically build event timelines from a search query (e.g. the name of an event or person, etc.). This work was carried out on a corpus of newswire texts in English provided by the Agence France Presse (AFP). In order to extract salient dates that warrant inclusion in an event timeline, we first recognize and normalize temporal expressions in texts and then use a machine-learning approach to extract salient dates that relate to a particular topic. We focused only on extracting the dates and not the events to which they are related.
5 0.53279942 135 acl-2012-Learning to Temporally Order Medical Events in Clinical Text
Author: Preethi Raghavan ; Albert Lai ; Eric Fosler-Lussier
Abstract: We investigate the problem of ordering medical events in unstructured clinical narratives by learning to rank them based on their time of occurrence. We represent each medical event as a time duration, with a corresponding start and stop, and learn to rank the starts/stops based on their proximity to the admission date. Such a representation allows us to learn all of Allen’s temporal relations between medical events. Interestingly, we observe that this methodology performs better than a classification-based approach for this domain, but worse on the relationships found in the Timebank corpus. This finding has important implications for styles of data representation and resources used for temporal relation learning: clinical narratives may have different language attributes corresponding to temporal ordering relative to Timebank, implying that the field may need to look at a wider range ofdomains to fully understand the nature of temporal ordering.
6 0.504794 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
7 0.48442692 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
8 0.43125242 91 acl-2012-Extracting and modeling durations for habits and events from Twitter
9 0.35782743 195 acl-2012-The Creation of a Corpus of English Metalanguage
10 0.34727401 49 acl-2012-Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study
11 0.33902577 50 acl-2012-Collective Classification for Fine-grained Information Status
12 0.33303547 73 acl-2012-Discriminative Learning for Joint Template Filling
13 0.32136703 191 acl-2012-Temporally Anchored Relation Extraction
14 0.31266585 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System
15 0.30902311 129 acl-2012-Learning High-Level Planning from Text
16 0.30412707 101 acl-2012-Fully Abstractive Approach to Guided Summarization
17 0.28859246 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
18 0.27731407 215 acl-2012-WizIE: A Best Practices Guided Development Environment for Information Extraction
19 0.24286751 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
20 0.24271409 98 acl-2012-Finding Bursty Topics from Microblogs
topicId topicWeight
[(25, 0.024), (26, 0.03), (28, 0.047), (30, 0.028), (37, 0.029), (39, 0.067), (49, 0.017), (59, 0.021), (74, 0.024), (82, 0.074), (84, 0.016), (85, 0.03), (86, 0.263), (90, 0.133), (92, 0.035), (94, 0.017), (99, 0.067)]
simIndex simValue paperId paperTitle
same-paper 1 0.83748698 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
Author: Xingyuan Peng ; Dengfeng Ke ; Bo Xu
Abstract: Conventional Automated Essay Scoring (AES) measures may cause severe problems when directly applied in scoring Automatic Speech Recognition (ASR) transcription as they are error sensitive and unsuitable for the characteristic of ASR transcription. Therefore, we introduce a framework of Finite State Transducer (FST) to avoid the shortcomings. Compared with the Latent Semantic Analysis with Support Vector Regression (LSA-SVR) method (stands for the conventional measures), our FST method shows better performance especially towards the ASR transcription. In addition, we apply the synonyms similarity to expand the FST model. The final scoring performance reaches an acceptable level of 0.80 which is only 0.07 lower than the correlation (0.87) between human raters.
3 0.75804871 72 acl-2012-Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents
Author: Yashar Mehdad ; Matteo Negri ; Marcello Federico
Abstract: We address a core aspect of the multilingual content synchronization task: the identification of novel, more informative or semantically equivalent pieces of information in two documents about the same topic. This can be seen as an application-oriented variant of textual entailment recognition where: i) T and H are in different languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual entailment system, we report promising results on different datasets.
4 0.63237113 150 acl-2012-Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
Author: Sungchul Kim ; Kristina Toutanova ; Hwanjo Yu
Abstract: In this paper we propose a method to automatically label multi-lingual data with named entity tags. We build on prior work utilizing Wikipedia metadata and show how to effectively combine the weak annotations stemming from Wikipedia metadata with information obtained through English-foreign language parallel Wikipedia sentences. The combination is achieved using a novel semi-CRF model for foreign sentence tagging in the context of a parallel English sentence. The model outperforms both standard annotation projection methods and methods based solely on Wikipedia metadata.
5 0.5827952 187 acl-2012-Subgroup Detection in Ideological Discussions
Author: Amjad Abu-Jbara ; Pradeep Dasigi ; Mona Diab ; Dragomir Radev
Abstract: The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.
6 0.58065993 12 acl-2012-A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
7 0.57591772 191 acl-2012-Temporally Anchored Relation Extraction
9 0.56284088 188 acl-2012-Subgroup Detector: A System for Detecting Subgroups in Online Discussions
10 0.56200737 57 acl-2012-Concept-to-text Generation via Discriminative Reranking
11 0.55987155 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
12 0.55738807 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
13 0.55575377 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
14 0.55284059 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
15 0.54938883 73 acl-2012-Discriminative Learning for Joint Template Filling
16 0.54888535 62 acl-2012-Cross-Lingual Mixture Model for Sentiment Classification
17 0.54687846 40 acl-2012-Big Data versus the Crowd: Looking for Relationships in All the Right Places
18 0.54624033 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
19 0.54548424 37 acl-2012-Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
20 0.54496378 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling