acl acl2012 acl2012-17 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation models (i.e., the vector space model) cannot capture the temporal aspects of documents. To address this, we propose a novel burst-based text representation model, denoted BurstVSM. BurstVSM maps dimensions to bursty features instead of terms, which captures both semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments on a 10-year news archive show that our methods are both effective and efficient.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract. Mining retrospective events from text streams has been an important research topic. [sent-4, score-0.285]
2 Classic text representation models (i.e., the vector space model) cannot capture the temporal aspects of documents. [sent-7, score-0.149]
3 To address this, we propose a novel burst-based text representation model, denoted BurstVSM. [sent-8, score-0.199]
4 BurstVSM maps dimensions to bursty features instead of terms, which captures both semantic and temporal information. [sent-9, score-0.779]
5 Meanwhile, it significantly reduces the number of non-zero entries in the representation. [sent-10, score-0.034]
6 We test it via scalable event detection, and experiments in a 10-year news archive show that our methods are both effective and efficient. [sent-11, score-0.403]
7 One standard way to do this is to cluster news articles into events by following a two-step approach (Yang et al., 1998): [sent-16, score-0.3]
8 1) represent documents as vectors and calculate similarities between documents; 2) run a clustering algorithm to obtain document clusters as events.1 [sent-17, score-0.363]
9 Underlying text representation often plays a critical role in this approach, especially for long text streams. [sent-18, score-0.172]
10 In this paper, our focus is to study how to represent temporal documents effectively for event detection. [sent-19, score-0.47]
11 Classic text representation methods, e.g., the Vector Space Model (VSM), have a few shortcomings when dealing with temporal documents. [sent-22, score-0.15]
12 The major one is that it maps one dimension to one term, which completely ignores temporal information, and therefore VSM can never capture the evolving trends in text streams. [sent-23, score-0.336]
13 1Post-processing may also be needed on document clusters to refine the results. [sent-25, score-0.156]
14 presidential election, although general terms correspond to events in different periods. [sent-38, score-0.237]
15 Temporal information has to be taken into consideration for event detection. [sent-41, score-0.263]
16 The vocabulary size, i.e., the number of dimensions in VSM, can be very large, which requires a considerable amount of space for storage and time for downstream processing. [sent-44, score-0.122]
17 To address these difficulties, in this paper we propose a burst-based text representation method for scalable event detection. [sent-45, score-0.651]
18 The major novelty is to naturally incorporate temporal information into the dimensions themselves instead of using external time decaying functions (Yang et al., 1998). [sent-46, score-0.362]
19 We instantiate this idea by using bursty features as basic representation units of documents. [sent-48, score-0.637]
20 In this paper, a bursty feature refers to a sudden surge in the frequency of a single term in a text stream; it is represented as the term itself together with the time interval during which the burst takes place. [sent-49, score-1.106]
21 For example, (Olympic, Aug-08-2008, Aug-24-2008)2 can be regarded as a bursty feature. [sent-50, score-0.531]
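As a minimal sketch (this is our illustration, not code from the paper), a bursty feature can be modeled as a term plus its bursty interval; the record type and helper names below are ours:

```python
from collections import namedtuple
from datetime import date

# Illustrative sketch: a bursty feature is a term together with the
# time interval during which its burst takes place.
BurstyFeature = namedtuple("BurstyFeature", ["term", "start", "end"])

olympic = BurstyFeature("Olympic", date(2008, 8, 8), date(2008, 8, 24))

def in_burst(feature, timestamp):
    # A document timestamp activates this feature only when it falls
    # inside the bursty interval.
    return feature.start <= timestamp <= feature.end

assert in_burst(olympic, date(2008, 8, 15))
assert not in_burst(olympic, date(2008, 9, 1))
```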
22 We also call the term in a bursty feature its bursty term. [sent-51, score-0.608]
23 2Beijing 2008 Olympic Games. [sent-53, score-0.555]
24 In our model, each dimension corresponds to a bursty feature, which contains both temporal and semantic information. [sent-54, score-0.732]
25 Bursty features capture and reflect the evolving topic trends, which can be learnt by searching surge patterns in stream data (Kleinberg, 2003). [sent-55, score-0.158]
26 Built on bursty features, our representation model can well adapt to text streams with complex trends, and therefore provides a more reasonable temporal document representation. [sent-56, score-0.935]
27 We further propose a split-cluster-merge algorithm to generate clusters as events. [sent-57, score-0.069]
28 2 Burst-based Text Representation. In this section, we describe the proposed burst-based text representation model, denoted as BurstVSM. [sent-60, score-0.199]
29 In BurstVSM, each document is represented as one vector as in VSM, while the major novelty is that one dimension is mapped to one bursty feature instead of one term. [sent-61, score-0.832]
30 In this paper, we define a bursty feature f as a triplet (w_f, t_s, t_e), where w_f is the bursty term and t_s and t_e are the start and end timestamps of the bursty interval (period). [sent-62, score-1.818]
31 Before introducing BurstVSM, we first discuss how to identify bursty features from text streams. [sent-63, score-0.564]
32 2.1 Burst Detection Algorithm. We follow the batch-mode two-state automaton method from (Kleinberg, 2003) for bursty feature detection.3 [sent-65, score-0.691]
33 In this model, the stream of documents containing a term w is assumed to be generated from a two-state automaton with a low-frequency state q0 and a high-frequency state q1. [sent-66, score-0.324]
34 Each state has its own emission rate (p0 and p1, respectively), and there is a probability of changing state. [sent-67, score-0.035]
35 If an interval of high states appears in the optimal state sequence of some term, this term together with this interval is detected as a bursty feature. [sent-68, score-0.809]
36 To obtain all bursty features in text streams, we can perform burst detection on each term in the vocabulary. [sent-69, score-0.968]
37 Instead of using fixed p0 and p1 as in (Kleinberg, 2003), we follow the moving average method (Vlachos et al.) to estimate time-variant emission rates; we refer to the resulting detector as TVBurst. [sent-70, score-0.146] 3The news articles in one day are treated as a batch.
38 Given a term w, we use a sliding window of length L to estimate p0(t) and p1(t) for the t-th batch as follows: p0(t) = (Σ_{j∈W_t} N_{j,w}) / (Σ_{j∈W_t} N_j) and p1(t) = p0(t) × s, where N_{j,w} and N_j are w's document frequency and the total number of documents in the j-th batch, respectively. [sent-73, score-0.455]
39 Here s is larger than 1.0, indicating that state q1 has a faster rate; its value is set empirically. [sent-75, score-0.035]
40 W_t is the time interval [max(t − L/2, 0), min(t + L/2, N)], and the length of the moving window L is set as 180 days. [sent-77, score-0.083]
41 All the other parts remain the same as in (Kleinberg, 2003). [sent-78, score-0.034]
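The following is a runnable sketch of this batch-mode detector under stated assumptions: binomial emissions per batch, a fixed cost gamma for entering the high state (these parts follow Kleinberg (2003) in spirit but are not spelled out in this extract), and an illustrative s = 1.5, since the exact value of s is truncated above. All names are ours, not the authors':

```python
import math

def binomial_nll(k, n, p):
    # Negative log-likelihood of k term-bearing documents out of n under
    # rate p (the binomial coefficient is identical for both states and
    # therefore omitted).
    p = min(max(p, 1e-12), 1 - 1e-12)
    return -(k * math.log(p) + (n - k) * math.log(1 - p))

def tvburst(doc_freq, batch_sizes, L=180, s=1.5, gamma=1.0):
    """Detect bursty intervals of one term with a two-state automaton.

    doc_freq[t]    -- documents in batch t containing the term (N_{t,w})
    batch_sizes[t] -- total documents in batch t (N_t)
    L, s, gamma    -- window length, rate scaling, state-change cost
                      (s and gamma are illustrative values here).
    """
    N = len(doc_freq)
    # Time-variant emission rates from a moving average over window W_t.
    p0, p1 = [], []
    for t in range(N):
        lo, hi = max(t - L // 2, 0), min(t + L // 2, N - 1)
        rate = sum(doc_freq[lo:hi + 1]) / max(sum(batch_sizes[lo:hi + 1]), 1)
        p0.append(rate)
        p1.append(rate * s)
    # Viterbi over states q0 (low) and q1 (high); entering q1 costs gamma.
    INF = float("inf")
    cost = [[INF, INF] for _ in range(N)]
    back = [[0, 0] for _ in range(N)]
    cost[0][0] = binomial_nll(doc_freq[0], batch_sizes[0], p0[0])
    cost[0][1] = gamma + binomial_nll(doc_freq[0], batch_sizes[0], p1[0])
    for t in range(1, N):
        for q, p in ((0, p0[t]), (1, p1[t])):
            emit = binomial_nll(doc_freq[t], batch_sizes[t], p)
            for prev in (0, 1):
                trans = gamma if (prev, q) == (0, 1) else 0.0
                c = cost[t - 1][prev] + trans + emit
                if c < cost[t][q]:
                    cost[t][q], back[t][q] = c, prev
    # Recover the optimal state sequence.
    states = [0] * N
    states[-1] = 0 if cost[-1][0] <= cost[-1][1] else 1
    for t in range(N - 1, 0, -1):
        states[t - 1] = back[t][states[t]]
    # Maximal runs of the high state q1 are the bursty intervals.
    bursts, start = [], None
    for t, q in enumerate(states):
        if q == 1 and start is None:
            start = t
        elif q == 0 and start is not None:
            bursts.append((start, t - 1))
            start = None
    if start is not None:
        bursts.append((start, N - 1))
    return bursts
```

Running this on each vocabulary term then yields the bursty feature set B used below.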
42 2.2 Burst-based Text Representation Models. We apply TVBurst to all the terms in our vocabulary to identify a set of bursty features, denoted as B. [sent-81, score-0.73]
43 Given B, a document di with timestamp t is represented as a vector of weights in bursty feature dimensions: [sent-82, score-0.127]
44 di(t) = (d_{i,1}(t), d_{i,2}(t), ..., d_{i,|B|}(t)). [sent-83, score-0.555]
45 We define the jth weight of di as follows: d_{i,j}(t) = tf-idf_{i,w_{B_j}} if t ∈ [t_s^{B_j}, t_e^{B_j}], and 0 otherwise. [sent-87, score-0.12]
46 When the timestamp of di falls in the bursty interval of the bursty term w_{B_j}, we set the weight of dimension B_j using the commonly used tf-idf method. [sent-88, score-0.089]
47 In BurstVSM, each dimension is mapped to one bursty feature, and it considers both semantic and temporal information. [sent-89, score-0.759]
48 One dimension is active only when the document falls in the corresponding bursty interval. [sent-90, score-0.693]
49 Usually, a document vector in BurstVSM has only a few non-zero entries, which makes computation of document similarities more efficient in large datasets compared with traditional VSM. [sent-91, score-0.25]
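A minimal sketch of building such a sparse vector, assuming B is given as (term, t_s, t_e) triples and idf values are precomputed on the collection; the raw tf × idf weighting is one common tf-idf variant, and all names are illustrative:

```python
import math
from collections import Counter

def burstvsm_vector(tokens, timestamp, bursty_features, idf):
    """Represent one document d_i as a sparse BurstVSM vector.

    tokens          -- list of terms in the document
    timestamp       -- the document's timestamp t
    bursty_features -- list of (term, t_s, t_e) triples (the set B)
    idf             -- dict mapping term -> idf, precomputed on the collection
    Returns {feature index j: tf-idf weight}; an entry exists only if the
    document falls inside the feature's bursty interval and contains its term.
    """
    tf = Counter(tokens)
    vec = {}
    for j, (term, ts, te) in enumerate(bursty_features):
        if ts <= timestamp <= te and tf[term] > 0:
            vec[j] = tf[term] * idf.get(term, 0.0)
    return vec

def cosine(u, v):
    # Cosine similarity of two sparse vectors stored as dicts.
    dot = sum(w * v[j] for j, w in u.items() if j in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Because each document activates only the few features whose intervals cover its timestamp, the resulting vectors stay sparse, which is what makes similarity computation cheap at this scale.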
50 The most closely related model, boostVSM (He et al., 2007b), proposes to weight different term dimensions with corresponding bursty scores. [sent-93, score-0.173]
51 However, it is still based on term dimensions and fails to deal with terms with multiple bursts. [sent-94, score-0.199]
52 Suppose that we are dealing with a text collection related to U.S. presidential elections. [sent-95, score-0.08]
53 In BurstVSM, one term with multiple bursts will be naturally mapped to different dimensions. [sent-99, score-0.149]
54 For example, bursty features of the same term in different periods, such as (election, ..., 2008), correspond to different dimensions in BurstVSM. [sent-102, score-0.267]
55 (Figure 2 caption: One example for comparisons of different representation methods; terms in the red box correspond to multiple bursty periods.) [sent-103, score-0.531]
56 Here, dimension reduction refers to the reduction of non-zero entries in the representation vector; the resulting clusters are returned as identified events. [sent-105, score-0.101]
57 Experiment Setup. We used a subset of 68 million deduplicated timestamped web pages generated from our 10-year web archive (Huang et al.). [sent-107, score-0.085]
58 Since our major focus is to detect events from news articles, we only keep the web pages with the keyword "news" in the URL field. [sent-109, score-0.211]
59 The final collection contains 11,218,581 articles with a total of 1,730,984,304 tokens, ranging from 2000 to 2009. [sent-110, score-0.059]
60 For the split-cluster-merge algorithm, we implement the cluster step in a multi-threaded way. [sent-112, score-0.11]
61 [Section heading: Manual construction of the test collection for event detection] VSM and boostVSM cannot capture such temporal differences. [sent-115, score-0.43]
62 Some methods try to design time decaying functions (Yang et al., 1998). [sent-116, score-0.049]
63 3 Split-cluster-merge Algorithm for Event Detection. In this section, we discuss how to cluster documents as events. [sent-120, score-0.546]
64 Since each document can be represented as a burst-based vector, we use the cosine function to compute document similarities. [sent-121, score-0.174]
65 Due to the large size of our news corpus, it is infeasible to cluster all the documents directly. [sent-122, score-0.212]
66 We develop a heuristic clustering algorithm for event detection, denoted as split-cluster-merge, which includes three main steps, namely split, cluster and merge. [sent-123, score-0.489]
67 The idea is that we first split the dataset into small parts, then cluster the documents of each part independently, and finally merge similar clusters from two consecutive parts. [sent-124, score-0.296]
68 In our dataset, we find that most events last no more than one month, so we split the dataset into parts by months. [sent-125, score-0.167]
69 After splitting, clustering can run in parallel for different parts (we use CLUTO4 as the clustering tool), which significantly reduces total time cost. [sent-126, score-0.168]
70 For the merge step, we merge clusters in consecutive months using an empirical similarity threshold; a sketch of the whole procedure follows. [sent-127, score-0.139]
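A high-level sketch of split-cluster-merge under these assumptions: documents arrive as (month, sparse-vector) pairs, cluster_fn stands in for the CLUTO call, cosine is the helper from the earlier sketch, and threshold is left as a parameter because its exact value is truncated in this extract:

```python
from collections import defaultdict

def centroid(vectors):
    # Mean of sparse vectors (dicts mapping index -> weight).
    c = defaultdict(float)
    for v in vectors:
        for j, w in v.items():
            c[j] += w / len(vectors)
    return dict(c)

def split_cluster_merge(docs, cluster_fn, threshold):
    # Split: partition documents by month, since most events in the
    # dataset last no more than one month.
    by_month = defaultdict(list)
    for month, vec in docs:
        by_month[month].append(vec)
    # Cluster: each month independently; this loop is the part that can
    # run in parallel (the paper uses CLUTO as the clustering tool).
    months = sorted(by_month)
    clusters = {m: cluster_fn(by_month[m]) for m in months}
    # Merge: attach each cluster to the most similar cluster carried
    # over from the previous month if similarity exceeds the threshold.
    events = [list(c) for c in clusters[months[0]]] if months else []
    prev = list(events)
    for m in months[1:]:
        cur = []
        for c2 in clusters[m]:
            best, best_sim = None, threshold
            for c1 in prev:
                sim = cosine(centroid(c1), centroid(c2))
                if sim >= best_sim:
                    best, best_sim = c1, sim
            if best is not None:
                best.extend(c2)   # continue an existing event
                cur.append(best)
            else:
                ev = list(c2)     # start a new event
                events.append(ev)
                cur.append(ev)
        prev = cur
    return events
```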
71 To examine the effectiveness of event detection methods at different granularities, we consider two types of events in terms of the number of relevant documents, namely significant events and moderate events. [sent-133, score-0.779]
72 A significant event is required to have at least 300 relevant docs, and a moderate event is required to have 10 ∼ 100 relevant docs. [sent-134, score-0.669]
73 Fourteen graduate students constructed the test collection, starting with a list of 100 candidate seed events by referring to Xinhua News.5 [sent-136, score-0.133]
74 For one target event, the judges first construct queries with temporal constraints to retrieve candidate documents and then judge whether they are relevant or not. [sent-137, score-0.273]
75 Each document is assigned to three students, and we adopt the majority-win strategy for the final judgment. [sent-138, score-0.087]
76 Finally, by removing all candidate seed events that belong to neither significant nor moderate events, we derive a test collection consisting of 24 significant events and 40 moderate events. [sent-139, score-0.54]
77 Evaluation Metrics and Baselines. Similar to evaluation in information retrieval, given a target event, we evaluate the quality of the "relevant" documents returned by systems. [sent-141, score-0.107]
78 We use average precision, average recall and mean average precision (MAP) as evaluation metrics. [sent-142, score-0.081]
79 A difference is that we do not have queries, and the output of a system is a set of document clusters. [sent-143, score-0.087]
80 So for a system, given an event in the gold standard, we first select the cluster (among those the system generates) that has the most relevant documents. [sent-144, score-0.376] 5http://news.
81 We then sort the documents in descending order of similarity with the cluster centroid and finally compute P, R, F and MAP on this cluster. [sent-208, score-0.252]
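A sketch of this per-event evaluation protocol, assuming gold relevance judgments are given as sets of document ids and reusing centroid and cosine from the sketches above; names are illustrative:

```python
def evaluate_event(gold_docs, clusters, doc_vectors):
    """gold_docs   -- set of relevant doc ids for one target event
    clusters    -- list of lists of doc ids produced by a system
    doc_vectors -- dict mapping doc id -> sparse vector
    Returns (precision, recall, F1, average precision) for the event."""
    # Select the system cluster containing the most relevant documents.
    best = max(clusters, key=lambda c: len(set(c) & gold_docs))
    center = centroid([doc_vectors[d] for d in best])
    # Rank cluster members by descending similarity to the centroid.
    ranked = sorted(best, key=lambda d: cosine(doc_vectors[d], center),
                    reverse=True)
    # Precision, recall, F over the selected cluster.
    hits = len(set(best) & gold_docs)
    p = hits / len(best) if best else 0.0
    r = hits / len(gold_docs) if gold_docs else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    # Average precision over the similarity ranking (MAP averages this
    # quantity over all events).
    num_rel, ap = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in gold_docs:
            num_rel += 1
            ap += num_rel / rank
    ap = ap / len(gold_docs) if gold_docs else 0.0
    return p, r, f, ap
```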
82 We used the event detection method of (Swan and Allan, 2000) as the baseline, denoted timemines-χ2. [sent-210, score-0.449]
83 Since BurstVSM relies on bursty features as dimensions, we tested different burst detection algorithms within the BurstVSM model, including swan (Swan and Allan, 2000), kleinberg (Kleinberg, 2003) and our proposed TVBurst algorithm. [sent-212, score-1.092]
84 We use TVBurst as the default burst detection algorithm in later experiments. [sent-216, score-0.327]
85 Then we compare the performance of different text representation models for event detection, namely BurstVSM and boostVSM (He et al., 2007b).7 [sent-217, score-0.425]
86 For the different representation models, we use split-cluster-merge as the clustering algorithm. [sent-220, score-0.173]
87 Table 2 shows that BurstVSM is much more effective than boostVSM for event detection. [sent-221, score-0.263]
88 ...clustering documents in a coarse grain. [sent-231, score-0.172]
89 In our methods, event detection is treated as document clustering. [sent-235, score-0.476]
90 It is very important to study how similarities affect the performance of clustering. [sent-236, score-0.053]
91 To see why our proposed representation methods are better than boostVSM, we present the average intra-class similarity and inter-class similarity for different events in Table 3.8 [sent-237, score-0.326]
92 We can see that BurstVSM results in a larger intra-class similarity and a smaller inter-class similarity than boostVSM. [sent-238, score-0.06]
93 We further analyze the space/time complexity of different representation models. [sent-240, score-0.106]
94 We can see that BurstVSM has a much smaller space/time cost compared with boostVSM, while also achieving better performance for event detection (see Table 2). [sent-242, score-0.42]
95 In burst-based representation, one document has fewer non-zero entries. [sent-243, score-0.087]
96 The core idea of this work was initiated and developed by Kai Fan. [sent-245, score-0.049]
97 We have developed an online large-scale Chinese event search engine based on this work; visit http://sewm. [sent-249, score-0.312]
98 8For each event in our gold standard, we have two clusters: relevant documents and non-relevant documents (within the event period). [sent-253, score-0.686]
99 Using burstiness to improve clustering of topics in news streams. [sent-269, score-0.146]
100 Identifying similarities, periodicities and bursts for online search queries. [sent-293, score-0.045]
wordName wordTfidf (topN-words)
[('bursty', 0.531), ('burstvsm', 0.42), ('event', 0.263), ('burst', 0.201), ('boostvsm', 0.196), ('tvburst', 0.168), ('vsm', 0.134), ('events', 0.133), ('detection', 0.126), ('temporal', 0.126), ('swan', 0.122), ('dimensions', 0.122), ('kleinberg', 0.112), ('representation', 0.106), ('presidential', 0.104), ('allan', 0.098), ('document', 0.087), ('interval', 0.083), ('documents', 0.081), ('term', 0.077), ('cluster', 0.076), ('dimension', 0.075), ('batch', 0.072), ('clusters', 0.069), ('retrospective', 0.067), ('clustering', 0.067), ('denoted', 0.06), ('moderate', 0.059), ('stream', 0.057), ('batmanfly', 0.056), ('olympic', 0.056), ('surge', 0.056), ('vlachos', 0.056), ('news', 0.055), ('similarities', 0.053), ('election', 0.052), ('streams', 0.052), ('xin', 0.05), ('decaying', 0.049), ('developped', 0.049), ('scalable', 0.048), ('bursts', 0.045), ('evolving', 0.045), ('merge', 0.043), ('relevant', 0.042), ('kai', 0.042), ('novelty', 0.042), ('te', 0.041), ('di', 0.04), ('automaton', 0.039), ('jth', 0.039), ('november', 0.039), ('comparisons', 0.039), ('fung', 0.037), ('golden', 0.037), ('archive', 0.037), ('yang', 0.037), ('articles', 0.036), ('state', 0.035), ('trends', 0.034), ('entries', 0.034), ('parts', 0.034), ('text', 0.033), ('zhao', 0.032), ('meanwhile', 0.031), ('similarity', 0.03), ('moving', 0.028), ('china', 0.028), ('average', 0.027), ('consecutive', 0.027), ('tth', 0.027), ('mapped', 0.027), ('returned', 0.026), ('mode', 0.025), ('beihang', 0.024), ('burstiness', 0.024), ('crs', 0.024), ('deduplicated', 0.024), ('dimitrios', 0.024), ('grain', 0.024), ('hubert', 0.024), ('irnm', 0.024), ('lavrenko', 0.024), ('lxm', 0.024), ('meek', 0.024), ('sudden', 0.024), ('tial', 0.024), ('timestamped', 0.024), ('wether', 0.024), ('wilcoxon', 0.024), ('wofe', 0.024), ('feature', 0.024), ('dealing', 0.024), ('cn', 0.024), ('collection', 0.023), ('major', 0.023), ('namely', 0.023), ('period', 0.023), ('vector', 0.023), ('tef', 0.022)]
simIndex simValue paperId paperTitle
same-paper 1 1.0000002 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation models (i.e., the vector space model) cannot capture the temporal aspects of documents. To address this, we propose a novel burst-based text representation model, denoted BurstVSM. BurstVSM maps dimensions to bursty features instead of terms, which captures both semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments on a 10-year news archive show that our methods are both effective and efficient.
2 0.47413298 98 acl-2012-Finding Bursty Topics from Microblogs
Author: Qiming Diao ; Jing Jiang ; Feida Zhu ; Ee-Peng Lim
Abstract: Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. The former helps find eventdriven posts while the latter helps identify and filter out “personal” posts. Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than an LDA baseline and two degenerate variations of our model. We also show some case studies that demonstrate the importance of considering both the temporal information and users’ personal interests for bursty topic detection from microblogs.
3 0.25880858 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
4 0.15232083 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
Author: Oleksandr Kolomiyets ; Steven Bethard ; Marie-Francine Moens
Abstract: We propose a new approach to characterizing the timeline of a text: temporal dependency structures, where all the events of a narrative are linked via partial ordering relations like BEFORE, AFTER, OVERLAP and IDENTITY. We annotate a corpus of children’s stories with temporal dependency trees, achieving agreement (Krippendorff’s Alpha) of 0.856 on the event words, 0.822 on the links between events, and of 0.700 on the ordering relation labels. We compare two parsing models for temporal dependency structures, and show that a deterministic non-projective dependency parser outperforms a graph-based maximum spanning tree parser, achieving labeled attachment accuracy of 0.647 and labeled tree edit distance of 0.596. Our analysis of the dependency parser errors gives some insights into future research directions.
5 0.14733973 191 acl-2012-Temporally Anchored Relation Extraction
Author: Guillermo Garrido ; Anselmo Penas ; Bernardo Cabaleiro ; Alvaro Rodrigo
Abstract: Although much work on relation extraction has aimed at obtaining static facts, many of the target relations are actually fluents, as their validity is naturally anchored to a certain time period. This paper proposes a methodological approach to temporally anchored relation extraction. Our proposal performs distant supervised learning to extract a set of relations from a natural language corpus, and anchors each of them to an interval of temporal validity, aggregating evidence from documents supporting the relation. We use a rich graph-based document-level representation to generate novel features for this task. Results show that our implementation for temporal anchoring is able to achieve 69% of the upper bound performance imposed by the relation extraction step. Compared to the state of the art, the overall system achieves the highest precision reported.
6 0.14185308 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
7 0.12341409 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
8 0.10774799 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
9 0.10156913 135 acl-2012-Learning to Temporally Order Medical Events in Clinical Text
10 0.083314329 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
11 0.079136565 91 acl-2012-Extracting and modeling durations for habits and events from Twitter
12 0.078730837 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
13 0.068424366 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
14 0.063908719 208 acl-2012-Unsupervised Relation Discovery with Sense Disambiguation
15 0.061270013 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System
16 0.054786913 120 acl-2012-Information-theoretic Multi-view Domain Adaptation
17 0.053218074 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
18 0.050534479 48 acl-2012-Classifying French Verbs Using French and English Lexical Resources
19 0.049605832 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes
20 0.048268594 192 acl-2012-Tense and Aspect Error Correction for ESL Learners Using Global Context
topicId topicWeight
[(0, -0.161), (1, 0.172), (2, -0.002), (3, 0.209), (4, -0.092), (5, -0.182), (6, 0.034), (7, -0.121), (8, 0.038), (9, -0.15), (10, -0.205), (11, 0.015), (12, 0.083), (13, 0.082), (14, -0.048), (15, -0.019), (16, 0.084), (17, 0.124), (18, -0.051), (19, 0.092), (20, -0.028), (21, 0.156), (22, 0.013), (23, 0.061), (24, -0.126), (25, -0.11), (26, -0.127), (27, 0.152), (28, -0.001), (29, 0.04), (30, 0.337), (31, -0.238), (32, -0.193), (33, -0.087), (34, -0.102), (35, -0.168), (36, 0.089), (37, -0.079), (38, -0.094), (39, 0.115), (40, 0.118), (41, 0.093), (42, 0.097), (43, 0.121), (44, 0.002), (45, -0.036), (46, -0.029), (47, -0.062), (48, -0.126), (49, -0.011)]
simIndex simValue paperId paperTitle
same-paper 1 0.95695847 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation model (i.e., vector space model) cannot model temporal aspects of documents. To address it, we proposed a novel burst-based text representation model, denoted as BurstVSM. BurstVSM corresponds dimensions to bursty features instead of terms, which can capture semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments in a 10-year news archive show that our methods are both effective and efficient.
2 0.67345536 98 acl-2012-Finding Bursty Topics from Microblogs
Author: Qiming Diao ; Jing Jiang ; Feida Zhu ; Ee-Peng Lim
Abstract: Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. The former helps find eventdriven posts while the latter helps identify and filter out “personal” posts. Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than an LDA baseline and two degenerate variations of our model. We also show some case studies that demonstrate the importance of considering both the temporal information and users’ personal interests for bursty topic detection from microblogs.
3 0.57552421 85 acl-2012-Event Linking: Grounding Event Reference in a News Archive
Author: Joel Nothman ; Matthew Honnibal ; Ben Hachey ; James R. Curran
Abstract: Interpreting news requires identifying its constituent events. Events are complex linguistically and ontologically, so disambiguating their reference is challenging. We introduce event linking, which canonically labels an event reference with the article where it was first reported. This implicitly relaxes coreference to co-reporting, and will practically enable augmenting news archives with semantic hyperlinks. We annotate and analyse a corpus of 150 documents, extracting 501 links to a news archive with reasonable inter-annotator agreement.
4 0.38340685 99 acl-2012-Finding Salient Dates for Building Thematic Timelines
Author: Remy Kessler ; Xavier Tannier ; Caroline Hagege ; Veronique Moriceau ; Andre Bittar
Abstract: We present an approach for detecting salient (important) dates in texts in order to automatically build event timelines from a search query (e.g. the name of an event or person, etc.). This work was carried out on a corpus of newswire texts in English provided by the Agence France Presse (AFP). In order to extract salient dates that warrant inclusion in an event timeline, we first recognize and normalize temporal expressions in texts and then use a machine-learning approach to extract salient dates that relate to a particular topic. We focused only on extracting the dates and not the events to which they are related.
5 0.33409914 33 acl-2012-Automatic Event Extraction with Structured Preference Modeling
Author: Wei Lu ; Dan Roth
Abstract: This paper presents a novel sequence labeling model based on the latent-variable semiMarkov conditional random fields for jointly extracting argument roles of events from texts. The model takes in coarse mention and type information and predicts argument roles for a given event template. This paper addresses the event extraction problem in a primarily unsupervised setting, where no labeled training instances are available. Our key contribution is a novel learning framework called structured preference modeling (PM), that allows arbitrary preference to be assigned to certain structures during the learning procedure. We establish and discuss connections between this framework and other existing works. We show empirically that the structured preferences are crucial to the success of our task. Our model, trained without annotated data and with a small number of structured preferences, yields performance competitive to some baseline supervised approaches.
6 0.31311259 135 acl-2012-Learning to Temporally Order Medical Events in Clinical Text
7 0.27008793 90 acl-2012-Extracting Narrative Timelines as Temporal Dependency Structures
8 0.25688836 219 acl-2012-langid.py: An Off-the-shelf Language Identification Tool
9 0.25176749 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
10 0.25044864 91 acl-2012-Extracting and modeling durations for habits and events from Twitter
11 0.24805567 201 acl-2012-Towards the Unsupervised Acquisition of Discourse Relations
12 0.24508622 120 acl-2012-Information-theoretic Multi-view Domain Adaptation
13 0.23068826 191 acl-2012-Temporally Anchored Relation Extraction
14 0.21720004 117 acl-2012-Improving Word Representations via Global Context and Multiple Word Prototypes
15 0.19215947 77 acl-2012-Ecological Evaluation of Persuasive Messages Using Google AdWords
16 0.18914451 126 acl-2012-Labeling Documents with Timestamps: Learning from their Time Expressions
17 0.17752734 16 acl-2012-A Nonparametric Bayesian Approach to Acoustic Model Discovery
18 0.17522418 180 acl-2012-Social Event Radar: A Bilingual Context Mining and Sentiment Analysis Summarization System
19 0.1714921 60 acl-2012-Coupling Label Propagation and Constraints for Temporal Fact Extraction
20 0.16420698 112 acl-2012-Humor as Circuits in Semantic Networks
topicId topicWeight
[(25, 0.021), (26, 0.047), (28, 0.032), (30, 0.022), (37, 0.034), (39, 0.081), (74, 0.027), (76, 0.3), (82, 0.039), (84, 0.014), (85, 0.037), (86, 0.01), (90, 0.116), (92, 0.065), (94, 0.022), (99, 0.054)]
simIndex simValue paperId paperTitle
same-paper 1 0.75270295 17 acl-2012-A Novel Burst-based Text Representation Model for Scalable Event Detection
Author: Xin Zhao ; Rishan Chen ; Kai Fan ; Hongfei Yan ; Xiaoming Li
Abstract: Mining retrospective events from text streams has been an important research topic. Classic text representation models (i.e., the vector space model) cannot capture the temporal aspects of documents. To address this, we propose a novel burst-based text representation model, denoted BurstVSM. BurstVSM maps dimensions to bursty features instead of terms, which captures both semantic and temporal information. Meanwhile, it significantly reduces the number of non-zero entries in the representation. We test it via scalable event detection, and experiments on a 10-year news archive show that our methods are both effective and efficient.
2 0.5736149 213 acl-2012-Utilizing Dependency Language Models for Graph-based Dependency Parsing Models
Author: Wenliang Chen ; Min Zhang ; Haizhou Li
Abstract: Most previous graph-based parsing models increase decoding complexity when they use high-order features due to exact-inference decoding. In this paper, we present an approach to enriching high-order feature representations for graph-based dependency parsing models using a dependency language model and beam search. The dependency language model is built on a large amount of additional auto-parsed data that is processed by a baseline parser. Based on the dependency language model, we represent a set of features for the parsing model. Finally, the features are efficiently integrated into the parsing model during decoding using beam search. Our approach has two advantages. Firstly, we utilize rich high-order features defined over a view of large scope and an additional large raw corpus. Secondly, our approach does not increase the decoding complexity. We evaluate the proposed approach on English and Chinese data. The experimental results show that our new parser achieves the best accuracy on the Chinese data and comparable accuracy with the best known systems on the English data.
3 0.55924273 98 acl-2012-Finding Bursty Topics from Microblogs
Author: Qiming Diao ; Jing Jiang ; Feida Zhu ; Ee-Peng Lim
Abstract: Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. The former helps find eventdriven posts while the latter helps identify and filter out “personal” posts. Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than an LDA baseline and two degenerate variations of our model. We also show some case studies that demonstrate the importance of considering both the temporal information and users’ personal interests for bursty topic detection from microblogs.
4 0.50538522 187 acl-2012-Subgroup Detection in Ideological Discussions
Author: Amjad Abu-Jbara ; Pradeep Dasigi ; Mona Diab ; Dragomir Radev
Abstract: The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.
5 0.50121558 206 acl-2012-UWN: A Large Multilingual Lexical Knowledge Base
Author: Gerard de Melo ; Gerhard Weikum
Abstract: We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, information integration and taxonomy induction methods have been used to build UWN based on WordNet and extend it with millions of named entities from Wikipedia. We additionally introduce extensions to cover lexical relationships, frame-semantic knowledge, and language data. An online interface provides human access to the data, while a software API enables applications to look up over 16 million words and names.
6 0.50084406 21 acl-2012-A System for Real-time Twitter Sentiment Analysis of 2012 U.S. Presidential Election Cycle
7 0.50059932 191 acl-2012-Temporally Anchored Relation Extraction
9 0.49454433 132 acl-2012-Learning the Latent Semantics of a Concept from its Definition
10 0.49411112 174 acl-2012-Semantic Parsing with Bayesian Tree Transducers
11 0.49344739 156 acl-2012-Online Plagiarized Detection Through Exploiting Lexical, Syntax, and Semantic Information
12 0.4931033 28 acl-2012-Aspect Extraction through Semi-Supervised Modeling
13 0.49272105 167 acl-2012-QuickView: NLP-based Tweet Search
14 0.49222034 80 acl-2012-Efficient Tree-based Approximation for Entailment Graph Learning
15 0.4901146 214 acl-2012-Verb Classification using Distributional Similarity in Syntactic and Semantic Structures
16 0.48984057 84 acl-2012-Estimating Compact Yet Rich Tree Insertion Grammars
17 0.48847759 10 acl-2012-A Discriminative Hierarchical Model for Fast Coreference at Large Scale
18 0.48777473 130 acl-2012-Learning Syntactic Verb Frames using Graphical Models
19 0.48772854 31 acl-2012-Authorship Attribution with Author-aware Topic Models
20 0.48659542 38 acl-2012-Bayesian Symbol-Refined Tree Substitution Grammars for Syntactic Parsing