acl acl2013 acl2013-142 knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jiwei Li ; Sujian Li
Abstract: Timeline summarization aims at generating concise summaries and giving readers faster and better access to understanding the evolution of news. It is a new challenge which combines the salience ranking problem with novelty detection. Previous research in this field has seldom explored the evolutionary patterns of topics, such as birth, splitting, merging, developing and death. In this paper, we develop a novel model called the Evolutionary Hierarchical Dirichlet Process (EHDP) to capture the topic evolution pattern in timeline summarization. In EHDP, time-varying information is formulated as a series of HDPs by considering time-dependent information. Experiments on 6 different datasets which contain 3,156 documents demonstrate the good performance of our system with regard to ROUGE scores.
Reference: text
sentIndex sentText sentNum sentScore
1 Abstract Timeline summarization aims at generating concise summaries and giving readers faster and better access to understanding the evolution of news. [sent-2, score-0.365]
2 It is a new challenge which combines the salience ranking problem with novelty detection. [sent-3, score-0.071]
3 Previous research in this field has seldom explored the evolutionary patterns of topics, such as birth, splitting, merging, developing and death. [sent-4, score-0.337]
4 In this paper, we develop a novel model called the Evolutionary Hierarchical Dirichlet Process (EHDP) to capture the topic evolution pattern in timeline summarization. [sent-5, score-0.152]
5 In EHDP, time-varying information is formulated as a series of HDPs by considering time-dependent information. [sent-6, score-0.135]
6 1 Introduction Faced with thousands of news articles, people usually ask about general aspects such as the beginning, the evolutionary pattern and the end. [sent-8, score-0.279]
7 General search engines simply return the top-ranking articles according to query relevance and fail to trace how a specific event develops. [sent-9, score-0.155]
8 Timeline summarization, which aims at generating a series of concise summaries for news collections published at different epochs, can give readers faster and better access to understanding the evolution of news. [sent-10, score-0.419]
9 The key to timeline summarization is how to select sentences that tell readers the evolutionary pattern of topics in the event. [sent-11, score-0.908]
10 It is very common that the themes of a corpus evolve over time, and topics of adjacent epochs usually exhibit strong correlations. [sent-12, score-0.139]
11 Thus, it is important to model topics across different documents and over different time periods to detect how the events evolve. [sent-13, score-0.105]
12 The task of timeline summarization was first proposed by Allan et al. [sent-18, score-0.201]
13 However, these methods seldom explored the evolutionary characteristics of news. [sent-22, score-0.287]
14 Yan et al. (2011) extended the graph-based sentence ranking algorithm used in traditional multi-document summarization (MDS) to timeline generation by projecting sentences from different times onto one plane. [sent-24, score-0.681]
15 They further explored the timeline task as the optimization of a function combining different aspects such as relevance, coverage, coherence and diversity (Yan et al. [sent-25, score-0.493]
16 However, their approaches just treat timeline generation as a sentence ranking or optimization problem and seldom explore the topic information lying in the corpus. [sent-27, score-0.604]
17 Recently, topic models have been widely used for capturing the dynamics of topics over time. [sent-28, score-0.138]
18 Many dynamic approaches based on the LDA model (Blei et al. [sent-29, score-0.032]
19 , 2006) have been proposed to discover the evolving patterns in the corpus as well as the snapshot clusters at each time epoch (Blei and Lafferty, 2006; Chakrabarti et al. [sent-31, score-0.312]
20 In this paper, we propose EHDP: an evolutionary hierarchical Dirichlet process (HDP) model for timeline summarization. [sent-36, score-0.706]
21 In EHDP, each HDP is built for multiple corpora at each time epoch, and the time dependencies are incorporated into epochs under Markovian assumptions. [sent-37, score-0.199]
22 Sentences are selected into timelines by considering different aspects such as topic relevance, coverage and coherence. [sent-39, score-0.192]
23 2.1 Problem Formulation Given a general query Q = {w_qi}, i = 1..Q_n, we first obtain a set of query-related documents. [sent-45, score-0.123]
24 We denote the corpora as C = {C_t}, t = 1..T, according to their published time, where C_t = {D_i^t}, i = 1..N_t, denotes the document collection published at epoch t. [sent-46, score-0.432]
25 Document D_i^t is formulated as a collection of sentences {s_ij^t}, j = 1..N_i^t. [sent-47, score-0.028]
26 Each sentence is represented by a series of words s_ij^t = {w_ijl^t}, l = 1..N_ij^t, and associated with a topic θ_ij^t. [sent-48, score-0.201]
27 The output of the algorithm is a series of timeline summaries I = {I_t}, t = 1..T, where I_t ⊂ C_t. [sent-50, score-0.285]
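To make the notation concrete, here is a minimal sketch of the corresponding data structures; the class and field names are hypothetical and only mirror the formulation above, not the authors' implementation.

from dataclasses import dataclass
from typing import List

@dataclass
class Sentence:
    words: List[str]       # s_ij^t = {w_ijl^t}, l = 1..N_ij^t
    topic: int = -1        # aspect theta_ij^t, filled in during inference

@dataclass
class Document:
    sentences: List[Sentence]   # D_i^t

# C = [C_1, ..., C_T]: one document collection per epoch t;
# a timeline is I = [I_1, ..., I_T] with I_t a subset of the sentences of C_t.
Corpus = List[List[Document]]
Timeline = List[List[Sentence]]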
28 These HDPs share an identical base measure G_0, which serves as overall bookkeeping of the measures. [sent-53, score-0.027]
29 We use G_0^t to denote the base measure at each epoch and draw the local measure G_i^t for each document at time t from G_0^t. [sent-54, score-0.086]
30 In EHDP, each sentence is assigned to an aspect θ_ij^t with consideration of the words within the current sentence. [sent-55, score-0.047]
31 K denotes the normalization factor, where K = 1 + Σ_{δ=0}^{∆} F(v, δ). [sent-57, score-0.028]
32 ∆ is the time width and λ is the decay factor. [sent-58, score-0.055]
33 In the Chinese Restaurant Process (CRP), each document is compared to a restaurant and sentences are compared to customers. [sent-59, score-0.12]
34 Customers in the restaurant sit around different tables, and each table b_in^t is associated with a dish (topic) Ψ_in^t according to the dish menu. [sent-60, score-0.501]
35 Let m_tk denote the number of tables serving dish k in all restaurants at epoch t: m_tk = Σ_{i=1}^{N_t} Σ_{n=1}^{N_i^tb} 1(Ψ_in^t = k). [sent-61, score-0.716]
36 2. Draw each aspect θ_ij^t ∼ G_i^t for each sentence s_ij^t in D_i^t; for w ∈ s_ij^t, draw w ∼ f(w|θ_ij^t) (Figure 1: generation process for EHDP). We introduce another parameter M_tk to incorporate time dependency into EHDP. [sent-66, score-0.055]
37 M_tk = Σ_{δ=0}^{∆} F(v, δ) · m_{t−δ,k} (2). Let n_ib^t denote the number of sentences sitting around table b in document i at epoch t. [sent-67, score-0.315]
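As a minimal sketch, the time-decayed table counts M_tk of Eq. 2 can be computed as below. The exponential form of the decay kernel F(v, δ) is an assumption here, since the extraction does not preserve its definition; m is a hypothetical (T x K) count matrix.

import numpy as np

def decayed_table_counts(m: np.ndarray, t: int, width: int, v: float) -> np.ndarray:
    """M_tk = sum_{delta=0..Delta} F(v, delta) * m_{t-delta,k} (Eq. 2).

    m[t, k] holds the number of tables serving dish k at epoch t;
    F(v, delta) = exp(-delta / v) is an assumed decay kernel.
    """
    K = m.shape[1]
    M_t = np.zeros(K)
    for delta in range(min(width, t) + 1):
        M_t += np.exp(-delta / v) * m[t - delta]
    return M_t

With ∆ = 0 only the current epoch contributes, which is one way to see the degeneration into independent HDPs mentioned below.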
38 In the CRP for EHDP, when a new customer s_ij^t comes in, he can sit at an existing table with probability n_ib^t/(n_i^t − 1 + γ), sharing the dish (topic) Ψ_ib^t served at that table, or pick a new table with probability γ/(n_i^t − 1 + γ). [sent-68, score-0.316]
39 The customer has to select a dish from the global dish menu if he chooses a new table. [sent-69, score-0.289]
40 A dish that has already been shared on the global menu would be chosen with probability M_tk/(Σ_k M_tk + α) and a new dish with probability α/(Σ_k M_tk + α). [sent-70, score-0.419]
41 θ_ij^t | φ, γ ∼ Σ_b [n_ib^t/(n_i^t − 1 + γ)] δ_{φ_ib^t} + [γ/(n_i^t − 1 + γ)] δ_{φ^new}; φ^new | φ, α ∼ Σ_k [M_tk/(Σ_k M_tk + α)] δ_{φ_k} + [α/(Σ_k M_tk + α)] G_0 (3). We can see that EHDP degenerates into a series of independent HDPs when ∆ = 0 and into one global HDP when ∆ = T and v = ∞, as discussed in Ahmed and Xing (2008). [sent-73, score-0.052]
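A minimal sketch of this seating-and-dish step is shown below, following Eq. 3; the function signature and bookkeeping are hypothetical, and the base-measure draw for a brand-new dish is left abstract.

import numpy as np

def crp_step(n_tables, dish_of_table, M_t, gamma, alpha, rng):
    """One EHDP/CRP move for a new sentence (customer) in one document.

    n_tables[b]      : customers already seated at table b
    dish_of_table[b] : dish (topic index) served at table b
    M_t              : time-decayed dish counts from Eq. 2
    Returns (table index, dish index); dish index == len(M_t) means a new dish.
    """
    # existing table b with prob. n_ib / (n_i - 1 + gamma); new table with prob. gamma / (...)
    w = np.array(list(n_tables) + [gamma], dtype=float)
    b = int(rng.choice(len(w), p=w / w.sum()))
    if b < len(n_tables):
        n_tables[b] += 1
        return b, dish_of_table[b]
    # new table: existing dish k with prob. M_tk / (sum_k M_tk + alpha),
    # new dish with prob. alpha / (sum_k M_tk + alpha)
    d = np.append(np.asarray(M_t, dtype=float), alpha)
    k = int(rng.choice(len(d), p=d / d.sum()))
    n_tables.append(1)
    dish_of_table.append(k)   # a draw from G_0 would instantiate the new dish here
    return len(n_tables) - 1, k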
42 2.3 Sentence Selection Strategy The task of timeline summarization aims to produce a summary for each time epoch, and the generated summary should meet criteria such as relevance, coverage and coherence (Li et al. [sent-75, score-0.904]
43 To meet these three criteria, we propose a topic scoring algorithm based on Kullback-Leibler (KL) divergence. [sent-77, score-0.116]
44 We introduce the decreasing logistic function ζ(x) = 1/(1 + e^x) to map the distance into the interval (0,1). [sent-78, score-0.053]
45 Relevance: the summary should be related to the proposed query Q. [sent-80, score-0.109]
46 FR(It) = ζ(KL(It||Q)). Coverage: the summary should highly generalize the important topics mentioned in the document collection at epoch t. [sent-81, score-0.403]
47 FCv(It) = ζ(KL(It||Ct)). Coherence: news evolves over time, and a good component summary is coherent with the neighboring corpora so that the timeline tracks the gradual evolution trajectory of multiple correlated news stories. [sent-82, score-0.668]
48 FCh(It) = [Σ_{δ=−∆/2}^{∆/2} F(v, δ) · ζ(KL(It||C_{t−δ}))] / [Σ_{δ=−∆/2}^{∆/2} F(v, δ)]. Let Score(It) denote the score of the summary; it is calculated in Equ. [sent-83, score-0.065]
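A minimal sketch of this KL-plus-logistic scoring is given below. Since the extraction does not show the combining equation, the equal weighting of the three criteria and the exponential decay kernel are assumptions; summaries, queries and corpora are treated as topic histograms.

import numpy as np

def zeta(x: float) -> float:
    """Decreasing logistic map of a divergence into (0, 1)."""
    return 1.0 / (1.0 + np.exp(x))

def kl(p, q, eps: float = 1e-12) -> float:
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def score_summary(I_t, Q, C, t, width, v, w=(1.0, 1.0, 1.0)):
    """Score(I_t) combining relevance, coverage and coherence (weights assumed)."""
    F_R = zeta(kl(I_t, Q))        # relevance to the query
    F_Cv = zeta(kl(I_t, C[t]))    # coverage of the epoch-t corpus
    deltas = [d for d in range(-(width // 2), width // 2 + 1)
              if d != 0 and 0 <= t - d < len(C)]
    Fv = [np.exp(-abs(d) / v) for d in deltas]          # assumed decay kernel
    F_Ch = sum(f * zeta(kl(I_t, C[t - d]))
               for f, d in zip(Fv, deltas)) / sum(Fv)   # coherence with neighbors
    return w[0] * F_R + w[1] * F_Cv + w[2] * F_Ch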
49 To avoid aspect redundancy, the MMR strategy (Goldstein et al. [sent-87, score-0.047]
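A minimal sketch of greedy MMR-style selection follows; the interpolation weight lam and the similarity function are placeholders, not values from the paper.

def mmr_select(candidates, salience, sim, k, lam=0.7):
    """Greedily pick k sentences, trading salience against redundancy."""
    selected, pool = [], set(candidates)
    while pool and len(selected) < k:
        best = max(pool, key=lambda c: lam * salience[c]
                   - (1 - lam) * max((sim(c, s) for s in selected), default=0.0))
        selected.append(best)
        pool.remove(best)
    return selected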
50 3.1 Experiments set-up We downloaded 3,156 news articles from selected sources such as BBC, New York Times and CNN with various time spans, and built the evaluation set, which contains 6 real datasets. [sent-90, score-0.103]
51 The news belongs to different categories of Rule of Interpretation (ROI) (Kumaran and Allan, 2004). [sent-91, score-0.048]
52 The summary at each epoch is truncated to a uniform length of 50 words. [sent-95, score-0.284]
53 The reference timelines in the ROUGE evaluation are manually generated using Amazon Mechanical Turk1. [sent-98, score-0.426]
54 Workers were asked to generate a reference timeline for the news at each epoch in fewer than 50 words, and we collected 790 timelines in total. [sent-99, score-0.798]
55 We set ∆ to 1 · #epoch and next gradually change the value of λ from 0 to 1 with an interval of 0.1. [sent-110, score-0.08]
56 3.3 Comparison with other topic models In this subsection, we compare our model with 4 topic model baselines on the test data. [sent-122, score-0.206]
57 StandHDP(1): A topic approach that models different time epochs as a series of independent HDPs without considering time dependency. [sent-123, score-0.339]
58 [Table 3: Comparison with topic models. The ROUGE-1/ROUGE-2/ROUGE-W score table body is garbled in the extraction.] [sent-132, score-0.088]
59 StandHDP(2): A global HDP which models the whole time span as one restaurant. [sent-139, score-0.055]
60 In the LDA-based models, the aspect number is predefined as 80 2. [sent-141, score-0.047]
61 As we can see, EHDP achieves better results than the two standard HDP baselines where time information is not adequately considered. [sent-143, score-0.085]
62 As we know, how to determine topic number in the LDA-based models is still an open problem. [sent-146, score-0.088]
63 3.4 Comparison with other baselines We implement several baselines used in traditional summarization or timeline summarization for comparison. [sent-148, score-0.818]
64 (2) MEAD (Radev et al., 2004) scores sentences according to features including centroid value, position and first-sentence overlap. [sent-150, score-0.053]
65 (3) ETS is the timeline summarization approach developed by Yan et al. [sent-153, score-0.592]
66 (4) Chieu is the timeline system provided by Chieu and Lee (2004), utilizing interest and burstiness ranking but neglecting trans-temporal news evolution. [sent-155, score-0.508]
67 This is probably because methods in multi-document summarization only care about sentence selection and neglect the novelty detection task. (Footnote 2: In our experiments, the aspect number is set to 50, 80, 100 and 120 respectively, and we select the best-performing result, with the aspect number as 80.) [sent-157, score-0.325]
68 We can also see that EHDP under our proposed framework outperforms the existing timeline summarization approaches ETS and Chieu. [sent-158, score-0.592]
69 4 Conclusion In this paper we present an evolutionary HDP model for timeline summarization. [sent-163, score-0.657]
70 Our EHDP extends the original HDP by incorporating time dependencies and background information. [sent-164, score-0.055]
71 Experimental results on real multi-time news demonstrate the effectiveness of our topic model. [sent-166, score-0.136]
72 Dynamic non-parametric mixture models and the recurrent Chinese restaurant process. [sent-183, score-0.089]
73 Text classification and named entities for new event detection. [sent-208, score-0.027]
74 Enhancing diversity, coverage and balance for summarization through structure learning. [sent-211, score-0.203]
wordName wordTfidf (topN-words)
[('ehdp', 0.487), ('timeline', 0.426), ('epoch', 0.257), ('hdp', 0.242), ('evolutionary', 0.231), ('dish', 0.188), ('summarization', 0.166), ('yan', 0.126), ('mtk', 0.122), ('rouge', 0.107), ('chieu', 0.099), ('hdps', 0.091), ('itj', 0.091), ('nit', 0.089), ('epochs', 0.089), ('restaurant', 0.089), ('topic', 0.088), ('dirichlet', 0.084), ('summaries', 0.067), ('timelines', 0.067), ('summary', 0.065), ('evolution', 0.064), ('wan', 0.064), ('crp', 0.064), ('congrui', 0.061), ('mkt', 0.061), ('sitj', 0.061), ('blei', 0.056), ('seldom', 0.056), ('xiaojun', 0.056), ('time', 0.055), ('ets', 0.054), ('chakrabarti', 0.054), ('centroid', 0.053), ('interval', 0.053), ('ct', 0.052), ('series', 0.052), ('allan', 0.051), ('relevance', 0.05), ('topics', 0.05), ('xiaoming', 0.05), ('dti', 0.05), ('hierarchical', 0.049), ('news', 0.048), ('aspect', 0.047), ('kl', 0.046), ('manifold', 0.044), ('query', 0.044), ('dp', 0.043), ('menu', 0.043), ('rui', 0.041), ('itn', 0.041), ('coherence', 0.04), ('novelty', 0.037), ('kumaran', 0.037), ('coverage', 0.037), ('sit', 0.036), ('firstly', 0.035), ('readers', 0.035), ('acm', 0.035), ('ranking', 0.034), ('ren', 0.033), ('lda', 0.033), ('concise', 0.033), ('sigkdd', 0.032), ('dynamic', 0.032), ('neighboring', 0.031), ('customer', 0.031), ('ahmed', 0.031), ('published', 0.031), ('document', 0.031), ('baselines', 0.03), ('radev', 0.03), ('li', 0.029), ('sigir', 0.029), ('tt', 0.029), ('care', 0.028), ('denotes', 0.028), ('formulated', 0.028), ('notate', 0.027), ('mead', 0.027), ('ttehse', 0.027), ('hurricane', 0.027), ('vva', 0.027), ('git', 0.027), ('ttaot', 0.027), ('pim', 0.027), ('bookkeeping', 0.027), ('correlative', 0.027), ('eifr', 0.027), ('enjoying', 0.027), ('fch', 0.027), ('francois', 0.027), ('horizon', 0.027), ('jianwen', 0.027), ('sitting', 0.027), ('truncated', 0.027), ('kong', 0.027), ('diversity', 0.027), ('event', 0.027)]
simIndex simValue paperId paperTitle
same-paper 1 0.9999997 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization
2 0.12800027 319 acl-2013-Sequential Summarization: A New Application for Timely Updated Twitter Trending Topics
Author: Dehong Gao ; Wenjie Li ; Renxian Zhang
Abstract: The growth of Web 2.0 technologies has led to an explosion of social networking media sites. Among them, Twitter is the most popular service by far due to its ease of real-time information sharing. It collects millions of tweets per day and monitors what people are talking about in trending topics that are updated in a timely fashion. The question, then, is how users can understand a topic in a short time when they are frustrated by the overwhelming and unorganized tweets. In this paper, this problem is approached by sequential summarization, which aims to produce a sequential summary, i.e., a series of chronologically ordered short sub-summaries that collectively provide a full story about topic development. Both the number and the content of the sub-summaries are automatically identified by the proposed stream-based and semantic-based approaches. These approaches are evaluated in terms of sequence coverage, sequence novelty and sequence correlation, and the effectiveness of their combination is demonstrated. 1 Introduction and Background Twitter, as a popular micro-blogging service, collects millions of real-time short text messages (known as tweets) every second. It acts not only as a public platform for posting trifles about users’ daily lives, but also as a public reporter of real-time news. Twitter has shown its powerful ability in information delivery in many events, like the wildfires in San Diego and the earthquake in Japan. Nevertheless, the side effect is that individual users usually sink deep under millions of flooding-in tweets. To alleviate this problem, applications like whatthetrend 1 have evolved from Twitter to provide services that encourage users to edit explanatory tweets about a trending topic, which can be regarded as topic summaries. (1: whatthetrend.com) This is to some extent a good way to help users understand trending topics. There is also pioneering research in automatic Twitter trending topic summarization. O'Connor et al. (2010) explained Twitter trending topics by providing a list of significant terms. Users could utilize these terms to drill down to the tweets related to the trending topics. Sharifi et al. (2010) attempted to provide a one-line summary for each trending topic using phrase reinforcement ranking. The relevance model employed by Harabagiu and Hickl (2011) generated larger summaries, i.e., 250-word summaries, by synthesizing multiple high-ranked tweets. Duan et al. (2012) incorporate user influence and content quality information in timeline tweet summarization and employ a reinforcement graph to generate summaries for trending topics. Twitter summarization is an emerging research area. Current approaches still follow the traditional summarization route and mainly focus on mining tweets of both significance and representativeness. Though the summaries generated in such a way can sketch the most important aspects of the topic, they are incapable of providing full descriptions of the changes in the focus of a topic, and of the temporal information or freshness of the tweets, especially for newsworthy trending topics like earthquakes and sports meetings. As the main information producers on Twitter, the massive crowd keeps close pace with the development of trending topics and provides timely updated information. The information dynamics and timeliness are an important consideration for Twitter summarization.
That is why we propose sequential summarization in this work, which aims to produce sequential summaries to capture the temporal changes of mass focus. Our work resembles update summarization promoted by TAC 2, which required creating summaries with new information assuming the reader has already read some previous documents under the same topic. (2: www.nist.gov/tac) Given two chronologically ordered document sets about a topic, the systems were asked to generate two summaries, and the second one should inform the user of new information only. In order to achieve this goal, existing approaches mainly emphasized the novelty of the subsequent summary (Li and Croft, 2006; Varma et al., 2009; Steinberger and Jezek, 2009). Different from update summarization, we focus more on the temporal change of trending topics. In particular, we need to automatically detect the “update points” among a myriad of related tweets. It is the goal of this paper to set up a new practical summarization application tailored for timely updated Twitter messages. With the aim of providing a full description of the focus changes and the records of the timeline of a trending topic, the systems are expected to discover the chronologically ordered sets of information by themselves, and they are free to generate any number of update summaries according to the actual situation instead of a fixed number of summaries as specified in DUC/TAC. Our main contributions include novel approaches to sequential summarization and corresponding evaluation criteria for this new application. All of them will be detailed in the following sections.

2 Sequential Summarization
Sequential summarization proposed here aims to generate a series of chronologically ordered sub-summaries for a given Twitter trending topic. Each sub-summary is supposed to represent one main subtopic or one main aspect of the topic, while a sequential summary, made up of the sub-summaries, should retain the order in which the information is delivered to the public. In such a way, the sequential summary is able to provide a general picture of the entire topic development.

2.1 Subtopic Segmentation
One of the keys to sequential summarization is subtopic segmentation. How many subtopics have attracted the public attention, what are they, and how do they develop? It is important to provide the valuable and organized materials for more fine-grained summarization approaches. We propose the following two approaches to automatically detect and chronologically order the subtopics.

2.1.1 Stream-based Subtopic Detection and Ordering
Typically when a subtopic is popular enough, it will create a certain level of surge in the tweet stream. In other words, every surge in the tweet stream can be regarded as an indicator of the appearance of a subtopic that is worthy of being summarized. Our early investigation provides evidence to support this assumption. By examining the correlations between tweet content changes and volume changes in randomly selected topics, we have observed that the changes in tweet volume can really provide the clues of topic development or changes of crowd focus. The stream-based subtopic detection approach employs the offline peak area detection (Opad) algorithm (Shamma et al., 2010) to locate such surges by tracing tweet volume changes. It regards the collection of tweets at each such surge time range as a new subtopic.

Offline Peak Area Detection (Opad) Algorithm
1: Input: TS (tweet stream, each tw_i with timestamp t_i); peak interval window ∆t (in hours), and time step h (h ≪ ∆t);
2: Output: Peak Areas PA.
3: Initial: two time slots: t' = t = t_0 + ∆t; tweet numbers: n' = n = Count(t)
4: while (t_c = t + h) < t_end
5:   update t' = t_c + ∆t and n' = Count(t')
6:   if (n' < n and up-hilling)
7:     output one peak area pa
8:     state of down-hilling
9:   else
10:    update t = t' and n = n'
11:    state of up-hilling
12:
13: function Count(t)
14:   count tweets in time interval t

The subtopics detected by the Opad algorithm are naturally ordered in the timeline.

2.1.2 Semantic-based Subtopic Detection and Ordering
Basically the stream-based approach monitors the changes of the level of user attention. It is easy to implement and intuitively works, but it fails to handle the cases where the posts about the same subtopic are received at different time ranges due to the difference of geographical and time zones. This may make some subtopics scattered into several time slots (peak areas) or one peak area mixed with more than one subtopic. In order to sequentially segment the subtopics from the semantic aspect, the semantic-based subtopic detection approach breaks the time order of the tweet stream, and regards each tweet as an individual short document. It takes advantage of Dynamic Topic Modeling (David and Michael, 2006) to explore the tweet content. DTM in nature is a clustering approach which can dynamically generate the subtopics underlying the topic. Any clustering approach requires a pre-specified cluster number. To avoid tuning the cluster number experimentally, the subtopic number required by the semantic-based approach is either calculated according to heuristics or determined by the number of the peak areas detected from the stream-based approach in this work. Unlike the stream-based approach, the subtopics formed by DTM are sets of distributions of subtopic and word probabilities. They are time independent. Thus, the temporal order among these subtopics is not obvious and needs to be discovered. We use the probabilistic relationships between tweets and topics learned from DTM to assign each tweet to the subtopic that it most likely belongs to. Then the subtopics are ordered temporally according to the mean values of their tweets’ timestamps.

2.2 Sequential Summary Generation
Once the subtopics are detected and ordered, the tweets belonging to each subtopic are ranked and the most significant one is extracted to generate the sub-summary regarding that subtopic. Two different ranking strategies are adopted to conform to the two different subtopic detection mechanisms. For a tweet in a peak area, the linear combination of two measures is considered to evaluate its significance as a sub-summary: (1) subtopic representativeness, measured by the cosine similarity between the tweet and the centroid of all the tweets in the same peak area; (2) crowding endorsement, measured by the times that the tweet is re-tweeted, normalized by the total number of re-tweetings. With the DTM model, the significance of the tweets is evaluated directly by the word distribution per subtopic. MMR (Carbonell and Goldstein, 1998) is used to reduce redundancy in sub-summary generation.

3 Experiments and Evaluations
The experiments are conducted on the 24 Twitter trending topics collected using Twitter APIs 3. (3: https://dev.twitter.com/) The statistics are shown in Table 1. Due to the shortage of gold-standard sequential summaries, we invite two annotators to read the chronologically ordered tweets and write a series of sub-summaries for each topic independently. Each sub-summary is up to 140 characters in length to comply with the length limit of a tweet, but the annotators are free to choose the number of sub-summaries. This ends up with 6.3 and 4.8 sub-summaries on average in a sequential summary written by the two annotators respectively. These two sets of sequential summaries are regarded as reference summaries to evaluate system-generated summaries from the following three aspects.

Sequence Coverage. Sequence coverage measures the N-gram match between system-generated summaries and human-written summaries (stopwords removed first). Considering that temporal information is an important factor in sequential summaries, we propose the position-aware coverage measure by accommodating the position information in matching. Let S = {s_1, s_2, …, s_k} denote a sequential summary and s_i the i-th sub-summary. N-gram coverage is defined as: Coverage = (1/|S^h|) Σ_{s_i ∈ S^h} Σ_{s_j ∈ S^s} η^{d_ij} · (Σ_{gram ∈ s_i ∩ s_j} Count_match(gram)) / (Σ_{gram ∈ s_i} Count(gram)), where d_ij = |i − j| + 1, i and j denote the serial numbers of the sub-summaries in the system-generated summary S^s and the human-written summary S^h respectively, and η serves as a coefficient to discount long-distance matched sub-summaries. We evaluate unigram, bigram, and skipped bigram matches. Like in ROUGE (Lin, 2004), the skip distance is up to four words.

Sequence Novelty. Sequence novelty evaluates the average novelty of two successive sub-summaries. Information content (IC) has been used to measure the novelty of update summaries by (Aggarwal et al., 2009). In this paper, the novelty of a system-generated sequential summary is defined as the average of the IC increments of two adjacent sub-summaries: Novelty = (1/(|S| − 1)) Σ_{i>1} (IC_{s_i} − IC_{s_i, s_{i−1}}), where |S| is the number of sub-summaries in the sequential summary, IC_{s_i} = Σ_{w ∈ s_i} IC_w, and IC_{s_i, s_{i−1}} = Σ_{w ∈ s_i ∩ s_{i−1}} IC_w is the overlapped information in the two adjacent sub-summaries. IC_w = tf_w · itf_w · Relevance(w, T), where w is a word, itf_w is the inverse tweet frequency of w, and T is all the tweets in the trending topic. The relevance function is introduced to ensure that the information brought by new sub-summaries is not only novel but also related to the topic.

Sequence Correlation. Sequence correlation evaluates the sequential matching degree between system-generated and human-written summaries. In statistics, Kendall’s tau coefficient is often used to measure the association between two sequences (Lapata, 2006). The basic idea is to count the concordant and discordant pairs which contain the same elements in two sequences. Borrowing this idea, for each sub-summary in a human-generated summary, we find its most matched sub-summary (judged by the cosine similarity measure) in the corresponding system-generated summary and then define the correlation according to the concordance between the two matched sub-summary sequences: Correlation = 2(|#concordant pairs| − |#discordant pairs|) / (n(n − 1)), where n is the number of human-written sub-summaries. Tables 2 and 3 below present the evaluation results. For the stream-based approach, we set ∆t = 3 hours experimentally.
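As an illustration of the sequence correlation just defined (a sketch, not the authors' code), the concordance count over matched sub-summary positions can be computed as follows:

def sequence_correlation(match_idx):
    """2(#concordant - #discordant) / (n(n - 1)) over matched positions.

    match_idx[i] = index of the system sub-summary best matched
    (by cosine similarity) to the i-th human sub-summary; n >= 2 assumed.
    """
    n = len(match_idx)
    conc = disc = 0
    for i in range(n):
        for j in range(i + 1, n):
            if match_idx[i] < match_idx[j]:
                conc += 1
            elif match_idx[i] > match_idx[j]:
                disc += 1
    return 2.0 * (conc - disc) / (n * (n - 1))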
For the semantic-based approach, we compare three different approaches to defining the subtopic number K: (1) Semantic-based 1: Following the approach proposed in (Li et al., 2007), we first derive the matrix of tweet cosine similarity. Given the 1-norms of the eigenvalues λ_i (i = 1, 2, …, n) of the similarity matrix and the ratios r_i = ||λ_i||_1 / λ_2, the subtopic number K = i + 1 if r_i − r_{i+1} > ε (ε = 0.4). (2) Semantic-based 2: Using the rule of thumb in (Wan and Yang, 2008), K = √n, where n is the tweet number. (3) Combined: K is defined as the number of the peak areas detected by the Opad algorithm; meanwhile we use the tweets within the peak areas as the input tweets of DTM. This is our new idea. The experiments confirm the superiority of the semantic-based approach over the stream-based approach in the summary content coverage and novelty evaluations, showing that the former is better at subtopic content modeling. The sub-summaries generated by the stream-based approach have comparable sequence (i.e., order) correlation with the human summaries. Combining the advantages of the two approaches leads to the best overall results.

[Table 2: N-Gram Coverage Evaluation. Unigram, bigram and skipped-bigram coverage for the stream-based (∆t = 3), semantic-based and combined settings; the numeric scores are unrecoverable from the extraction. Table 3: Novelty and Correlation Evaluation for the same settings.]

4 Concluding Remarks
We start a new application for Twitter trending topics, i.e., sequential summarization, to reveal the developing scenario of the trending topics while retaining the order of information presentation. We develop several solutions to automatically detect, segment and order subtopics temporally, and extract the most significant tweets into the sub-summaries to compose sequential summaries. Empirically, the combination of the stream-based approach and the semantic-based approach leads to sequential summaries with high coverage, low redundancy, and good order.

Acknowledgments
The work described in this paper is supported by a Hong Kong RGC project (PolyU No. 5202/12E) and a National Nature Science Foundation of China (NSFC No. 61272291).

References
Aggarwal Gaurav, Sumbaly Roshan and Sinha Shakti. 2009. Update Summarization. Stanford: CS224N Final Projects.
Blei M. David and Jordan I. Michael. 2006. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, 113-120. Pittsburgh, Pennsylvania.
Carbonell Jaime and Goldstein Jade. 1998. The use of MMR, diversity based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International Conference on Research and Development in Information Retrieval, 335-336. Melbourne, Australia.
Duan Yajuan, Chen Zhimin, Wei Furu, Zhou Ming and Heung-Yeung Shum. 2012. Twitter Topic Summarization by Ranking Tweets using Social Influence and Content Quality. In Proceedings of the 24th International Conference on Computational Linguistics, 763-780. Mumbai, India.
Harabagiu Sanda and Hickl Andrew. 2011. Relevance Modeling for Microblog Summarization. In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. Barcelona, Spain.
Lapata Mirella. 2006. Automatic evaluation of information ordering: Kendall’s tau. Computational Linguistics, 32(4): 1-14.
Li Wenyuan, Ng Wee-Keong, Liu Ying and Ong Kok-Leong. 2007. Enhancing the Effectiveness of Clustering with Spectra Analysis.
IEEE Transactions on Knowledge and Data Engineering, 19(7):887-902.
Li Xiaoyan and Croft W. Bruce. 2006. Improving novelty detection for general topics using sentence level information patterns. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management, 238-247. New York, USA.
Lin Chin-Yew. 2004. ROUGE: a Package for Automatic Evaluation of Summaries. In Proceedings of the ACL Workshop on Text Summarization Branches Out, 74-81. Barcelona, Spain.
Liu Fei, Liu Yang and Weng Fuliang. 2011. Why is “SXSW” trending? Exploring Multiple Text Sources for Twitter Topic Summarization. In Proceedings of the ACL Workshop on Language in Social Media, 66-75. Portland, Oregon.
O'Connor Brendan, Krieger Michel and Ahn David. 2010. TweetMotif: Exploratory Search and Topic Summarization for Twitter. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 384-385. Atlanta, Georgia.
Shamma A. David, Kennedy Lyndon and Churchill F. Elizabeth. 2010. Tweetgeist: Can the Twitter Timeline Reveal the Structure of Broadcast Events? In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, 589-593. Savannah, Georgia, USA.
Sharifi Beaux, Hutton Mark-Anthony and Kalita Jugal. 2010. Summarizing Microblogs Automatically. In Human Language Technologies: the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 685-688. Los Angeles, California.
Steinberger Josef and Jezek Karel. 2009. Update summarization based on novel topic distribution. In Proceedings of the 9th ACM Symposium on Document Engineering, 205-213. Munich, Germany.
Varma Vasudeva, Bharat Vijay, Kovelamudi Sudheer, Praveen Bysani, Kumar K. N, Kranthi Reddy, Karuna Kumar and Nitin Maganti. 2009. IIIT Hyderabad at TAC 2009. In Proceedings of the 2009 Text Analysis Conference. Gaithersburg, Maryland.
Wan Xiaojun and Yang Jianjun. 2008. Multi-document summarization using cluster-based link analysis. In Proceedings of the 31st Annual International Conference on Research and Development in Information Retrieval, 299-306. Singapore, Singapore.
3 0.12055562 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art
Author: Peter A. Rankel ; John M. Conroy ; Hoa Trang Dang ; Ani Nenkova
Abstract: How good are automatic content metrics for news summary evaluation? Here we provide a detailed answer to this question, with a particular focus on assessing the ability of automatic evaluations to identify statistically significant differences present in manual evaluation of content. Using four years of data from the Text Analysis Conference, we analyze the performance of eight ROUGE variants in terms of accuracy, precision and recall in finding significantly different systems. Our experiments show that some of the neglected variants of ROUGE, based on higher order n-grams and syntactic dependencies, are most accurate across the years; the commonly used ROUGE-1 scores find too many significant differences between systems which manual evaluation would deem comparable. We also test combinations of ROUGE variants and find that they considerably improve the accuracy of automatic prediction.
4 0.1158385 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
Author: Romain Deveaud ; Eric SanJuan ; Patrice Bellot
Abstract: The current topic modeling approaches for Information Retrieval do not allow one to explicitly model query-oriented latent topics. Moreover, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those topics. We perform a first experimental evaluation using two major TREC test collections. Results show that retrieval performances tend to be better when using topics with higher semantic coherence.
5 0.089544214 333 acl-2013-Summarization Through Submodularity and Dispersion
Author: Anirban Dasgupta ; Ravi Kumar ; Sujith Ravi
Abstract: We propose a new optimization framework for summarization by generalizing the submodular framework of (Lin and Bilmes, 2011). In our framework the summarization desideratum is expressed as a sum of a submodular function and a nonsubmodular function, which we call dispersion; the latter uses inter-sentence dissimilarities in different ways in order to ensure non-redundancy of the summary. We consider three natural dispersion functions and show that a greedy algorithm can obtain an approximately optimal summary in all three cases. We conduct experiments on two corpora—DUC 2004 and user comments on news articles—and show that the performance of our algorithm outperforms those that rely only on submodularity.
6 0.088078119 74 acl-2013-Building Comparable Corpora Based on Bilingual LDA Model
7 0.081021555 147 acl-2013-Exploiting Topic based Twitter Sentiment for Stock Prediction
8 0.080542132 351 acl-2013-Topic Modeling Based Classification of Clinical Reports
9 0.079641044 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords
10 0.078528695 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
11 0.076464474 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning
12 0.075852007 353 acl-2013-Towards Robust Abstractive Multi-Document Summarization: A Caseframe Analysis of Centrality and Domain
13 0.07141576 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization
14 0.070450582 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization
15 0.067825757 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
16 0.067017421 121 acl-2013-Discovering User Interactions in Ideological Discussions
17 0.059147727 197 acl-2013-Incremental Topic-Based Translation Model Adaptation for Conversational Spoken Language Translation
18 0.05900532 225 acl-2013-Learning to Order Natural Language Texts
19 0.058989037 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
20 0.05784313 73 acl-2013-Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions
topicId topicWeight
[(0, 0.132), (1, 0.059), (2, 0.006), (3, -0.032), (4, 0.088), (5, 0.016), (6, 0.118), (7, -0.018), (8, -0.169), (9, -0.082), (10, 0.007), (11, 0.071), (12, -0.025), (13, 0.034), (14, -0.037), (15, 0.041), (16, 0.045), (17, -0.063), (18, -0.013), (19, 0.01), (20, -0.023), (21, -0.072), (22, -0.039), (23, -0.006), (24, 0.01), (25, -0.055), (26, -0.032), (27, -0.029), (28, 0.003), (29, 0.029), (30, 0.037), (31, 0.03), (32, 0.031), (33, -0.07), (34, -0.015), (35, -0.025), (36, -0.026), (37, -0.002), (38, 0.019), (39, -0.022), (40, -0.018), (41, -0.029), (42, 0.01), (43, 0.016), (44, -0.03), (45, -0.019), (46, -0.03), (47, -0.003), (48, 0.071), (49, 0.048)]
simIndex simValue paperId paperTitle
same-paper 1 0.93279344 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization
Author: Jackie Chi Kit Cheung ; Gerald Penn
Abstract: In automatic summarization, centrality is the notion that a summary should contain the core parts of the source text. Current systems use centrality, along with redundancy avoidance and some sentence compression, to produce mostly extractive summaries. In this paper, we investigate how summarization can advance past this paradigm towards robust abstraction by making greater use of the domain of the source text. We conduct a series of studies comparing human-written model summaries to system summaries at the semantic level of caseframes. We show that model summaries (1) are more abstractive and make use of more sentence aggregation, (2) do not contain as many topical caseframes as system summaries, and (3) cannot be reconstructed solely from the source text, but can be if texts from in-domain documents are added. These results suggest that substantial improvements are unlikely to result from better optimizing centrality-based criteria, but rather more domain knowledge is needed.
3 0.72116995 333 acl-2013-Summarization Through Submodularity and Dispersion
4 0.71420997 319 acl-2013-Sequential Summarization: A New Application for Timely Updated Twitter Trending Topics
Author: Dehong Gao ; Wenjie Li ; Renxian Zhang
Abstract: The growth of the Web 2.0 technologies has led to an explosion of social networking media sites. Among them, Twitter is the most popular service by far due to its ease for realtime sharing of information. It collects millions of tweets per day and monitors what people are talking about in the trending topics updated timely. Then the question is how users can understand a topic in a short time when they are frustrated with the overwhelming and unorganized tweets. In this paper, this problem is approached by sequential summarization which aims to produce a sequential summary, i.e., a series of chronologically ordered short subsummaries that collectively provide a full story about topic development. Both the number and the content of sub-summaries are automatically identified by the proposed stream-based and semantic-based approaches. These approaches are evaluated in terms of sequence coverage, sequence novelty and sequence correlation and the effectiveness of their combination is demonstrated. 1 Introduction and Background Twitter, as a popular micro-blogging service, collects millions of real-time short text messages (known as tweets) every second. It acts as not only a public platform for posting trifles about users’ daily lives, but also a public reporter for real-time news. Twitter has shown its powerful ability in information delivery in many events, like the wildfires in San Diego and the earthquake in Japan. Nevertheless, the side effect is individual users usually sink deep under millions of flooding-in tweets. To alleviate this problem, the applications like whatthetrend 1 have evolved from Twitter to provide services that encourage users to edit explanatory tweets about a trending topic, which can be regarded as topic summaries. It is to some extent a good way to help users understand trending topics. 1 whatthetrend.com There is also pioneering research in automatic Twitter trending topic summarization. (O'Connor et al., 2010) explained Twitter trending topics by providing a list of significant terms. Users could utilize these terms to drill down to the tweets which are related to the trending topics. (Sharifi et al., 2010) attempted to provide a one-line summary for each trending topic using phrase reinforcement ranking. The relevance model employed by (Harabagiu and Hickl, 2011) generated summaries in larger size, i.e., 250word summaries, by synthesizing multiple high rank tweets. (Duan et al., 2012) incorporate the user influence and content quality information in timeline tweet summarization and employ reinforcement graph to generate summaries for trending topics. Twitter summarization is an emerging research area. Current approaches still followed the traditional summarization route and mainly focused on mining tweets of both significance and representativeness. Though, the summaries generated in such a way can sketch the most important aspects of the topic, they are incapable of providing full descriptions of the changes of the focus of a topic, and the temporal information or freshness of the tweets, especially for those newsworthy trending topics, like earthquake and sports meeting. As the main information producer in Twitter, the massive crowd keeps close pace with the development of trending topics and provide the timely updated information. The information dynamics and timeliness is an important consideration for Twitter summarization. 
That is why we propose sequential summarization in this work, which aims to produce sequential summaries to capture the temporal changes of mass focus. Our work resembles update summarization promoted by TAC 2 which required creating summaries with new information assuming the reader has already read some previous documents under the same topic. Given two chronologically ordered documents sets about a topic, the systems were asked to generate two 2 www.nist.gov/tac 567 summaries, and the second one should inform the user of new information only. In order to achieve this goal, existing approaches mainly emphasized the novelty of the subsequent summary (Li and Croft, 2006; Varma et al., 2009; Steinberger and Jezek, 2009). Different from update summarization, we focus more on the temporal change of trending topics. In particular, we need to automatically detect the “update points” among a myriad of related tweets. It is the goal of this paper to set up a new practical summarization application tailored for timely updated Twitter messages. With the aim of providing a full description of the focus changes and the records of the timeline of a trending topic, the systems are expected to discover the chronologically ordered sets of information by themselves and they are free to generate any number of update summaries according to the actual situations instead of a fixed number of summaries as specified in DUC/TAC. Our main contributions include novel approaches to sequential summarization and corresponding evaluation criteria for this new application. All of them will be detailed in the following sections. 2 Sequential Summarization Sequential summarization proposed here aims to generate a series of chronologically ordered subsummaries for a given Twitter trending topic. Each sub-summary is supposed to represent one main subtopic or one main aspect of the topic, while a sequential summary, made up by the subsummaries, should retain the order the information is delivered to the public. In such a way, the sequential summary is able to provide a general picture of the entire topic development. 2.1 Subtopic Segmentation One of the keys to sequential summarization is subtopic segmentation. How many subtopics have attracted the public attention, what are they, and how are they developed? It is important to provide the valuable and organized materials for more fine-grained summarization approaches. We proposed the following two approaches to automatically detect and chronologically order the subtopics. 2.1.1 Stream-based Subtopic Detection and Ordering Typically when a subtopic is popular enough, it will create a certain level of surge in the tweet stream. In other words, every surge in the tweet stream can be regarded as an indicator of the appearance of a subtopic that is worthy of being summarized. Our early investigation provides evidence to support this assumption. By examining the correlations between tweet content changes and volume changes in randomly selected topics, we have observed that the changes in tweet volume can really provide the clues of topic development or changes of crowd focus. The stream-based subtopic detection approach employs the offline peak area detection (Opad) algorithm (Shamma et al., 2010) to locate such surges by tracing tweet volume changes. It regards the collection of tweets at each such surge time range as a new subtopic. Offline Peak Area Detection (Opad) Algorithm 1: Input: TS (tweets stream, each twi with timestamp ti); peak interval window ∆? 
(in hour), and time stepℎ (ℎ ≪ ∆?); 2: Output: Peak Areas PA. 3: Initial: two time slots: ?′ = ? = ?0 + ∆?; Tweet numbers: ?′ = ? = ?????(?) 4: while (?? = ? + ℎ) < ??−1 5: update ?′ = ?? + ∆? and ?′ = ?????(?′) 6: if (?′ < ? And up-hilling) 7: output one peak area ??? 8: state of down-hilling 9: else 10: update ? = ?′ and ? = ?′ 11: state of up-hilling 12: 13: function ?????(?) 14: Count tweets in time interval T The subtopics detected by the Opad algorithm are naturally ordered in the timeline. 2.1.2 Semantic-based Subtopic Detection and Ordering Basically the stream-based approach monitors the changes of the level of user attention. It is easy to implement and intuitively works, but it fails to handle the cases where the posts about the same subtopic are received at different time ranges due to the difference of geographical and time zones. This may make some subtopics scattered into several time slots (peak areas) or one peak area mixed with more than one subtopic. In order to sequentially segment the subtopics from the semantic aspect, the semantic-based subtopic detection approach breaks the time order of tweet stream, and regards each tweet as an individual short document. It takes advantage of Dynamic Topic Modeling (David and Michael, 2006) to explore the tweet content. 568 DTM in nature is a clustering approach which can dynamically generate the subtopic underlying the topic. Any clustering approach requires a pre-specified cluster number. To avoid tuning the cluster number experimentally, the subtopic number required by the semantic-based approach is either calculated according to heuristics or determined by the number of the peak areas detected from the stream-based approach in this work. Unlike the stream-based approach, the subtopics formed by DTM are the sets of distributions of subtopic and word probabilities. They are time independent. Thus, the temporal order among these subtopics is not obvious and needs to be discovered. We use the probabilistic relationships between tweets and topics learned from DTM to assign each tweet to a subtopic that it most likely belongs to. Then the subtopics are ordered temporally according to the mean values of their tweets’ timestamps. 2.2 Sequential Summary Generation Once the subtopics are detected and ordered, the tweets belonging to each subtopic are ranked and the most significant one is extracted to generate the sub-summary regarding that subtopic. Two different ranking strategies are adopted to conform to two different subtopic detection mechanisms. For a tweet in a peak area, the linear combination of two measures is considered to independently. Each sub-summary is up to 140 characters in length to comply with the limit of tweet, but the annotators are free to choose the number of sub-summaries. It ends up with 6.3 and 4.8 sub-summaries on average in a sequential summary written by the two annotators respectively. These two sets of sequential summaries are regarded as reference summaries to evaluate system-generated summaries from the following three aspects. Sequence Coverage Sequence coverage measures the N-gram match between system-generated summaries and human-written summaries (stopword removed first). 
Considering temporal information is an important factor in sequential summaries, we evaluate its significance to be a sub-summary: (1) subtopic representativeness measured by the cosine similarity between the tweet and the centroid of all the tweets in the same peak area; (2) crowding endorsement measured by the times that the tweet is re-tweeted normalized by the total number of re-tweeting. With the DTM model, the significance of the tweets is evaluated directly by word distribution per subtopic. MMR (Carbonell and Goldstein, 1998) is used to reduce redundancy in sub-summary generation. 3 Experiments and Evaluations The experiments are conducted on the 24 Twitter trending topics collected using Twitter APIs 3 . The statistics are shown in Table 1. Due to the shortage of gold-standard sequential summaries, we invite two annotators to read the chronologically ordered tweets, and write a series of sub-summaries for each topic 3https://dev.twitter.com/ propose the position-aware coverage measure by accommodating the position information in matching. Let S={s1, s2, sk} denote a … … …, sequential summary and si the ith sub-summary, N-gram coverage is defined as: ???????? =|? 1?|?∑?∈? ?∑? ? ?∈?∙ℎ ?∑ ? ?∈?-?ℎ? ?∑? ∈-? ?,? ? ? ?∈? ? ? ? ? ? ? (ℎ?(?-?-? ? ? ?) where, ??? = |? − ?| + 1, i and j denote the serial numbers of the sub-summaries in the systemgenerated summary ??? and the human-written summary ?ℎ? , respectively. ? serves as a coefficient to discount long-distance matched sub-summaries. We evaluate unigram, bigram, and skipped bigram matches. Like in ROUGE (Lin, 2004), the skip distance is up to four words. Sequence Novelty Sequence novelty evaluates the average novelty of two successive sub-summaries. Information content (IC) has been used to measure the novelty of update summaries by (Aggarwal et al., 2009). In this paper, the novelty of a system569 generated sequential summary is defined as the average of IC increments of two adjacent subsummaries, ??????? =|?|1 − 1?∑>1(????− ????, ??−1) × where |?| is the number of sub-summaries in the sequential summary. ???? = ∑?∈?? ??? . ????, ??−1 = ∑?∈??∩??−1 ??? is the overlapped information in the two adjacent sub-summaries. ??? = ???? ?????????(?, ???) where w is a word, ???? is the inverse tweet frequency of w, and ??? is all the tweets in the trending topic. The relevance function is introduced to ensure that the information brought by new sub-summaries is not only novel but also related to the topic. Sequence Correlation Sequence correlation evaluates the sequential matching degree between system-generated and human-written summaries. In statistics, Kendall’s tau coefficient is often used to measure the association between two sequences (Lapata, 2006). The basic idea is to count the concordant and discordant pairs which contain the same elements in two sequences. Borrowing this idea, for each sub-summary in a human-generated summary, we find its most matched subsummary (judged by the cosine similarity measure) in the corresponding system-generated summary and then define the correlation according to the concordance between the two matched sub-summary sequences. ??????????? 2(|#???????????????| |#???????????????|) − = ?(? − 1) where n is the number of human-written subsummaries. Tables 2 and 3 below present the evaluation results. For the stream-based approach, we set ∆t=3 hours experimentally. 
For the semanticbased approach, we compare three different approaches to defining the sub-topic number K: (1) Semantic-based 1: Following the approach proposed in (Li et al., 2007), we first derive the matrix of tweet cosine similarity. Given the 1norm of eigenvalues ?????? (? = 1, 2, ,?) of the similarity matrix and the ratios ?? = ??????/?2 , the subtopic number ? = ? + 1 if ?? − ??+1 > ? (? 0.4 ). (2) Semantic-based 2: Using the rule of thumb in (Wan and Yang, 2008), ? = √? , where n is the tweet number. (3) Combined: K is defined as the number of the peak areas detected from the Opad algorithm, meanwhile we use the … = tweets within peak areas as the tweets of DTM. This is our new idea. The experiments confirm the superiority of the semantic-based approach over the stream-based approach in summary content coverage and novelty evaluations, showing that the former is better at subtopic content modeling. The subsummaries generated by the stream-based approach have comparative sequence (i.e., order) correlation with the human summaries. Combining the advantages the two approaches leads to the best overall results. SCebomaSCs beonmtdivr1eac( ∆nrdδ-bm(ta=i∆g0-cs3e.t)5=d32U0 n.3ig510r32a7m B0 .i1g 6r3589a46m87 SB0 k.i1 gp8725r69ame173d Table 2. N-Gram Coverage Evaluation Sem CtraeonTmaA tmicapb-nplibentria ec3os-de.abcd N(a∆hs(o1evt∆=(sdetδ=3l2)t 0y).a4n)dCoN0r .o 73e vl071ea96lti783 oy nEvCalo0ur a. 3 tei3792ol3a489nt650io n 4 Concluding Remarks We start a new application for Twitter trending topics, i.e., sequential summarization, to reveal the developing scenario of the trending topics while retaining the order of information presentation. We develop several solutions to automatically detect, segment and order subtopics temporally, and extract the most significant tweets into the sub-summaries to compose sequential summaries. Empirically, the combination of the stream-based approach and the semantic-based approach leads to sequential summaries with high coverage, low redundancy, and good order. Acknowledgments The work described in this paper is supported by a Hong Kong RGC project (PolyU No. 5202/12E) and a National Nature Science Foundation of China (NSFC No. 61272291). References Aggarwal Gaurav, Sumbaly Roshan and Sinha Shakti. 2009. Update Summarization. Stanford: CS224N Final Projects. 570 Blei M. David and Jordan I. Michael. 2006. Dynamic topic models. In Proceedings of the 23rd international conference on Machine learning, 113120. Pittsburgh, Pennsylvania. Carbonell Jaime and Goldstein Jade. 1998. The use of MMR, diversity based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International Conference on Research and Development in Information Retrieval, 335-336. Melbourne, Australia. Duan Yajuan, Chen Zhimin, Wei Furu, Zhou Ming and Heung-Yeung Shum. 2012. Twitter Topic Summarization by Ranking Tweets using Social Influence and Content Quality. In Proceedings of the 24th International Conference on Computational Linguistics, 763-780. Mumbai, India. Harabagiu Sanda and Hickl Andrew. 2011. Relevance Modeling for Microblog Summarization. In Proceedings of 5th International AAAI Conference on Weblogs and Social Media. Barcelona, Spain. Lapata Mirella. 2006. Automatic evaluation of information ordering: Kendall’s tau. Computational Linguistics, 32(4): 1-14. Li Wenyuan, Ng Wee-Keong, Liu Ying and Ong Kok-Leong. 2007. Enhancing the Effectiveness of Clustering with Spectra Analysis. 
The experiments confirm the superiority of the semantic-based approach over the stream-based approach in the summary content coverage and novelty evaluations, showing that the former is better at subtopic content modeling. The sub-summaries generated by the stream-based approach have comparable sequence (i.e., order) correlation with the human summaries. Combining the advantages of the two approaches leads to the best overall results.

[Table 2. N-Gram Coverage Evaluation: unigram, bigram and skipped-bigram coverage for the stream-based (∆t = 3), semantic-based and combined approaches.]

[Table 3. Novelty and Correlation Evaluation for the same approaches.]

4 Concluding Remarks

We start a new application for Twitter trending topics, i.e., sequential summarization, to reveal the developing scenario of a trending topic while retaining the order of information presentation. We develop several solutions to automatically detect, segment and order subtopics temporally, and to extract the most significant tweets into the sub-summaries that compose the sequential summaries. Empirically, the combination of the stream-based approach and the semantic-based approach leads to sequential summaries with high coverage, low redundancy, and good order.

Acknowledgments

The work described in this paper is supported by a Hong Kong RGC project (PolyU No. 5202/12E) and a National Natural Science Foundation of China grant (NSFC No. 61272291).

References

Aggarwal Gaurav, Sumbaly Roshan and Sinha Shakti. 2009. Update Summarization. Stanford: CS224N Final Projects.

Blei M. David and Lafferty D. John. 2006. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, 113-120. Pittsburgh, Pennsylvania.

Carbonell Jaime and Goldstein Jade. 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proceedings of the 21st Annual International Conference on Research and Development in Information Retrieval, 335-336. Melbourne, Australia.

Duan Yajuan, Chen Zhimin, Wei Furu, Zhou Ming and Heung-Yeung Shum. 2012. Twitter Topic Summarization by Ranking Tweets using Social Influence and Content Quality. In Proceedings of the 24th International Conference on Computational Linguistics, 763-780. Mumbai, India.

Harabagiu Sanda and Hickl Andrew. 2011. Relevance Modeling for Microblog Summarization. In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media. Barcelona, Spain.

Lapata Mirella. 2006. Automatic evaluation of information ordering: Kendall's tau. Computational Linguistics, 32(4):1-14.

Li Wenyuan, Ng Wee-Keong, Liu Ying and Ong Kok-Leong. 2007. Enhancing the Effectiveness of Clustering with Spectra Analysis. IEEE Transactions on Knowledge and Data Engineering, 19(7):887-902.

Li Xiaoyan and Croft W. Bruce. 2006. Improving novelty detection for general topics using sentence level information patterns. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management, 238-247. New York, USA.

Lin Chin-Yew. 2004. ROUGE: a Package for Automatic Evaluation of Summaries. In Proceedings of the ACL Workshop on Text Summarization Branches Out, 74-81. Barcelona, Spain.

Liu Fei, Liu Yang and Weng Fuliang. 2011. Why is "SXSW" trending? Exploring Multiple Text Sources for Twitter Topic Summarization. In Proceedings of the ACL Workshop on Language in Social Media, 66-75. Portland, Oregon.

O'Connor Brendan, Krieger Michel and Ahn David. 2010. TweetMotif: Exploratory Search and Topic Summarization for Twitter. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, 384-385. Atlanta, Georgia.

Shamma A. David, Kennedy Lyndon and Churchill F. Elizabeth. 2010. Tweetgeist: Can the Twitter Timeline Reveal the Structure of Broadcast Events? In Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, 589-593. Savannah, Georgia, USA.

Sharifi Beaux, Hutton Mark-Anthony and Kalita Jugal. 2010. Summarizing Microblogs Automatically. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 685-688. Los Angeles, California.

Steinberger Josef and Jezek Karel. 2009. Update summarization based on novel topic distribution. In Proceedings of the 9th ACM Symposium on Document Engineering, 205-213. Munich, Germany.

Varma Vasudeva, Bharat Vijay, Kovelamudi Sudheer, Praveen Bysani, Kumar K. N, Kranthi Reddy, Karuna Kumar and Nitin Maganti. 2009. IIIT Hyderabad at TAC 2009. In Proceedings of the 2009 Text Analysis Conference. Gaithersburg, Maryland.

Wan Xiaojun and Yang Jianjun. 2008. Multi-document summarization using cluster-based link analysis. In Proceedings of the 31st Annual International Conference on Research and Development in Information Retrieval, 299-306. Singapore, Singapore.
5 0.69963712 5 acl-2013-A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art
Author: Peter A. Rankel ; John M. Conroy ; Hoa Trang Dang ; Ani Nenkova
Abstract: How good are automatic content metrics for news summary evaluation? Here we provide a detailed answer to this question, with a particular focus on assessing the ability of automatic evaluations to identify statistically significant differences present in manual evaluation of content. Using four years of data from the Text Analysis Conference, we analyze the performance of eight ROUGE variants in terms of accuracy, precision and recall in finding significantly different systems. Our experiments show that some of the neglected variants of ROUGE, based on higher order n-grams and syntactic dependencies, are most accurate across the years; the commonly used ROUGE-1 scores find too many significant differences between systems which manual evaluation would deem comparable. We also test combinations of ROUGE variants and find that they considerably improve the accuracy of automatic prediction.
6 0.68179673 377 acl-2013-Using Supervised Bigram-based ILP for Extractive Summarization
7 0.67929149 18 acl-2013-A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
8 0.67156392 126 acl-2013-Diverse Keyword Extraction from Conversations
9 0.63210106 59 acl-2013-Automated Pyramid Scoring of Summaries using Distributional Semantics
10 0.62467861 23 acl-2013-A System for Summarizing Scientific Topics Starting from Keywords
11 0.60706353 55 acl-2013-Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
12 0.5926457 257 acl-2013-Natural Language Models for Predicting Programming Comments
13 0.59147984 157 acl-2013-Fast and Robust Compressive Summarization with Dual Decomposition and Multi-Task Learning
14 0.56618285 332 acl-2013-Subtree Extractive Summarization via Submodular Maximization
15 0.55712277 54 acl-2013-Are School-of-thought Words Characterizable?
16 0.55657339 129 acl-2013-Domain-Independent Abstract Generation for Focused Meeting Summarization
17 0.55217254 351 acl-2013-Topic Modeling Based Classification of Clinical Reports
18 0.54007107 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
19 0.53997988 341 acl-2013-Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
20 0.53420478 178 acl-2013-HEADY: News headline abstraction through event pattern clustering
topicId topicWeight
[(0, 0.038), (6, 0.051), (11, 0.047), (24, 0.05), (26, 0.04), (28, 0.013), (35, 0.094), (42, 0.052), (48, 0.032), (70, 0.054), (81, 0.336), (88, 0.024), (90, 0.033), (95, 0.047)]
simIndex simValue paperId paperTitle
same-paper 1 0.74100834 142 acl-2013-Evolutionary Hierarchical Dirichlet Process for Timeline Summarization
Author: Jiwei Li ; Sujian Li
Abstract: Timeline summarization aims at generating concise summaries and giving readers a faster and better access to understand the evolution of news. It is a new challenge which combines salience ranking problem with novelty detection. Previous researches in this field seldom explore the evolutionary pattern of topics such as birth, splitting, merging, developing and death. In this paper, we develop a novel model called Evolutionary Hierarchical Dirichlet Process(EHDP) to capture the topic evolution pattern in time- line summarization. In EHDP, time varying information is formulated as a series of HDPs by considering time-dependent information. Experiments on 6 different datasets which contain 3 156 documents demonstrates the good performance of our system with regard to ROUGE scores.
2 0.4358086 172 acl-2013-Graph-based Local Coherence Modeling
Author: Camille Guinaudeau ; Michael Strube
Abstract: We propose a computationally efficient graph-based approach for local coherence modeling. We evaluate our system on three tasks: sentence ordering, summary coherence rating and readability assessment. The performance is comparable to entity grid based approaches though these rely on a computationally expensive training phase and face data sparsity problems.
3 0.43049309 46 acl-2013-An Infinite Hierarchical Bayesian Model of Phrasal Translation
Author: Trevor Cohn ; Gholamreza Haffari
Abstract: Modern phrase-based machine translation systems make extensive use of word-based translation models for inducing alignments from parallel corpora. This is problematic, as the systems are incapable of accurately modelling many translation phenomena that do not decompose into word-for-word translation. This paper presents a novel method for inducing phrase-based translation units directly from parallel data, which we frame as learning an inverse transduction grammar (ITG) using a recursive Bayesian prior. Overall this leads to a model which learns translations of entire sentences, while also learning their decomposition into smaller units (phrase-pairs) recursively, terminating at word translations. Our experiments on Arabic, Urdu and Farsi to English demonstrate improvements over competitive baseline systems.
4 0.43037739 158 acl-2013-Feature-Based Selection of Dependency Paths in Ad Hoc Information Retrieval
Author: K. Tamsin Maxwell ; Jon Oberlander ; W. Bruce Croft
Abstract: Techniques that compare short text segments using dependency paths (or simply, paths) appear in a wide range of automated language processing applications including question answering (QA). However, few models in ad hoc information retrieval (IR) use paths for document ranking due to the prohibitive cost of parsing a retrieval collection. In this paper, we introduce a flexible notion of paths that describe chains of words on a dependency path. These chains, or catenae, are readily applied in standard IR models. Informative catenae are selected using supervised machine learning with linguistically informed features and compared to both non-linguistic terms and catenae selected heuristically with filters derived from work on paths. Automatically selected catenae of 1-2 words deliver significant performance gains on three TREC collections.
5 0.42988291 272 acl-2013-Paraphrase-Driven Learning for Open Question Answering
Author: Anthony Fader ; Luke Zettlemoyer ; Oren Etzioni
Abstract: We study question answering as a machine learning problem, and induce a function that maps open-domain questions to queries over a database of web extractions. Given a large, community-authored, question-paraphrase corpus, we demonstrate that it is possible to learn a semantic lexicon and linear ranking function without manually annotating questions. Our approach automatically generalizes a seed lexicon and includes a scalable, parallelized perceptron parameter estimation scheme. Experiments show that our approach more than quadruples the recall of the seed lexicon, with only an 8% loss in precision.
6 0.42813399 159 acl-2013-Filling Knowledge Base Gaps for Distant Supervision of Relation Extraction
7 0.42775795 169 acl-2013-Generating Synthetic Comparable Questions for News Articles
8 0.42773798 83 acl-2013-Collective Annotation of Linguistic Resources: Basic Principles and a Formal Model
9 0.42767614 224 acl-2013-Learning to Extract International Relations from Political Context
10 0.42705679 212 acl-2013-Language-Independent Discriminative Parsing of Temporal Expressions
11 0.42702231 185 acl-2013-Identifying Bad Semantic Neighbors for Improving Distributional Thesauri
12 0.42662233 275 acl-2013-Parsing with Compositional Vector Grammars
14 0.42621759 283 acl-2013-Probabilistic Domain Modelling With Contextualized Distributional Semantic Vectors
15 0.4259859 215 acl-2013-Large-scale Semantic Parsing via Schema Matching and Lexicon Extension
16 0.42569721 194 acl-2013-Improving Text Simplification Language Modeling Using Unsimplified Text Data
17 0.42551878 85 acl-2013-Combining Intra- and Multi-sentential Rhetorical Parsing for Document-level Discourse Analysis
18 0.42507252 250 acl-2013-Models of Translation Competitions
19 0.42494392 176 acl-2013-Grounded Unsupervised Semantic Parsing
20 0.42464641 347 acl-2013-The Role of Syntax in Vector Space Models of Compositional Semantics