
135 emnlp-2011-Timeline Generation through Evolutionary Trans-Temporal Summarization


Source: pdf

Author: Rui Yan ; Liang Kong ; Congrui Huang ; Xiaojun Wan ; Xiaoming Li ; Yan Zhang

Abstract: We investigate an important and challenging problem in summary generation, i.e., Evolutionary Trans-Temporal Summarization (ETTS), which generates news timelines from massive data on the Internet. ETTS greatly facilitates fast news browsing and knowledge comprehension, and hence is a necessity. Given the collection of time-stamped web documents related to the evolving news, ETTS aims to return news evolution along the timeline, consisting of individual but correlated summaries on each date. Existing summarization algorithms fail to utilize trans-temporal characteristics among these component summaries. We propose to model trans-temporal correlations among component summaries for timelines, using inter-date and intra-date sentence dependencies, and present a novel combination. We develop experimental systems to compare 5 rival algorithms on 6 instinctively different datasets which amount to 10251 documents. Evaluation results in ROUGE metrics indicate the effectiveness of the proposed approach based on trans-temporal information.


reference text

James Allan, Rahul Gupta, and Vikas Khandelwal. 2001. Temporal summaries of new topics. In Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’01, pages 10–18.
Hai Leong Chieu and Yoong Keok Lee. 2004. Query based event extraction along a timeline. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’04, pages 425–432.
G. Erkan and D.R. Radev. 2004. Lexpagerank: Prestige in multi-document text summarization. In Proceedings of EMNLP, volume 4.
Jade Goldstein, Mark Kantrowitz, Vibhu Mittal, and Jaime Carbonell. 1999. Summarizing text documents: sentence selection and evaluation metrics. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 121–128.
Xin Jin, Scott Spangler, Rui Ma, and Jiawei Han. 2010. Topic initiator detection on the world wide web. In Proceedings of the 19th international conference on WWW’10, pages 481–490.
Giridhar Kumaran and James Allan. 2004. Text classification and named entities for new event detection. In Proceedings of the 27th annual international ACM SIGIR’04, pages 297–304.
Chin-Yew Lin and Eduard Hovy. 2002. From single to multi-document summarization: a prototype system and its evaluation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL ’02, pages 457–464.
Chin-Yew Lin and Eduard Hovy. 2003. Automatic evaluation of summaries using n-gram co-occurrence statistics. In Proceedings of the Human Language Technology Conference of the NAACL’03, pages 71–78.
Yuanhua Lv and ChengXiang Zhai. 2009. Positional language models for information retrieval. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’09, pages 299–306.
Qiaozhu Mei, Jian Guo, and Dragomir Radev. 2010. Divrank: the interplay of prestige and diversity in information networks. In Proceedings of the 16th ACM SIGKDD’10, pages 1009–1018.
R. Mihalcea and P. Tarau. 2005. A language independent algorithm for single and multiple document summarization. In Proceedings of IJCNLP, volume 5.
D.R. Radev, H. Jing, and M. Sty. 2004. Centroid-based summarization of multiple documents. Information Processing and Management, 40(6):919–938.
Russell Swan and James Allan. 2000. Automatic generation of overview timelines. In Proceedings of the 23rd annual international ACM SIGIR ’00, pages 49–56.
Xiaojun Wan and Jianwu Yang. 2008. Multi-document summarization using cluster-based link analysis. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’08, pages 299–306.
X. Wan, J. Yang, and J. Xiao. 2007a. Manifold-ranking based topic-focused multi-document summarization. In Proceedings of IJCAI, volume 7, pages 2903–2908.
X. Wan, J. Yang, and J. Xiao. 2007b. Single document summarization with document expansion. In Proceedings of the 22nd AAAI’07, pages 931–936.
Dingding Wang and Tao Li. 2010. Document update summarization using incremental hierarchical clustering. In Proceedings of the 19th ACM international conference on Information and knowledge management, CIKM ’10, pages 279–288.
Rui Yan, Yu Li, Yan Zhang, and Xiaoming Li. 2010. Event recognition from news webpages through latent ingredients extraction. In Information Retrieval Technology - 6th Asia Information Retrieval Societies Conference, AIRS 2010, pages 490–501.
Rui Yan, Liang Kong, Yu Li, Yan Zhang, and Xiaoming Li. 2011a. A fine-grained digestion of news webpages through event snippet extraction. In Proceedings of the 20th international conference companion on world wide web, WWW ’11, pages 157–158.
Rui Yan, Xiaojun Wan, Jahna Otterbacher, Liang Kong, Xiaoming Li, and Yan Zhang. 2011b. Evolutionary timeline summarization: a balanced optimization framework via iterative substitution. In Proceedings of the 34th annual international ACM SIGIR conference on Research and development in information retrieval, SIGIR ’11.