acl acl2012 acl2012-98 acl2012-98-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Qiming Diao ; Jing Jiang ; Feida Zhu ; Ee-Peng Lim
Abstract: Microblogs such as Twitter reflect the general public’s reactions to major events. Bursty topics from microblogs reveal what events have attracted the most online attention. Although bursty event detection from text streams has been studied before, previous work may not be suitable for microblogs because compared with other text streams such as news articles and scientific publications, microblog posts are particularly diverse and noisy. To find topics that have bursty patterns on microblogs, we propose a topic model that simultaneously captures two observations: (1) posts published around the same time are more likely to have the same topic, and (2) posts published by the same user are more likely to have the same topic. The former helps find event-driven posts while the latter helps identify and filter out “personal” posts. Our experiments on a large Twitter dataset show that there are more meaningful and unique bursty topics in the top-ranked results returned by our model than an LDA baseline and two degenerate variations of our model. We also show some case studies that demonstrate the importance of considering both the temporal information and users’ personal interests for bursty topic detection from microblogs.
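The two observations in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the model proposed in the paper: the per-post switch, the probability `p_personal`, the function names, and all topic weights below are hypothetical and chosen purely for illustration. Each post draws its topic either from its author's own topic weights (a "personal" post) or from the weights of the time slice in which it was published (a time-driven post).

```python
import random


def sample_topic(weights, rng):
    """Draw a topic index from a list of non-negative weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate_posts(user_topics, time_topics, schedule, p_personal, rng):
    """Assign a topic to each (user, time_slice) post.

    Hypothetical switch: with probability p_personal the topic comes from
    the user's own weights; otherwise it comes from the time slice's weights.
    """
    posts = []
    for user, t in schedule:
        personal = rng.random() < p_personal
        weights = user_topics[user] if personal else time_topics[t]
        posts.append((user, t, sample_topic(weights, rng), personal))
    return posts


rng = random.Random(0)

# Illustrative weights only: each user favors one topic.
user_topics = {"alice": [0.8, 0.1, 0.1], "bob": [0.1, 0.8, 0.1]}

# Time slice 1 is fully concentrated on topic 2, imitating a burst.
time_topics = {0: [0.3, 0.3, 0.4], 1: [0.0, 0.0, 1.0]}

# 50 posts per user in each of the two time slices.
schedule = [(u, t) for t in (0, 1) for u in ("alice", "bob") for _ in range(50)]

posts = generate_posts(user_topics, time_topics, schedule, p_personal=0.5, rng=rng)

# Count topic 2 among time-driven posts in each slice.
for t in (0, 1):
    driven = [p for p in posts if p[1] == t and not p[3]]
    burst = sum(1 for p in driven if p[2] == 2)
    print(f"slice {t}: {burst} of {len(driven)} time-driven posts on topic 2")
```

In this toy setup, time-driven posts in the bursty slice concentrate on one topic regardless of author, while personal posts keep following each user's own preferences. That separation is the intuition behind using both signals to filter out personal posts.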
References:
Amr Ahmed and Eric P. Xing. 2008. Dynamic nonparametric mixture models and the recurrent Chinese restaurant process: with applications to evolutionary clustering. In Proceedings of the SIAM International Conference on Data Mining, pages 219–230.
Amr Ahmed and Eric P. Xing. 2010. Timeline: A dynamic hierarchical Dirichlet process model for recovering birth/death and evolution of topics in text stream. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, pages 20–29.
David M. Blei and John D. Lafferty. 2006. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning.
David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022.
Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Philip S. Yu, and Hongjun Lu. 2005. Parameter free bursty events detection in text streams. In Proceedings of the 31st International Conference on Very Large Data Bases, pages 181–192.
Amit Gruber, Michal Rosen-Zvi, and Yair Weiss. 2007. Hidden topic Markov model. In Proceedings of the International Conference on Artificial Intelligence and Statistics.
Liangjie Hong, Byron Dom, Siva Gurumurthy, and Kostas Tsioutsiouliklis. 2011. A time-dependent topic model for multiple text streams. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 832–840.
Alexander Ihler, Jon Hutchins, and Padhraic Smyth. 2006. Adaptive event detection with time-varying poisson processes. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 207–216.
Jon Kleinberg. 2002. Bursty and hierarchical structure in streams. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 91–101.
Tomonari Masada, Daiji Fukagawa, Atsuhiro Takasu, Tsuyoshi Hamada, Yuichiro Shibata, and Kiyoshi Oguri. 2009. Dynamic hyperparameter optimization for bayesian topical trend analysis. In Proceedings of the 18th ACM Conference on Information and knowledge management, pages 1831–1834.
Ramesh M. Nallapati, Susan Ditmore, John D. Lafferty, and Kin Ung. 2007. Multiscale topic tomography. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 520–529.
Michal Rosen-Zvi, Thomas Griffiths, Mark Steyvers, and Padhraic Smyth. 2004. The author-topic model for authors and documents. In Proceedings of the 20th conference on Uncertainty in artificial intelligence, pages 487–494.
Xuerui Wang and Andrew McCallum. 2006. Topics over time: a non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 424–433.
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, and Richard Sproat. 2007. Mining correlated bursty topic patterns from coordinated text streams. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 784–793.
Chong Wang, David M. Blei, and David Heckerman. 2008. Continuous time dynamic topic models. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, pages 579–586.
Jianshu Weng and Francis Lee. 2011. Event detection in Twitter. In Proceedings of the 5th International AAAI Conference on Weblogs and Social Media.
Wayne Xin Zhao, Jing Jiang, Jianshu Weng, Jing He, Ee-Peng Lim, Hongfei Yan, and Xiaoming Li. 2011. Comparing twitter and traditional media using topic models. In Proceedings of the 33rd European conference on Advances in information retrieval, pages 338–349.