
305 acl-2011-Topical Keyphrase Extraction from Twitter


Source: pdf

Author: Xin Zhao ; Jing Jiang ; Jing He ; Yang Song ; Palakorn Achananuparp ; Ee-Peng Lim ; Xiaoming Li

Abstract: Summarizing and analyzing Twitter content is an important and challenging task. In this paper, we propose to extract topical keyphrases as one way to summarize Twitter. We propose a context-sensitive topical PageRank method for keyword ranking and a probabilistic scoring function that considers both relevance and interestingness of keyphrases for keyphrase ranking. We evaluate our proposed methods on a large Twitter data set. Experiments show that these methods are very effective for topical keyphrase extraction.


reference text

Ken Barker and Nadia Cornacchia. 2000. Using noun phrase heads to extract document keyphrases. In Proceedings of the 13th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence, pages 40–52.
Thomas L. Griffiths and Mark Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl. 1):5228–5235.
Liangjie Hong and Brian D. Davison. 2010. Empirical study of topic modeling in Twitter. In Proceedings of the First Workshop on Social Media Analytics.
Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4):422–446.
John Lafferty and ChengXiang Zhai. 2003. Probabilistic relevance models based on document and query generation. Language Modeling and Information Retrieval, 13.
Quanzhi Li, Yi-Fang Wu, Razvan Bot, and Xin Chen. 2004. Incorporating document keyphrases in search results. In Proceedings of the 10th Americas Conference on Information Systems.
Marina Litvak and Mark Last. 2008. Graph-based keyword extraction for single-document summarization. In Proceedings of the Workshop on Multi-source Multilingual Information Extraction and Summarization, pages 17–24.
Zhiyuan Liu, Wenyi Huang, Yabin Zheng, and Maosong Sun. 2010. Automatic keyphrase extraction via topic decomposition. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 366–376.
Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai. 2007. Automatic labeling of multinomial topic models. In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 490–499.
R. Mihalcea and P. Tarau. 2004. TextRank: Bringing order into texts. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing.
Daniel Ramage, Susan Dumais, and Dan Liebling. 2010. Characterizing microblogs with topic models. In Proceedings of the 4th International Conference on Weblogs and Social Media.
Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. 2010. Earthquake shakes Twitter users: real-time event detection by social sensors. In Proceedings of the 19th International World Wide Web Conference.
Takashi Tomokiyo and Matthew Hurst. 2003. A language model approach to keyphrase extraction. In Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, pages 33–40.
Andranik Tumasjan, Timm O. Sprenger, Philipp G. Sandner, and Isabell M. Welpe. 2010. Predicting elections with Twitter: What 140 characters reveal about political sentiment. In Proceedings of the 4th International Conference on Weblogs and Social Media.
Peter Turney. 2000. Learning algorithms for keyphrase extraction. Information Retrieval, (4):303–336.
Jianshu Weng, Ee-Peng Lim, Jing Jiang, and Qi He. 2010. TwitterRank: finding topic-sensitive influential twitterers. In Proceedings of the Third ACM International Conference on Web Search and Data Mining.
Wei Wu, Bin Zhang, and Mari Ostendorf. 2010. Automatic generation of personalized annotation tags for Twitter users. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pages 689–692.
Xin Zhao, Jing Jiang, Jianshu Weng, Jing He, Lim Ee-Peng, Hongfei Yan, and Xiaoming Li. 2011. Comparing Twitter and traditional media using topic models. In Proceedings of the 33rd European Conference on Information Retrieval.