
123 acl-2010-Generating Focused Topic-Specific Sentiment Lexicons

Source: pdf

Authors: Valentin Jijkoun; Maarten de Rijke; Wouter Weerkamp

Abstract: We present a method for automatically generating focused and accurate topic-specific subjectivity lexicons from a general purpose polarity lexicon that allow users to pinpoint subjective on-topic information in a set of relevant documents. We motivate the need for such lexicons in the field of media analysis, describe a bootstrapping method for generating a topic-specific lexicon from a general purpose polarity lexicon, and evaluate the quality of the generated lexicons both manually and using a TREC Blog track test set for opinionated blog post retrieval. Although the generated lexicons can be an order of magnitude more selective than the general purpose lexicon, they maintain, or even improve, the performance of an opinion retrieval system.

reference text

Altheide, D. (1996). Qualitative Media Analysis. Sage.
Choi, Y., Kim, Y., and Myaeng, S.-H. (2009). Domain-specific sentiment analysis using contextual feature generation. In TSA ’09: Proceeding of the 1st international CIKM workshop on Topic-sentiment analysis for mass opinion, pages 37–44, New York, NY, USA. ACM.
Fahrni, A. and Klenner, M. (2008). Old Wine or Warm Beer: Target-Specific Sentiment Analysis of Adjectives. In Proc. of the Symposium on Affective Language in Human and Machine, AISB 2008 Convention, 1st-2nd April 2008. University of Aberdeen, Aberdeen, Scotland, pages 60–63.
Godbole, N., Srinivasaiah, M., and Skiena, S. (2007). Large-scale sentiment analysis for news and blogs. In Proceedings of the International Conference on Weblogs and Social Media (ICWSM).
Kanayama, H. and Nasukawa, T. (2006). Fully automatic lexicon expansion for domain-oriented sentiment analysis. In EMNLP ’06: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 355–363, Morristown, NJ, USA. Association for Computational Linguistics.
Kim, S. and Hovy, E. (2004). Determining the sentiment of opinions. In Proceedings of COLING 2004.
Lavrenko, V. and Croft, B. (2001). Relevance-based language models. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval.
Lee, Y., Na, S.-H., Kim, J., Nam, S.-H., Jung, H.-Y., and Lee, J.-H. (2008). KLE at TREC 2008 Blog Track: Blog Post and Feed Retrieval. In Proceedings of TREC 2008.
Liu, B., Hu, M., and Cheng, J. (2005). Opinion observer: analyzing and comparing opinions on the web. In Proceedings of the 14th international conference on World Wide Web.
Macdonald, C. and Ounis, I. (2006). The TREC Blogs06 collection: Creating and analysing a blog test collection. Technical Report TR-2006-224, Department of Computer Science, University of Glasgow.
Metzler, D. and Croft, W. B. (2005). A Markov random field model for term dependencies. In SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, pages 472–479, New York, NY, USA. ACM Press.
Na, S.-H., Lee, Y., Nam, S.-H., and Lee, J.-H. (2009). Improving opinion retrieval based on query-specific sentiment lexicon. In ECIR ’09: Proceedings of the 31th European Conference on IR Research on Advances in Information Retrieval, pages 734–738, Berlin, Heidelberg. Springer-Verlag.
Ounis, I., Macdonald, C., de Rijke, M., Mishne, G., and Soboroff, I. (2007). Overview of the TREC 2006 blog track. In The Fifteenth Text REtrieval Conference (TREC 2006). NIST.
Popescu, A.-M. and Etzioni, O. (2005). Extracting product features and opinions from reviews. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP).
Riloff, E. and Wiebe, J. (2003). Learning extraction patterns for subjective expressions. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Weerkamp, W., Balog, K., and de Rijke, M. (2009). A generative blog post retrieval model that uses query expansion based on external collections. In Joint conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Singapore.
Weerkamp, W. and de Rijke, M. (2008). Credibility improves topical blog post retrieval. In Proceedings of ACL-08: HLT, pages 923–931, Columbus, Ohio. Association for Computational Linguistics.
Wilson, T., Wiebe, J., and Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. In HLT ’05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 347–354, Morristown, NJ, USA. Association for Computational Linguistics.
Wilson, T., Wiebe, J., and Hoffmann, P. (2009). Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3):399–433.