61 acl-2012-Cross-Domain Co-Extraction of Sentiment and Topic Lexicons

Source: pdf

Author: Fangtao Li ; Sinno Jialin Pan ; Ou Jin ; Qiang Yang ; Xiaoyan Zhu

Abstract: Extracting sentiment and topic lexicons is important for opinion mining. Previous work has shown that supervised learning methods are superior for this task. However, the performance of supervised methods relies heavily on manually labeled training data. In this paper, we propose a domain adaptation framework for sentiment- and topic-lexicon co-extraction in a domain of interest where we do not require any labeled data, but have plenty of labeled data in another related domain. The framework has two steps. In the first step, we generate a few high-confidence sentiment and topic seeds in the target domain. In the second step, we propose a novel Relational Adaptive bootstraPping (RAP) algorithm to expand the seeds in the target domain by exploiting the labeled source domain data and the relationships between topic and sentiment words. Experimental results show that our domain adaptation framework can extract precise lexicons in the target domain without any annotation.
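The seed-expansion idea in the second step can be illustrated with a toy sketch. This is not the RAP algorithm from the paper: it ignores the labeled source-domain data entirely and uses a plain co-occurrence count as the score. It only shows how a few seeds might grow by following relations between topic and sentiment words. The function names (`expand_lexicons`, `_top`), the parameters (`rounds`, `top_k`), and the example word pairs are all hypothetical.

```python
from collections import defaultdict


def _top(scores, k):
    # Highest score first; ties broken alphabetically so results are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


def expand_lexicons(pairs, topic_seeds, sentiment_seeds, rounds=3, top_k=1):
    """Alternately grow topic and sentiment lexicons from seed sets.

    pairs: list of (topic_candidate, sentiment_candidate) tuples, assumed to
           come from some relation extractor (hypothetical input).
    """
    topics, sentiments = set(topic_seeds), set(sentiment_seeds)
    for _ in range(rounds):
        topic_scores = defaultdict(int)
        sentiment_scores = defaultdict(int)
        for t, s in pairs:
            # A new topic candidate gains a point for each known sentiment word it links to.
            if s in sentiments and t not in topics:
                topic_scores[t] += 1
            # A new sentiment candidate gains a point for each known topic word it links to.
            if t in topics and s not in sentiments:
                sentiment_scores[s] += 1
        new_topics = _top(topic_scores, top_k)
        new_sentiments = _top(sentiment_scores, top_k)
        if not new_topics and not new_sentiments:
            break  # nothing left to add
        topics.update(new_topics)
        sentiments.update(new_sentiments)
    return topics, sentiments


# Toy target-domain relations (made up for illustration).
pairs = [
    ("screen", "great"), ("screen", "sharp"),
    ("battery", "great"), ("battery", "poor"),
    ("lens", "sharp"),
]
topics, sentiments = expand_lexicons(pairs, {"screen"}, {"great"})
print(sorted(topics), sorted(sentiments))
```

Starting from one topic seed and one sentiment seed, each round adds the best-connected unseen word of each type, so words reachable only through newly added words are picked up in later rounds.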

reference text

Rie K. Ando and Tong Zhang. 2005. A framework for learning predictive structures from multiple tasks and unlabeled data. J. Mach. Learn. Res., 6:1817–1853.
John Blitzer, Mark Dredze, and Fernando Pereira. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 432–439, Prague, Czech Republic. ACL.
Avrim Blum and Tom Mitchell. 1998. Combining labeled and unlabeled data with co-training. In Proceedings of the 11th Annual Conference on Computational Learning Theory, pages 92–100.
Danushka Bollegala, David Weir, and John Carroll. 2011. Using multiple sources to construct a sentiment sensitive thesaurus for cross-domain sentiment classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 132–141, Portland, Oregon. ACL.
Wenyuan Dai, Qiang Yang, Guirong Xue, and Yong Yu. 2007. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine Learning, pages 193–200, Corvallis, Oregon, USA, June. ACM.
Hal Daumé III. 2007. Frustratingly easy domain adaptation. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 256–263, Prague, Czech Republic. ACL.
Zhendong Dong and Qiang Dong, editors. 2006. HOWNET and the computation of meaning. World Scientific Publishers, Norwell, MA, USA.
Weifu Du, Songbo Tan, Xueqi Cheng, and Xiaochun Yun. 2010. Adapting information bottleneck method for automatic construction of domain-oriented sentiment lexicon. In Proceedings of the 3rd ACM international conference on Web search and data mining, pages 111–120, New York, NY, USA. ACM.
Andrea Esuli and Fabrizio Sebastiani. 2006. SENTIWORDNET: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation, pages 417–422.
Xavier Glorot, Antoine Bordes, and Yoshua Bengio. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning, pages 513–520, Bellevue, Washington, USA.
Yulan He, Chenghua Lin, and Harith Alani. 2011. Automatically extracting polarity-bearing topics for cross-domain sentiment classification. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 123–131, Portland, Oregon. ACL.
Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 168–177, Seattle, WA, USA. ACM.
Niklas Jakob and Iryna Gurevych. 2010. Extracting opinion targets in a single- and cross-domain setting with conditional random fields. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1035–1045, Cambridge, Massachusetts, USA. ACL.
Jing Jiang and ChengXiang Zhai. 2007. Instance weighting for domain adaptation in NLP. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, pages 264–271, Prague, Czech Republic. ACL.
Wei Jin and Hung Hay Ho. 2009. A novel lexicalized HMM-based learning framework for web opinion mining. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 465–472, Montreal, Quebec, Canada. ACM.
Rosie Jones, Andrew McCallum, Kamal Nigam, and Ellen Riloff. 1999. Bootstrapping for text learning tasks. In IJCAI-99 Workshop on Text Mining: Foundations, Techniques and Applications, pages 52–63.
Jon M. Kleinberg. 1999. Authoritative sources in a hyperlinked environment. J. ACM, 46:604–632, Sept.
Shoushan Li and Chengqing Zong. 2008. Multi-domain sentiment classification. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Short Papers, pages 257–260, Columbus, Ohio, USA. ACL.
Tao Li, Vikas Sindhwani, Chris Ding, and Yi Zhang. 2009. Knowledge transformation for cross-domain sentiment classification. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 716–717, Boston, MA, USA. ACM.
Fangtao Li, Chao Han, Minlie Huang, Xiaoyan Zhu, Ying-Ju Xia, Shu Zhang, and Hao Yu. 2010a. Structure-aware review mining and summarization. In Proceedings of the 23rd International Conference on Computational Linguistics, pages 653–661, Beijing, China.
Fangtao Li, Minlie Huang, and Xiaoyan Zhu. 2010b. Sentiment analysis with global topics and local dependency. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, Atlanta, Georgia, USA. AAAI Press.
Bing Liu. 2010. Sentiment analysis and subjectivity. Handbook of Natural Language Processing, Second Edition.
Qiaozhu Mei, Xu Ling, Matthew Wondra, Hang Su, and ChengXiang Zhai. 2007. Topic sentiment mixture: modeling facets and opinions in weblogs. In Proceedings of the 16th international conference on World Wide Web, pages 171–180, Banff, Alberta, Canada. ACM.
Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng., 22(10):1345–1359, Oct.
Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, and Chen Zheng. 2010. Cross-domain sentiment classification via spectral feature alignment. In Proceedings of the 19th International Conference on World Wide Web, pages 751–760, Raleigh, NC, USA, Apr. ACM.
Bo Pang and Lillian Lee. 2004. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, Barcelona, Spain. ACL.
Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1–135.
Ana-Maria Popescu and Oren Etzioni. 2005. Extracting product features and opinions from reviews. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 339–346, Vancouver, British Columbia, Canada. ACL.
Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. 2009. Expanding domain sentiment lexicon through double propagation. In Proceedings of the 21st international joint conference on Artificial intelligence, pages 1199–1204, Pasadena, California, USA. Morgan Kaufmann Publishers Inc.
Ellen Riloff and Rosie Jones. 1999. Learning dictionaries for information extraction by multi-level bootstrapping. In Proceedings of the 6th national conference on Artificial intelligence, pages 474–479, Orlando, Florida, United States. AAAI.
Ellen Riloff, Janyce Wiebe, and Theresa Wilson. 2003. Learning subjective nouns using extraction pattern bootstrapping. In Proceedings of the 7th conference on natural language learning, pages 25–32, Edmonton, Canada. ACL.
Ellen Riloff. 1996. Automatically generating extraction patterns from untagged text. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 1044–1049, Portland, Oregon, USA. AAAI Press/MIT Press.
Songbo Tan, Gaowei Wu, Huifeng Tang, and Xueqi Cheng. 2007. A novel scheme for domain-transfer problem in the context of sentiment analysis. In Proceedings of the 16th ACM conference on Conference on information and knowledge management, pages 979–982, Lisbon, Portugal. ACM.
Ivan Titov and Ryan McDonald. 2008. A joint model of text and aspect ratings for sentiment summarization. In Proceedings of the 46th Annual Meeting of the Association of Computational Linguistics: Human Language Technologies, pages 308–316, Columbus, Ohio, USA. ACL.
Janyce Wiebe, Theresa Wilson, Rebecca Bruce, Matthew Bell, and Melanie Martin. 2004. Learning subjective language. Comput. Linguist., 30:277–308, Sept.
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 347–354, Vancouver, British Columbia, Canada. ACL.
Dan Wu, Wee Sun Lee, Nan Ye, and Hai Leong Chieu. 2009. Domain adaptive bootstrapping for named entity recognition. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 1523–1532, Singapore. ACL.
Min Zhang and Xingyao Ye. 2008. A generation model to unify topic relevance and lexicon-based sentiment for opinion retrieval. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, pages 411–418, Singapore. ACM.
Wayne Xin Zhao, Jing Jiang, Hongfei Yan, and Xiaoming Li. 2010. Jointly modeling aspects and opinions with a MaxEnt-LDA hybrid. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 56–65, Cambridge, Massachusetts, USA. ACL.