
143 emnlp-2013-Open Domain Targeted Sentiment


Authors: Margaret Mitchell; Jacqui Aguilar; Theresa Wilson; Benjamin Van Durme

Abstract: We propose a novel approach to sentiment analysis for a low resource setting. The intuition behind this work is that sentiment expressed towards an entity, targeted sentiment, may be viewed as a span of sentiment expressed across the entity. This representation allows us to model sentiment detection as a sequence tagging problem, jointly discovering people and organizations along with whether there is sentiment directed towards them. We compare performance in both Spanish and English on microblog data, using only a sentiment lexicon as an external resource. By leveraging linguistically informed features within conditional random fields (CRFs) trained to minimize empirical risk, our best models in Spanish significantly outperform a strong baseline, and reach around 90% accuracy on the combined task of named entity recognition and sentiment prediction. Our models in English, trained on a much smaller dataset, are not yet statistically significant against their baselines.
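The span-based view of targeted sentiment described in the abstract can be illustrated with a small tagging sketch. The label set (`B-`/`I-` prefixes fused with `positive`, `negative`, `neutral`), the `(start, end, sentiment)` span format, the example sentence, and the helper names `spans_to_tags` and `tags_to_spans` are all assumptions made here for illustration; the abstract does not specify the paper's exact encoding, and no CRF is trained in this sketch.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, exclusive end index, sentiment label).
Span = Tuple[int, int, str]


def spans_to_tags(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode entity spans and their sentiment as one tag per token."""
    tags = ["O"] * n_tokens
    for start, end, sentiment in spans:
        if not 0 <= start < end <= n_tokens:
            raise ValueError(f"span ({start}, {end}) is outside the sentence")
        tags[start] = f"B-{sentiment}"       # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{sentiment}"       # remaining tokens of the entity
    return tags


def tags_to_spans(tags: List[str]) -> List[Span]:
    """Decode a tag sequence back into (start, end, sentiment) spans."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, sentiment = tag.partition("-")
        continues = prefix == "I" and start is not None and sentiment == label
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if prefix == "B" or (prefix == "I" and start is None):
            start, label = i, sentiment
    return spans


# Illustrative sentence (not taken from the paper's data).
tokens = ["Great", "win", "by", "Real", "Madrid", "tonight"]
gold_spans = [(3, 5, "positive")]
tags = spans_to_tags(len(tokens), gold_spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Fusing the entity boundary and the sentiment into a single tag is only one possible design. It lets a single sequence model predict both at once, at the cost of a larger label set than separate entity and sentiment taggers would need.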

reference text

A. Agarwal, B. Xie, I. Vovsha, O. Rambow, and R. Passonneau. 2011. Sentiment analysis of twitter data. In Proceedings of the Workshop on Language in Social Media.
Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. 2010. Sentiwordnet 3.0: An enhanced lexical resource for sentiment analysis and opinion mining. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10).
Luciano Barbosa and Junlan Feng. 2010. Robust sentiment detection on Twitter from biased and noisy data. In Proceedings of Coling: Posters.
Adam Bermingham and Alan F Smeaton. 2010. Classifying sentiment in microblogs: Is brevity an advantage? In Proceedings of CIKM-2010.
Albert Bifet and Eibe Frank. 2010. Sentiment knowledge discovery in Twitter streaming data. In Proceedings of the International Conference on Discovery Science (DS-2010).
Juliette Blevins. 1996. The syllable in phonological theory. In John A. Goldsmith, editor, The Handbook of Phonological Theory. Blackwell Publishing, Blackwell Reference Online.
N. N. Bora. 2012. Summarizing public opinions in tweets. In Proceedings of CICLing-2012.
Samuel Brody and Nicholas Diakopoulos. 2011. Cooooooooooooooollllllllllllll!!!!!!!!!!!!!!: using word lengthening to detect sentiment in microblogs. In Proceedings of EMNLP-2011.
P. F. Brown, V. J. Della Pietra, P. V. deSouza, J. C. Lai, and R. L. Mercer. 1992. Class-based n-gram models of natural language. Computational Linguistics, 18(4):467–479.
Pedro Henrique Calais Guerra, Adriano Veloso, Wagner Meira Jr, and Virgílio Almeida. 2011. From bias to opinion: a transfer-learning approach to real-time sentiment analysis. In Proceedings of KDD-2011.
Chris Callison-Burch and Mark Dredze. 2010. Creating speech and language data with amazon’s mechanical turk. In Proceedings of the NAACL:HLT Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk.
Lu Chen, Wenbo Wang, Meenakshi Nagarajan, Shaojun Wang, and Amit P. Sheth. 2012. Extracting diverse sentiment expressions with target-dependent polarity from twitter. In Proceedings of ICWSM-2012.
Yejin Choi, Eric Breck, and Claire Cardie. 2006. Joint extraction of entities and relations for opinion recognition. In Proceedings of EMNLP-2006.
G. N. Clements. 1990. The role of the sonority cycle in core syllabification. In J. Kingston and M. Beckman, editors, Papers in Laboratory Phonology, pages 283–333. CUP, Cambridge.
Dmitry Davidov, Oren Tsur, and Ari Rappoport. 2010. Enhanced sentiment learning using Twitter hashtags and smileys. In Proceedings of Coling: Posters.
Nicholas A Diakopoulos and David A Shamma. 2010. Characterizing debate performance via aggregated twitter sentiment. In Proceedings of CHI-2010.
David Etter, Francis Ferraro, Ryan Cotterell, Olivia Buzek, and Benjamin Van Durme. 2013. Nerit: Named entity recognition for informal text. Technical Report 11, Human Language Technology Center of Excellence, Johns Hopkins University, July.
Jenny Rose Finkel and Christopher D. Manning. 2010. Hierarchical joint learning: Improving joint parsing and named entity recognition with non-jointly labeled data. In Proceedings of ACL-2010.
Joan B. Hooper. 1976. The syllable in phonological theory. Language, 48(3):525–540.
Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of KDD.
Xia Hu, Lei Tang, Jiliang Tang, and Huan Liu. 2013. Exploiting social relations for sentiment analysis in microblogging. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining (WSDM-2013).
Niklas Jakob and Iryna Gurevych. 2010. Extracting opinion targets in a single- and cross-domain setting with conditional random fields. In Proceedings of EMNLP.
Long Jiang, Mo Yu, Xiaohua Liu, and Tiejun Zhao. 2011. Target-dependent twitter sentiment classification. In Proceedings of ACL-2011.
Wei Jin and Hung Hay Ho. 2009. A novel lexicalized hmm-based learning framework for web opinion mining. In Proceedings of ICML-2009.
Soo-Min Kim and Eduard Hovy. 2006. Identifying and analyzing judgment opinions. In Proceedings of NAACL-2006.
Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple semi-supervised dependency parsing. In Proceedings of ACL/HLT.
Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private traits and attributes are predictable from digital records of human behavior. Proc. of the National Academy of Sciences of the USA, 110(5).
Efthymios Kouloumpis, Theresa Wilson, and Johanna Moore. 2011. Twitter sentiment analysis: The good the bad and the OMG! In Proceedings of ICWSM-2011.
J. Lafferty, A. McCallum, and F. Pereira. 2001. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML-2001.
Fangtao Li, Chao Han, Minlie Huang, Xiaoyan Zhu, Ying-Ju Xia, Shu Zhang, and Hao Yu. 2010a. Structure-aware review mining and summarization. In Proceedings of Coling-2010.
Guangxia Li, Steven CH Hoi, Kuiyu Chang, and Ramesh Jain. 2010b. Micro-blogging sentiment detection by collaborative online learning. In Proceedings of ICDM-2010.
Hao Li, Yu Chen, Heng Ji, Smaranda Muresan, and Dequan Zheng. 2012. Combining social cognitive theories with linguistic features for multi-genre sentiment analysis. In Proceedings of the Pacific Asia Conference on Language, Information and Computation (PACLIC-2012).
Andrew McCallum and Wei Li. 2003. Early results for named entity recognition with conditional random fields, feature induction, and web-enhanced lexicons. In Proceedings of CoNLL-2003.
Frida Morelli. 2003. The relative harmony of /s+stop/ onsets: Obstruent clusters and the sonority sequencing principle. In C. Fery and R. van de Vijver, editors, The syllable in optimality theory, pages 356–371. CUP, New York.
Alexander Pak and Patrick Paroubek. 2010. Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of LREC-2010.
James W. Pennebaker, Roger J. Booth, and Martha E. Francis. 2007. Linguistic inquiry and word count: Liwc2007, operator’s manual.
Veronica Perez-Rosas, Carmen Banea, and Rada Mihalcea. 2012. Learning sentiment lexicons in spanish. In Proceedings of the Conference on Language Resources and Evaluations (LREC 2012).
Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein. 2006. Learning accurate, compact, and interpretable tree annotation. In Proceedings of Coling:ACL-2006.
Ana-Maria Popescu and Oren Etzioni. 2005. Extracting product features and opinions from reviews. In Proceedings of HLT:EMNLP-2005.
Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. 2011. Opinion word expansion and target extraction through double propagation. Computational Linguistics, 37(1).
Hassan Saif, Yulan He, and Harith Alani. 2012. Alleviating data sparsity for twitter sentiment analysis. In Proceedings of the WWW Workshop on Making Sense of Microposts (#MSM2012).
Michael Speriosu, Nikita Sudan, Sid Upadhyay, and Jason Baldridge. 2011. Twitter polarity classification with label propagation over lexical links and the follower graph. In Proceedings of the EMNLP-2011 Workshop on Unsupervised Learning in NLP.
Veselin Stoyanov and Jason Eisner. 2012. Minimum-risk training of approximate crf-based nlp systems. In Proceedings of NAACL:HLT-2012.
Veselin Stoyanov, Alexander Ropson, and Jason Eisner. 2011. Empirical risk minimization of graphical model parameters given approximate inference, decoding, and model structure. In AIStats.
Chenhao Tan, Lillian Lee, Jie Tang, Long Jiang, Ming Zhou, and Ping Li. 2011. User-level sentiment analysis incorporating social networks. In Proceedings of KDD-2011.
Benjamin Van Durme. 2012. Jerboa: A toolkit for randomized and streaming algorithms. Technical report, Human Language Technology Center of Excellence, Johns Hopkins University.
Svitlana Volkova, Theresa Wilson, and David Yarowsky. 2013. Exploring sentiment in social media: Bootstrapping subjectivity clues from multilingual twitter streams. In Association for Computational Linguistics (ACL).
Xiaolong Wang, Furu Wei, Xiaohua Liu, Ming Zhou, and Ming Zhang. 2011. Topic sentiment analysis in Twitter: A graph-based hashtag sentiment classification approach. In Proceedings of CIKM-2011.
T. Wilson, J. Wiebe, and P. Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of HLT-EMNLP.
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3).
Bishan Yang and Claire Cardie. 2013. Joint inference for fine-grained opinion extraction. In Proceedings of ACL-2013.
Jeonghee Yi, Tetsuya Nasukawa, Razvan Bunescu, and Wayne Niblack. 2003. Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques. In Proceedings of ICDM-2003.