acl acl2011 acl2011-133 acl2011-133-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Philip Bramsen ; Martha Escobar-Molano ; Ami Patel ; Rafael Alonso
Abstract: Sociolinguists have long argued that social context influences language use in all manner of ways, resulting in lects. This paper explores a text classification problem we will call lect modeling, an example of what has been termed computational sociolinguistics. In particular, we use machine learning techniques to identify social power relationships between members of a social network, based purely on the content of their interpersonal communication. We rely on statistical methods, as opposed to language-specific engineering, to extract features which represent vocabulary and grammar usage indicative of social power lect. We then apply support vector machines to model the social power lects representing superior-subordinate communication in the Enron email corpus. Our results validate the treatment of lect modeling as a text classification problem – albeit a hard one – and constitute a case for future research in computational sociolinguistics.
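The pipeline described in the abstract (statistically extracted features fed to a support vector machine) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the features are plain word unigram and bigram counts, the classifier is a linear SVM trained by simple subgradient descent on the hinge loss (a stand-in, not the solver the authors used), and the four messages and their labels are invented toy data, not Enron emails. The names `ngram_features`, `train_linear_svm`, `predict`, and `TOY_MESSAGES` are hypothetical.

```python
from collections import Counter


def ngram_features(text, max_n=2):
    """Count word n-grams (n = 1..max_n) in lower-cased, whitespace-split text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def score(weights, bias, feats):
    """Linear decision value w.x + b over a sparse feature dictionary."""
    return bias + sum(weights.get(f, 0.0) * v for f, v in feats.items())


def train_linear_svm(examples, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss.

    Assumption: this simple trainer replaces a real SVM solver for illustration.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, y in examples:
            margin = y * score(weights, bias, feats)
            # Regularization step: shrink every weight slightly.
            for f in weights:
                weights[f] *= 1.0 - lr * lam
            # Hinge-loss step: move toward the example if it is inside the margin.
            if margin < 1.0:
                for f, v in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias


def predict(weights, bias, text):
    """Return +1 (superior to subordinate) or -1 (subordinate to superior)."""
    return 1 if score(weights, bias, ngram_features(text)) >= 0 else -1


# Invented toy messages; +1 = written downward, -1 = written upward in a hierarchy.
TOY_MESSAGES = [
    ("please send me the report by friday", 1),
    ("i need this done today", 1),
    ("i was wondering if you could review this", -1),
    ("would it be possible to meet today", -1),
]

examples = [(ngram_features(text), label) for text, label in TOY_MESSAGES]
weights, bias = train_linear_svm(examples)

for text, label in TOY_MESSAGES:
    print(label, predict(weights, bias, text), text)
```

The sparse dictionary representation keeps the sketch dependency-free. A realistic experiment would need a held-out test set and far more data than this toy example provides.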
Cecilia Ovesdotter Alm, Dan Roth and Richard Sproat. 2005. Emotions from text: machine learning for text-based emotion prediction. HLT/EMNLP 2005. October 6-8, 2005, Vancouver.
Penelope Brown and Stephen C. Levinson. 1987. Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
Eric Breck, Yejin Choi and Claire Cardie. 2007. Identifying expressions of opinion in context. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-2007).
CALO Project. 2009. Enron E-Mail Dataset. http://www.cs.cmu.edu/~enron/
Yejin Choi and Claire Cardie. 2008. Learning with compositional semantics as structural inference for subsentential sentiment analysis. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Honolulu, Hawaii: ACM. 793-801.
Yejin Choi and Claire Cardie. 2009. Adapting a polarity lexicon using integer linear programming for domain-specific sentiment classification. Empirical Methods in Natural Language Processing (EMNLP).
Christopher P. Diehl, Galileo Namata, and Lise Getoor. 2007. Relationship identification for social network discovery. AAAI '07: Proceedings of the 22nd National Conference on Artificial Intelligence.
Bonnie Erickson, et al. 1978. Speech style and impression formation in a court setting: The effects of 'powerful' and 'powerless' speech. Journal of Experimental Social Psychology 14: 266-79.
Norman Fairclough. 1989. Language and power. London: Longman.
Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, and Ian H. Witten. 2009. The WEKA data mining software: An update. SIGKDD Explorations 11(1).
JHU Center for Imaging Science. 2005. Scan Statistics on Enron Graphs. http://cis.jhu.edu/~parky/Enron/
Soo-min Kim and Eduard Hovy. 2004. Determining the Sentiment of Opinions. Proceedings of the COLING Conference. Geneva, Switzerland.
Francois Mairesse and Marilyn Walker. 2006. Automatic recognition of personality in conversation. Proceedings of HLT-NAACL. New York City, New York.
Galileo Mark S. Namata Jr., Lise Getoor, and Christopher P. Diehl. 2006. Inferring organizational titles in online communication. ICML 2006, 179-181.
Andrew McCallum, Xuerui Wang, and Andres Corrada-Emmanuel. 2007. Topic and role discovery in social networks with experiments on Enron and academic email. Journal of Artificial Intelligence Research 29.
Ryan McDonald, Kerry Hannan, Tyler Neylon, Mike Wells, and Jeff Reynar. 2007. Structured models for fine-to-coarse sentiment analysis. Proceedings of the ACL.
David Morand. 2000. Language and power: An empirical analysis of linguistic strategies used in superior/subordinate communication. Journal of Organizational Behavior, 21:235-248.
Frederick Mosteller and David L. Wallace. 1964. Inference and disputed authorship: The Federalist. Addison-Wesley, Reading, Mass.
Jon Oberlander and Scott Nowson. 2006. Whose thumb is it anyway? Classifying author personality from weblog text. Proceedings of CoLing/ACL. Sydney, Australia.
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. Proceedings of EMNLP, 79-86.
Bo Pang and Lillian Lee. 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. Proceedings of the ACL.
John Platt. 1998. Sequential minimal optimization: A fast algorithm for training support vector machines. Technical Report MSR-TR-98-14. Microsoft Research.
Delip Rao and Deepak Ravichandran. 2009. Semi-supervised polarity lexicon induction. European Chapter of the Association for Computational Linguistics.
Efstathios Stamatatos. 2009. A survey of modern authorship attribution methods. JASIST 60(3): 538-556.
Carlo Strapparava and Rada Mihalcea. 2008. Learning to identify emotions in text. SAC 2008: 1556-1560.
Hiroya Takamura, Takashi Inui, and Manabu Okumura. 2005. Semantic Orientations of Words using Spin Model. Annual Meeting of the Association for Computational Linguistics.
Ian H. Witten and Eibe Frank. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.
Xiaojin Zhu. 2005. Semi-supervised learning literature survey. Technical Report 1530, Department of Computer Sciences, University of Wisconsin, Madison.