Source: pdf
Author: Mohamed Aly ; Amir Atiya
Abstract: We introduce LABR, the largest sentiment analysis dataset to date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the dataset and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and rating classification. We provide standard splits of the dataset into training and testing sets, for both polarity and rating classification, in both balanced and unbalanced settings. We run baseline experiments on the dataset to establish a benchmark.
References:
Ahmed Abbasi, Hsinchun Chen, and Arab Salem. 2008. Sentiment analysis in multiple languages: Feature selection for opinion classification in web forums. ACM Transactions on Information Systems (TOIS).
Muhammad Abdul-Mageed and Mona Diab. 2012a. Awatif: A multi-genre corpus for Modern Standard Arabic subjectivity and sentiment analysis. In Proceedings of the Eighth International Conference on Language Resources and Evaluation.
Muhammad Abdul-Mageed and Mona Diab. 2012b. Toward building a large-scale Arabic sentiment lexicon. In Proceedings of the 6th International Global WordNet Conference.
Muhammad Abdul-Mageed, Mona Diab, and Mohammed Korayem. 2011. Subjectivity and sentiment analysis of Modern Standard Arabic. In 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
Muhammad Abdul-Mageed, Sandra Kübler, and Mona Diab. 2012. SAMAR: A system for subjectivity and sentiment analysis of Arabic social media. In Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis.
Mohamed Elarnaoty, Samir AbdelRahman, and Aly Fahmy. 2012. A machine learning approach for opinion holder extraction in Arabic language. arXiv preprint arXiv:1206.1011.
Mohammed Korayem, David Crandall, and Muhammad Abdul-Mageed. 2012. Subjectivity and sentiment analysis of Arabic: A survey. In Advanced Machine Learning Technologies and Applications.
Bing Liu. 2010. Sentiment analysis and subjectivity. Handbook of Natural Language Processing.
Christopher D. Manning and Hinrich Schütze. 2000. Foundations of Statistical Natural Language Processing. MIT Press.
Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2:1–135.
B. Pang, L. Lee, and S. Vaithyanathan. 2002. Thumbs up?: Sentiment classification using machine learning techniques. In EMNLP.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825–2830.
M. Rushdi-Saleh, M. Martín-Valdivia, L. Ureña-López, and J. Perea-Ortega. 2011a. Bilingual experiments with an Arabic-English corpus for opinion mining. In Proceedings of Recent Advances in Natural Language Processing (RANLP).
M. Rushdi-Saleh, M. Martín-Valdivia, L. Ureña-López, and J. Perea-Ortega. 2011b. OCA: Opinion corpus for Arabic. Journal of the American Society for Information Science and Technology.