
281 acl-2011-Sentiment Analysis of Citations using Sentence Structure-Based Features


Source: pdf

Author: Awais Athar

Abstract: Sentiment analysis of citations in scientific papers and articles is a new and interesting problem due to the many linguistic differences between scientific texts and other genres. In this paper, we focus on the problem of automatic identification of positive and negative sentiment polarity in citations to scientific papers. Using a newly constructed annotated citation sentiment corpus, we explore the effectiveness of existing and novel features, including n-grams, specialised science-specific lexical features, dependency relations, sentence splitting and negation features. Our results show that 3-grams and dependencies perform best in this task; they outperform the sentence-splitting, science-lexicon and negation-based features.
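
The n-gram features mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes simple lowercasing and whitespace tokenization, which the abstract does not specify; the function name `ngram_features` and the example sentence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngram_features(sentence, max_n=3):
    """Count every word n-gram of length 1..max_n in a sentence.

    Assumption: lowercasing plus whitespace tokenization stands in for
    whatever preprocessing the paper actually used.
    """
    tokens = sentence.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the sentence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical citation sentence, used only for illustration.
features = ngram_features("this approach does not perform well")
```

A trigram such as "does not perform" keeps the negation next to the word it modifies, which a bag of unigrams would lose. The resulting counts could be passed to any classifier as a sparse feature vector.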


reference text

S. Bird, R. Dale, B.J. Dorr, B. Gibson, M.T. Joseph, M.Y. Kan, D. Lee, B. Powley, D.R. Radev, and Y.F. Tan. 2008. The ACL anthology reference corpus: A reference dataset for bibliographic research in computational linguistics. In Proc. of the 6th International Conference on Language Resources and Evaluation Conference (LREC08), pages 1755–1759. Citeseer.
J. Blitzer, M. Dredze, and F. Pereira. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In ACL, volume 45, page 440.
S. Bonzi. 1982. Characteristics of a literature as predictors of relatedness between cited and citing works. Journal of the American Society for Information Science, 33(4):208–216.
C.C. Chang and C.J. Lin. 2001. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
C. Cortes and V. Vapnik. 1995. Support-vector networks. Machine learning, 20(3):273–297.
I.G. Councill, R. McDonald, and L. Velikovich. 2010. What’s great and what’s not: learning to classify the scope of negation for improved sentiment analysis. In Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, pages 51–59. Association for Computational Linguistics.
K. Dave, S. Lawrence, and D.M. Pennock. 2003. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proceedings of the 12th international conference on World Wide Web, pages 519–528. ACM.
M.C. de Marneffe and C.D. Manning. 2008. The Stanford typed dependencies representation. In COLING, pages 1–8. Association for Computational Linguistics.
Y. EL-Manzalawy and V. Honavar. 2005. WLSVM: Integrating LibSVM into Weka Environment. Software available at http://www.cs.iastate.edu/~yasser/wlsvm.
C. Engström. 2004. Topic dependence in sentiment classification. Unpublished MPhil Dissertation. University of Cambridge.
M. Gamon and A. Aue. 2005. Automatic identification of sentiment vocabulary: exploiting low association with known sentiment terms. In Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in Natural Language Processing, pages 57–64. Association for Computational Linguistics.
M. Garzone and R. Mercer. 2000. Towards an automated citation classifier. Advances in Artificial Intelligence, pages 337–346.
D. Hall, D. Jurafsky, and C.D. Manning. 2008. Studying the history of ideas using topic models. In EMNLP, pages 363–371.
V. Hatzivassiloglou and K.R. McKeown. 1997. Predicting the semantic orientation of adjectives. In Proceedings of EACL, pages 174–181. Association for Computational Linguistics.
J.E. Hirsch. 2005. An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 102(46):16569.
K. Hyland. 1995. The Author in the Text: Hedging Scientific Writing. Hong Kong papers in linguistics and language teaching, 18:11.
M. Joshi and C. Penstein-Rosé. 2009. Generalizing dependency features for opinion mining. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 313–316. Association for Computational Linguistics.
J.S. Justeson and S.M. Katz. 1995. Technical terminology: some linguistic properties and an algorithm for identification in text. Natural language engineering, 1(01):9–27.
S. Khan. 2007. Negation and Antonymy in Sentiment Classification. Ph.D. thesis, Computer Lab, University of Cambridge.
D.D. Lewis. 1991. Evaluating text categorization. In Proceedings of Speech and Natural Language Workshop, pages 312–318.
M.H. MacRoberts and B.R. MacRoberts. 1984. The negational reference: Or the art of dissembling. Social Studies of Science, 14(1):91–94.
T. Nakagawa, K. Inui, and S. Kurohashi. 2010. Dependency tree-based sentiment classification using CRFs with hidden variables. In NAACL HLT, pages 786–794. Association for Computational Linguistics.
H. Nanba and M. Okumura. 1999. Towards multi-paper summarization using reference information. In IJCAI, volume 16, pages 926–931. Citeseer.
V. Ng, S. Dasgupta, and S.M. Arifin. 2006. Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews. In Proceedings of the COLING/ACL on Main conference poster sessions, pages 611–618. Association for Computational Linguistics.
B. Pang, L. Lee, and S. Vaithyanathan. 2002. Thumbs up?: sentiment classification using machine learning techniques. In EMNLP, pages 79–86. Association for Computational Linguistics.
S. Piao, S. Ananiadou, Y. Tsuruoka, Y. Sasaki, and J. McNaught. 2007. Mining opinion polarity relations of citations. In International Workshop on Computational Semantics (IWCS), pages 366–371. Citeseer.
L. Polanyi and A. Zaenen. 2006. Contextual valence shifters. Computing attitude and affect in text: Theory and applications, pages 1–10.
D.R. Radev, M.T. Joseph, B. Gibson, and P. Muthukrishnan. 2009. A Bibliometric and Network Analysis of the field of Computational Linguistics. Journal of the American Society for Information Science and Technology, 1001:48109–1092.
A. Ritchie, S. Robertson, and S. Teufel. 2008. Comparing citation contexts for information retrieval. In Proceeding of the 17th ACM Conference on Information and Knowledge Management, pages 213–222. ACM.
I. Spiegel-Rösing. 1977. Science studies: Bibliometric and content analysis. Social Studies of Science, 7(1):97–113.
O. Täckström and R. McDonald. 2011. Discovering fine-grained sentiment with latent variable structured prediction models. In Proceedings of the ECIR.
S. Teufel, A. Siddharthan, and D. Tidhar. 2006. Automatic classification of citation function. In EMNLP, pages 103–110. Association for Computational Linguistics.
G. Thompson and Y. Yiyun. 1991. Evaluation in the reporting verbs used in academic papers. Applied linguistics, 12(4):365.
P.D. Turney. 2002. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pages 417–424. Association for Computational Linguistics.
W.J. Wilbur, A. Rzhetsky, and H. Shatkay. 2006. New directions in biomedical text annotation: definitions, guidelines and corpus construction. BMC bioinformatics, 7(1):356.
T. Wilson, J. Wiebe, and R. Hwa. 2004. Just how mad are you? Finding strong and weak opinion clauses. In Proceedings of the National Conference on Artificial Intelligence, pages 761–769. Menlo Park, CA; Cambridge, MA; London; AAAI Press; MIT Press; 1999.
T. Wilson, J. Wiebe, and P. Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In EMNLP, pages 347–354. Association for Computational Linguistics.
T. Wilson, J. Wiebe, and P. Hoffmann. 2009. Recognizing Contextual Polarity: an exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3):399–433.
A. Yessenalina, Y. Yue, and C. Cardie. 2010. Multilevel structured models for document-level sentiment classification. In Proceedings of EMNLP, pages 1046–1056, Cambridge, MA, October. Association for Computational Linguistics.
H. Yu and V. Hatzivassiloglou. 2003. Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In Proceedings of EMNLP, pages 129–136. Association for Computational Linguistics.
J.M. Ziman. 1968. Public Knowledge: An essay concerning the social dimension of science. Cambridge Univ. Press, College Station, Texas.