emnlp emnlp2013 emnlp2013-94 emnlp2013-94-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Jiwei Li ; Myle Ott ; Claire Cardie
Abstract: Recent work has developed supervised methods for detecting deceptive opinion spam: fake reviews written to sound authentic and deliberately mislead readers. Whereas past work has focused on identifying individual fake reviews, this paper aims to identify offerings (e.g., hotels) that contain fake reviews. We introduce a semi-supervised manifold ranking algorithm for this task, which relies on a small set of labeled individual reviews for training. Then, in the absence of gold-standard labels at the offering level, we introduce a novel evaluation procedure that ranks artificial instances of real offerings, where each artificial offering contains a known number of injected deceptive reviews. Experiments on a novel dataset of hotel reviews show that the proposed method outperforms state-of-the-art learning baselines.
David Blei, Andrew Ng and Michael Jordan. Latent Dirichlet allocation. In Journal of Machine Learning Research. 2003.
Carlos Castillo, Debora Donato, Luca Becchetti, Paolo Boldi, Stefano Leonardi, Massimo Santini and Sebastiano Vigna. A reference collection for web spam. In ACM SIGIR Forum. 2006.
Paul-Alexandru Chirita, Jörg Diederich and Wolfgang Nejdl. MailRank: using ranking for spam detection. In Proceedings of the 14th ACM international conference on Information and knowledge management. 2005.
Cone. 2011 Online Influence Trend Tracker. http://www.coneinc.com/negative-reviews-onlinereverse-purchase-decisions. August.
Yajuan Duan, Zhumin Chen, Furu Wei, Ming Zhou and Heung-Yeung Shum. Twitter Topic Summarization by Ranking Tweets Using Social Influence and Content Quality. In Proceedings of 24th International Conference on Computational Linguistics. 2012.
Federal Trade Commission. Guides Concerning Use of Endorsements and Testimonials in Advertising. In FTC 16 CFR Part 255. 2009.
Socialogue: Five Stars? Thumbs Up? A+ or Just Average? URL: http://www.ipsos-na.com/newspolls/pressrelease.aspx?id=5929g
Nitin Jindal and Bing Liu. Opinion spam and analysis. In Proceedings of the 2008 International Conference on Web Search and Data Mining. 2008.
Nitin Jindal, Bing Liu and Ee-Peng Lim. Finding Unusual Review Patterns Using Unexpected Rules. In Proceedings of the 19th ACM international conference on Information and knowledge management. 2010.
Thorsten Joachims. Making large-scale support vector machine learning practical. In Advances in kernel methods. 1999.
Fangtao Li, Minlie Huang, Yi Yang and Xiaoyan Zhu. Learning to identify review Spam. In Proceedings of the Twenty-Second international joint conference on Artificial Intelligence. 2011.
Jiwei Li, Claire Cardie and Sujian Li. TopicSpam: a Topic-Model-Based Approach for Spam Detection. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. 2013.
Peng Li, Jing Jiang and Yinglin Wang. Generating templates of entity summaries with an entity-aspect model and pattern mining. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. 2010.
Ee-Peng Lim, Viet-An Nguyen, Nitin Jindal, Bing Liu and Hady Wirawan Lauw. Detecting Product Review Spammers Using Rating Behavior. In Proceedings of the 19th ACM international conference on Information and knowledge management. 2010.
Tieyan Liu. Learning to Rank for Information Retrieval. In Foundations and Trends in Information Retrieval. 2009.
Arjun Mukherjee, Bing Liu and Natalie Glance. Spotting Fake Reviewer Groups in Consumer Reviews. In Proceedings of the 21st international conference on World Wide Web. 2012.
Juan Martinez-Romo and Lourdes Araujo. Web spam identification through language model analysis. In Proceedings of the 5th international workshop on adversarial information retrieval on the web. 2009.
Myle Ott, Claire Cardie and Jeffrey Hancock. Estimating the Prevalence of Deception in Online Review Communities. In Proceedings of the 21st international conference on World Wide Web. 2012.
Myle Ott, Yejin Choi, Claire Cardie and Jeffrey Hancock. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics. 2011.
Daniel Ramage, David Hall, Ramesh Nallapati and Christopher Manning. Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing. 2009.
Michal Rosen-Zvi, Thomas Griffith, Mark Steyvers and Padhraic Smyth. The author-topic model for authors and documents. In Proceedings of the 20th conference on Uncertainty in artificial intelligence. 2004.
Xiaojun Wan and Jianwu Yang. Multi-Document Summarization Using Cluster-Based Link Analysis. In Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval. 2008.
Xiaojun Wan, Jianwu Yang and Jianguo Xiao. Manifold-Ranking Based Topic-Focused Multi-Document Summarization. In Proceedings of International Joint Conferences on Artificial Intelligence. 2007.
Guan Wang, Sihong Xie, Bing Liu and Philip Yu. Review Graph based Online Store Review Spammer Detection. In Proceedings of International Conference of Data Mining. 2011.
Guangyu Wu, Derek Greene and Padraig Cunningham. Merging multiple criteria to identify suspicious reviews. In Proceedings of the fourth ACM conference on Recommender systems. 2011.
Kyung-Hyan Yoo and Ulrike Gretzel. Comparison of Deceptive and Truthful Travel Reviews. In Information and Communication Technologies in Tourism. 2009.
Dengyong Zhou, Olivier Bousquet, Thomas Navin and Jason Weston. Learning with local and global consistency. In Proceedings of Advances in neural information processing systems. 2003.
Dengyong Zhou, Jason Weston, Arthur Gretton and Olivier Bousquet. Ranking on data manifolds. In Proceedings of Advances in neural information processing systems. 2003.