ABSTRACT
User-generated content is becoming increasingly valuable to both individuals and businesses due to its usefulness and influence in e-commerce markets. As consumers rely more on such information, deceptive opinions posted deliberately for potential profit are a growing problem. Existing work on opinion spam detection focuses mainly on linguistic features such as n-grams, syntactic patterns, or LIWC. However, deep semantic analysis remains largely unstudied. In this paper, we propose a frame-based deep semantic analysis method for understanding the rich characteristics of deceptive and truthful opinions written by various types of individuals, including crowdsourcing workers, employees who have expert-level domain knowledge about local businesses, and online users who post on Yelp and TripAdvisor. Using our proposed semantic frame feature, we developed a classification model that outperforms the baseline model and achieves an accuracy of nearly 91%. We also performed a qualitative analysis of deceptive and truthful review datasets and examined their semantic differences. Finally, we identified features that existing methods were unable to detect.