Abstract
Consumers increasingly rely on online reviews to assist them in their buying decisions. With the rising popularity of e-commerce websites, hotel review platforms, and social media, online reviews have become an active research field in recent years. Online reviews affect people’s day-to-day decisions, so fake reviews harm both consumers and business organizations. Businesses, in turn, need to know how different types of consumers weigh the feedback that shapes their opinions. Automatically detecting fake reviews is difficult because their authors write them to read like genuine ones. Previous work has tackled fake-review identification in many domains, including food reviews and reviews of businesses such as restaurants and hotels. In this study, we propose a fully supervised approach to identifying opinion spammers in online reviews, using labeled data to classify reviews as real or fake. We implement several machine learning classification algorithms on two datasets (the Yelp hotel review dataset and the Yelp restaurant review dataset) and perform the classification on the feature-engineered data. Our experimental results show that logistic regression outperforms the other algorithms in most cases. We conclude that the study contributes to the existing literature by achieving better accuracy.
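To make the supervised setup concrete, the following is a minimal sketch of logistic regression trained on labeled reviews. It is not the authors' pipeline: the toy reviews, the length-normalised bag-of-words features, the learning rate, and every function name are assumptions made for illustration, and the engineered features and Yelp data used in the study are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the Yelp datasets): label 1 = fake, 0 = genuine.
TRAIN = [
    ("best hotel ever amazing amazing must visit", 1),
    ("absolutely perfect best best best place ever", 1),
    ("amazing amazing perfect stay must book now", 1),
    ("room was clean but the breakfast was average", 0),
    ("staff were friendly although check in was slow", 0),
    ("the location was fine but the room was small", 0),
]

def build_vocab(texts):
    """Map every distinct lower-cased token to a feature index."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def featurize(text, vocab):
    """Bag-of-words term frequencies, normalised by review length."""
    tokens = text.lower().split()
    vec = [0.0] * len(vocab)
    for word, count in Counter(tokens).items():
        if word in vocab:
            vec[vocab[word]] = count / len(tokens)
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=1.0, epochs=1000):
    """Batch gradient descent on the mean logistic (cross-entropy) loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one review: p(fake) minus the true label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(text, vocab, w, b):
    """Estimated probability that a review is fake."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

texts = [t for t, _ in TRAIN]
labels = [label for _, label in TRAIN]
vocab = build_vocab(texts)
X = [featurize(t, vocab) for t in texts]
weights, bias = train_logreg(X, labels)

# Usage: score an unseen review (a probability above 0.5 is read as "fake").
print(round(predict_proba("amazing amazing best ever", vocab, weights, bias), 3))
```

In practice a library implementation with regularisation and a held-out test split would replace the hand-written training loop; the sketch only shows how labeled reviews become feature vectors and how a linear classifier separates the two classes.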
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jain, P.K., Pamula, R. & Ansari, S. A Supervised Machine Learning Approach for the Credibility Assessment of User-Generated Content. Wireless Pers Commun 118, 2469–2485 (2021). https://doi.org/10.1007/s11277-021-08136-5
DOI: https://doi.org/10.1007/s11277-021-08136-5