2020 | OriginalPaper | Chapter

Automatic Annotation of Deceptive Online Reviews Using Topic Modelling

Authors : R. N. Pramukha, P. S. Venugopala

Published in: Modeling, Machine Learning and Astronomy

Publisher: Springer Singapore

Abstract

The increasing popularity of e-commerce websites and online review platforms has led to the rise of review spammers, which in turn has created many problems in both business and academia. One of the major challenges in this field is the annotation of deceptive reviews. To date, different approaches have been employed to create labelled datasets for classification tasks. Many of these works follow a general approach and do not focus on any particular property of deceptive reviews. We believe that a fine-grained approach is more suitable for such a complex problem. This paper focuses on a single property of deceptive reviews: the out-of-context property. We first find the minimum review length required to obtain coherent topics. We then propose a methodology for scoring and labelling the reviews and evaluate it by training different classifiers. We obtain an F-measure of 93.64 using labelled reviews obtained through the proposed methodology.
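For readers unfamiliar with the reported metric, the F-measure is the harmonic mean of precision and recall. The following is a minimal sketch of how it is computed from predicted and true labels; it is not the authors' evaluation code. The function name `f_measure`, the label encoding (1 = deceptive, 0 = truthful), and the toy label lists are assumptions made for illustration only, and the reported value of 93.64 is presumably expressed on a percentage scale.

```python
def f_measure(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels (not from the paper): 1 = deceptive, 0 = truthful.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = f_measure(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy example there are three true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 0.75.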



Title: Automatic Annotation of Deceptive Online Reviews Using Topic Modelling
Authors: R. N. Pramukha, P. S. Venugopala
Publisher: Springer Singapore
Book: Modeling, Machine Learning and Astronomy
Print ISBN: 978-981-336-462-2

Electronic ISBN: 978-981-336-463-9

Copyright Year: 2020
DOI: https://doi.org/10.1007/978-981-33-6463-9_4
