2020 | OriginalPaper | Chapter

Automatic Annotation of Deceptive Online Reviews Using Topic Modelling

Authors: R. N. Pramukha, P. S. Venugopala

Published in: Modeling, Machine Learning and Astronomy

Publisher: Springer Singapore

Abstract

The increasing popularity of e-commerce websites and online review platforms has unfortunately led to the advent of review spammers. This has, in turn, created many problems for both business and academia. One of the major challenges in this field is the annotation of deceptive reviews. To date, different approaches have been employed to create labelled datasets for classification tasks. Many of these works follow a general approach and do not focus on any particular property of deceptive reviews. We believe that a fine-grained approach would be more suitable for such a complex problem. This paper focuses on a single property of deceptive reviews: the out-of-context property. We first find the minimum review length required to obtain coherent topics. We then propose a methodology for scoring and labelling the reviews and evaluate it by training different classifiers. We obtain an F-measure of 93.64 using labelled reviews obtained through the proposed methodology.
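For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F-measure can be computed from classifier confusion counts. It is not taken from the chapter: the function name `f_measure`, the `beta` parameter, and the example counts are illustrative assumptions, and the abstract does not state which F-measure variant or averaging scheme the authors used. The reported 93.64 is presumably on a 0 to 100 scale, whereas this sketch returns a value between 0 and 1.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """Return the F-beta score in [0, 1] from confusion counts.

    tp, fp, fn are true positives, false positives and false negatives
    for the class of interest (here, hypothetically, "deceptive").
    beta = 1.0 gives the balanced F1 score (an assumption, since the
    abstract does not name the variant).
    """
    # Precision: fraction of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not results from the chapter.
    score = f_measure(tp=90, fp=10, fn=5)
    print(f"F1 = {100 * score:.2f}")
```

With `beta` left at 1.0, precision and recall are weighted equally; a larger `beta` would weight recall more heavily, which may matter when missing a deceptive review is costlier than flagging a truthful one.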

Metadata

Title: Automatic Annotation of Deceptive Online Reviews Using Topic Modelling
Authors: R. N. Pramukha, P. S. Venugopala
Copyright Year: 2020
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-33-6463-9_4
