
2019 | OriginalPaper | Book chapter

Review Spam Detection Using Word Embeddings and Deep Neural Networks

Authors: Aliaksandr Barushka, Petr Hajek

Published in: Artificial Intelligence Applications and Innovations

Publisher: Springer International Publishing


Abstract

Review spam (fake review) detection is increasingly important given the rapid growth of internet purchases, so sophisticated spam filters must be designed to tackle the problem. Traditional machine learning algorithms use review content and other features to detect review spam. However, as demonstrated in related studies, the linguistic context of words may be of particular importance for text categorization. To enhance the performance of review spam detection, we propose a novel content-based approach that considers both bag-of-words and word context. More precisely, our approach uses n-grams and the skip-gram word embedding method to build a vector model, which yields a high-dimensional feature representation. To handle this representation and classify review spam accurately, a deep feed-forward neural network is used in the second step. To verify our approach, we use two hotel review datasets, including positive and negative reviews. We show that the proposed detection system outperforms other popular algorithms for review spam detection in terms of accuracy and area under the ROC curve. Importantly, the system provides balanced performance on both classes, legitimate and spam, irrespective of review polarity.
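
The two-step pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state how n-gram and embedding features are combined, so concatenating n-gram counts with the mean word vector is an assumption, as are all function names, the toy reviews, the embedding size, and the layer sizes. Random vectors stand in for embeddings that skip-gram training would learn, and the network is untrained, so the scores only show the data flow.

```python
from collections import Counter
import math
import random


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgram_pairs(tokens, window=2):
    """(center, context) pairs of the kind the skip-gram objective trains on."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        pairs.extend((center, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs


def build_vocab(docs, max_n=2):
    """Assign an index to every 1..max_n-gram seen in the corpus."""
    vocab = {}
    for tokens in docs:
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                vocab.setdefault(gram, len(vocab))
    return vocab


def featurize(tokens, vocab, embeddings, dim, max_n=2):
    """Concatenate n-gram counts with the mean word vector (an assumption)."""
    counts = Counter(g for n in range(1, max_n + 1) for g in ngrams(tokens, n))
    bow = [0.0] * len(vocab)
    for gram, count in counts.items():
        if gram in vocab:
            bow[vocab[gram]] = float(count)
    known = [embeddings[t] for t in tokens if t in embeddings]
    mean = [sum(v[k] for v in known) / len(known) if known else 0.0
            for k in range(dim)]
    return bow + mean


def forward(x, layers):
    """Feed-forward pass: ReLU hidden layers, sigmoid output unit."""
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if idx == len(layers) - 1:
            x = [1.0 / (1.0 + math.exp(-v)) for v in z]
        else:
            x = [max(0.0, v) for v in z]
    return x[0]


# Toy reviews, invented for illustration only.
docs = [
    "the room was clean and quiet".split(),
    "best hotel ever best stay ever".split(),
]
DIM = 4  # hypothetical embedding size
rng = random.Random(0)
vocab = build_vocab(docs)
# Placeholder vectors; a real system would train them with skip-gram.
embeddings = {t: [rng.uniform(-1, 1) for _ in range(DIM)]
              for d in docs for t in d}
features = [featurize(d, vocab, embeddings, DIM) for d in docs]


def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.gauss(0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


layers = [init_layer(len(features[0]), 8), init_layer(8, 4), init_layer(4, 1)]
scores = [forward(f, layers) for f in features]  # untrained spam scores
```

In practice the embeddings and the network weights would be learned from labeled reviews, and the resulting scores would be evaluated with accuracy and area under the ROC curve, the two measures named in the abstract.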

Metadata
Title
Review Spam Detection Using Word Embeddings and Deep Neural Networks
Authors
Aliaksandr Barushka
Petr Hajek
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-19823-7_28