Published in: Multimedia Systems 6/2022

11.11.2020 | Special Issue Paper

CyberBERT: BERT for cyberbullying identification

Authors: Sayanta Paul, Sriparna Saha

Abstract

Cyberbullying can be defined as a deliberate and repeated act of aggression carried out via social media platforms such as Facebook, Twitter, and Instagram. BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art pre-trained language model, has achieved remarkable results in many language understanding tasks. In this paper, we present a novel application of BERT to cyberbullying identification. A straightforward classification model built on BERT achieves state-of-the-art results across three real-world corpora: Formspring (approximately 12k posts), Twitter (approximately 16k posts), and Wikipedia (approximately 100k posts). Experimental results demonstrate that our proposed model achieves significant improvements over existing work, including slot-gated and attention-based deep neural network models.

Metadata
Title
CyberBERT: BERT for cyberbullying identification
Authors
Sayanta Paul
Sriparna Saha
Publication date
11.11.2020
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 6/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-020-00710-4
