A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media

  • Conference paper
Complex Networks and Their Applications VIII (COMPLEX NETWORKS 2019)

Part of the book series: Studies in Computational Intelligence (SCI, volume 881)

Abstract

Hateful and toxic content generated by a portion of social media users is a rising phenomenon that has motivated researchers to dedicate substantial effort to the challenging task of hateful content identification. We need not only an efficient automatic hate speech detection model based on advanced machine learning and natural language processing, but also a sufficiently large amount of annotated data to train such a model. The lack of sufficient labelled hate speech data, along with existing biases, has been the main issue in this research domain. To address these needs, in this study we introduce a novel transfer learning approach based on an existing pre-trained language model called BERT (Bidirectional Encoder Representations from Transformers). More specifically, we investigate the ability of BERT to capture hateful context within social media content by using new fine-tuning methods based on transfer learning. To evaluate our proposed approach, we use two publicly available datasets that have been annotated for racism, sexism, hate, or offensive content on Twitter. The results show that our solution obtains considerable performance on these datasets in terms of precision and recall in comparison to existing approaches. Our model can also capture some of the biases in the data annotation and collection process, which can potentially lead us to a more accurate model.
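
As a small illustration of the evaluation criteria named in the abstract, the sketch below computes per-class precision and recall for a multi-class labelling task. It is a minimal example under stated assumptions and is not the authors' evaluation code: the function name `per_class_precision_recall`, the label names, and the toy label lists are all hypothetical and do not come from the paper or its datasets.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[gold] += 1  # gold label was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]  # times the label was predicted
        actual = tp[label] + fn[label]     # times the label was the gold answer
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical toy labels for illustration only, not data from the paper.
gold = ["racism", "sexism", "neither", "neither", "sexism", "racism"]
pred = ["racism", "neither", "neither", "sexism", "sexism", "neither"]
scores = per_class_precision_recall(gold, pred)

for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Reporting the two measures per class, rather than a single accuracy figure, is useful when classes are imbalanced, which is common in annotated hate speech corpora.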

Notes

  1. Anti-muslim hate crime surges after Manchester and London Bridge attacks (2017): https://www.theguardian.com.

  2. A.: Hate on the rise after Trump’s election: http://www.newyorker.com.

  3. https://sites.google.com/view/alw3/home.

  4. https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/.

Author information

Corresponding author

Correspondence to Marzieh Mozafari.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Mozafari, M., Farahbakhsh, R., Crespi, N. (2020). A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media. In: Cherifi, H., Gaito, S., Mendes, J., Moro, E., Rocha, L. (eds) Complex Networks and Their Applications VIII. COMPLEX NETWORKS 2019. Studies in Computational Intelligence, vol 881. Springer, Cham. https://doi.org/10.1007/978-3-030-36687-2_77
