
2018 | Original Paper | Book Chapter

3. Bridging the Gaps: Multi Task Learning for Domain Transfer of Hate Speech Detection

Multi-task Learning for Domain Transfer of Hate Speech Detection

Authors: Zeerak Talat, James Thorne, Joachim Bingel

Published in: Online Harassment

Publisher: Springer International Publishing


Abstract

Accurate detection of hate speech with supervised classification depends on human-annotated data. High agreement among annotators is difficult to attain, however, because the task is subjective and annotators differ in cultural, geographic, and social background. Furthermore, existing datasets capture only single types of hate speech, such as sexism or racism, or single demographics, such as people living in the United States, which reduces recall when classifying data that the training examples do not cover. As a result, end users of websites where hate speech may occur risk exposure to explicit content, because automatic detection systems fail to capture unseen forms of hate speech or hate speech directed at unseen groups.

In this paper, we investigate methods for bridging differences in the annotation and data collection of abusive-language tweets, such as different annotation schemes, labels, or geographic and cultural influences from data sampling. We consider three distinct sets of annotations, namely those provided by Talat (2016), Talat and Hovy (2016), and Davidson et al. (2017). Specifically, we train a machine learning model in a multi-task learning (MTL) framework, in which an auxiliary task is typically learned alongside a main task to improve performance on the latter. Our approach differs from most previous work in two respects: we aim to train a model that is robust across data originating from different distributions and labeled under differing annotation guidelines, and we treat these datasets as different learning objectives, in the way that classical work in multi-task learning treats different tasks. Here, we experiment with using fine-grained tags for annotation.

Using the predictions of our models and of the baseline models, we seek to show that distinct domains can be utilized for classification, and to show how cultural contexts influence classifier performance, since the datasets we use were collected either exclusively from the U.S. (Davidson et al. 2017) or globally with no geographic restriction (Talat 2016; Talat and Hovy 2016). Our choice of a multi-task learning setup is motivated by several factors. Most importantly, MTL allows us to share knowledge between two or more objectives, so that we can leverage information encoded in one dataset to better fit another. As shown by Bingel and Søgaard (2017) and Martínez Alonso and Plank (2017), this is particularly promising when the auxiliary task has a more coarse-grained set of labels than the main task. Another benefit of MTL is that it lets us learn lower-level representations from greater amounts of data than a single-task setup. Together with the known effect of MTL as a regularizer, this is promising not only for fitting the training data but also for preventing overfitting, especially on small datasets.
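To make the idea of sharing knowledge between objectives concrete, the following is a minimal sketch of hard parameter sharing, not the authors' implementation. It assumes a single shared hidden layer and one softmax output head per dataset; the class name `SharedMTL`, the task names `task_a` and `task_b`, the layer sizes, and the toy inputs are all hypothetical choices made for illustration.

```python
import math
import random


class SharedMTL:
    """Hard parameter sharing: one shared hidden layer, one softmax head per task."""

    def __init__(self, n_in, n_hidden, labels_per_task, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.W = init(n_hidden, n_in)  # shared weights, updated by every task
        self.b = [0.0] * n_hidden
        # One (weights, bias) pair per task; these are never shared.
        self.heads = {t: (init(k, n_hidden), [0.0] * k) for t, k in labels_per_task.items()}

    def forward(self, x, task):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
             for row, bj in zip(self.W, self.b)]
        V, c = self.heads[task]
        logits = [sum(v * hj for v, hj in zip(row, h)) + ck for row, ck in zip(V, c)]
        top = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return h, [e / total for e in exps]

    def sgd_step(self, x, y, task, lr=0.1):
        """One cross-entropy gradient step on a single example; returns the loss before the update."""
        h, p = self.forward(x, task)
        V, c = self.heads[task]
        d_logits = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
        # Backpropagate into the shared layer before the head weights change.
        d_h = [sum(d_logits[k] * V[k][j] for k in range(len(V))) for j in range(len(h))]
        d_z = [dh * (1.0 - hj * hj) for dh, hj in zip(d_h, h)]
        for k in range(len(V)):
            for j in range(len(h)):
                V[k][j] -= lr * d_logits[k] * h[j]
            c[k] -= lr * d_logits[k]
        for j in range(len(h)):
            for i in range(len(x)):
                self.W[j][i] -= lr * d_z[j] * x[i]
            self.b[j] -= lr * d_z[j]
        return -math.log(p[y])


# Toy usage: alternate between two tasks that have different label sets.
model = SharedMTL(n_in=4, n_hidden=3, labels_per_task={"task_a": 3, "task_b": 4})
examples = [([1, 0, 1, 0], 2, "task_a"), ([0, 1, 1, 0], 0, "task_b")]
losses = [[model.sgd_step(x, y, t) for x, y, t in examples] for _ in range(50)]
```

In this sketch, every update from either task changes the shared layer, so each dataset shapes the representation used by the other, while the label sets stay separate in their own heads. Alternating between tasks one example at a time is only one possible training schedule.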


Footnotes
1
After re-annotation to unify class labels, if necessary.
 
2
The fact that in MTL we tend to learn both tasks simultaneously rather than in succession weakens this analogy to some degree. In fact, the simultaneous learning of two languages could actually make learning harder for humans. For a machine, however, the temporal order is less critical given its far superior memory when compared to humans.
 
3
Such choices include, among others, the number and width of the hidden layers, input representations, task-specific learning rates, and training schedules.
 
4
Context is not defined more clearly in their paper.
 
5
Note that in principle, hard parameter sharing also allows us to predict the different tasks at different depths of the model, e.g. to compute the output for task A from some hidden representation \(h_m\) and task B from \(h_n\) (with \(m \ne n\)). Yet another possible variation is to compute further hidden representations that are task-specific and not shared, but ultimately draw on some common lower-level representation.
 
6
A one-hot vector is a binary vector of indicator features, each of which is 1 if the feature occurs in the document and 0 otherwise.
 
7
Emoticons used in the text are removed, URLs are replaced with a “<url>” token, and usernames are replaced with “@user”.
 
8
“My n*ggah my n*ggah” is a reference to Denzel Washington’s character in the movie Training Day.
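
The preprocessing in footnote 7 and the binary indicator vector in footnote 6 can be sketched as follows. This is an illustrative sketch only: the regular expressions, the function names `preprocess` and `one_hot`, and the toy vocabulary are assumptions, since the chapter page does not give the exact patterns or tokenizer used.

```python
import re

URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")
# Assumption: a small pattern for western-style emoticons such as :) ;-( :D
EMOTICON_RE = re.compile(r"(?<!\S)[:;=8][\-o']?[)(\]\[dDpP/\\|](?!\S)")


def preprocess(tweet):
    """Replace URLs and usernames with placeholder tokens and drop emoticons."""
    tweet = URL_RE.sub("<url>", tweet)
    tweet = USER_RE.sub("@user", tweet)
    tweet = EMOTICON_RE.sub("", tweet)
    return " ".join(tweet.split())  # collapse the whitespace left behind


def one_hot(tokens, vocabulary):
    """Binary indicator vector: 1 if the vocabulary term occurs in the document, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


# Toy usage with a hypothetical four-term vocabulary.
cleaned = preprocess("@alice check https://example.com/x :) now")
vocabulary = ["@user", "<url>", "check", "hate"]
vector = one_hot(cleaned.split(), vocabulary)
```

Whitespace tokenization is used here only to keep the example short; a tweet-aware tokenizer would handle punctuation attached to words.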
 
References
Badjatiya P, Gupta S, Gupta M, Varma V (2017) Deep learning for hate speech detection in tweets. In: Proceedings of the 26th international conference on world wide web companion, WWW ’17 Companion, pp 759–760. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland. https://doi.org/10.1145/3041021.3054223
Bingel J, Søgaard A (2017) Identifying beneficial task relations for multi-task learning in deep neural networks. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, short papers, vol 2, pp 164–169. Association for Computational Linguistics, Valencia, Spain. http://www.aclweb.org/anthology/E17-2026
Bjerva J (2017) Will my auxiliary tagging task help? estimating auxiliary tasks effectivity in multi-task learning. In: Proceedings of the 21st Nordic conference on computational linguistics, NoDaLiDa, 22–24 May 2017, Gothenburg, Sweden, 131, pp 216–220. Linköping University Electronic Press
Bollmann M, Bingel J, Søgaard A (2017) Learning attention for historical text normalization by learning to pronounce. In: Proceedings of the 55th annual meeting of the association for computational linguistics, long papers, vol 1, pp 332–344
Boyle K (2001) Hate speech-the united states versus the rest of the world. Maine Law Rev 53(2):487–502
Caruana R (1998) Multitask learning. Learning to learn, pp 95–133. Springer
Caruana RA (1993) Multitask connectionist learning. In: Proceedings of the 1993 connectionist models summer school. CiteSeer
Chandrasekharan E, Samory M, Srinivasan A, Gilbert E (2017) The bag of communities: identifying abusive behavior online with preexisting internet data. In: Proceedings of the 2017 CHI conference on human factors in computing systems, CHI ’17, ACM, New York, NY, USA, pp 3175–3187. https://doi.org/10.1145/3025453.3026018
Crenshaw K (1989) Demarginalizing the intersection of race and sex: a black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. Univ Chicago Legal Forum 1989(1)
Davidson T, Warmsley D, Macy M, Weber I (2017) Automated hate speech detection and the problem of offensive language. In: Proceedings of ICWSM
European Commission (2016) Code of conduct on countering illegal hate speech online. Technical report
Gambäck B, Sikdar UK (2017) Using convolutional neural networks to classify hate-speech. In: Proceedings of the first workshop on abusive language online, pp 85–90. Association for Computational Linguistics. http://aclweb.org/anthology/W17-3013
Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448
Home office (2016) Action against hate the UK governments plan for tackling hate crime. Technical report
Jha A, Mamidi R (2017) When does a compliment become sexist? analysis and classification of ambivalent sexism using twitter data. In: Proceedings of the second workshop on NLP and computational social science, pp 7–16. Association for Computational Linguistics. http://aclweb.org/anthology/W17-2902
Jørgensen A, Hovy D, Søgaard A (2016) Learning a pos tagger for AAVE-like language. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, pp 1115–1120. Association for Computational Linguistics, San Diego, California. http://www.aclweb.org/anthology/N16-1130
Klerke S, Goldberg Y, Søgaard A (2016) Improving sentence compression by learning to predict gaze. In: Proceedings of NAACL-HLT, pp 1528–1533
Levin S (2017) Moderators who had to view child abuse content sue Microsoft, claiming PTSD
Martínez Alonso H, Plank B (2017) When is multitask learning effective? semantic sequence prediction under varying data conditions. In: Proceedings of the 15th conference of the European chapter of the association for computational linguistics, long papers, vol 1, pp 44–53. Association for Computational Linguistics, Valencia, Spain. http://www.aclweb.org/anthology/E17-1005
McIntosh P (1988) White privilege and male privilege: a personal account of coming to see correspondences through work in women’s studies
Müller K, Schwarz C (2017) Fanning the flames of hate: social media and hate crime
Nobata C, Tetreault J, Thomas A, Mehdad Y, Chang Y (2016) Abusive language detection in online user content. In: Proceedings of the 25th international conference on world wide web, WWW ’16, pp 145–153. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland. https://doi.org/10.1145/2872427.2883062
Park JH, Fung P (2017) One-step and two-step classification for abusive language detection on twitter. In: Proceedings of the first workshop on abusive language online, pp 41–45. Association for Computational Linguistics. http://aclweb.org/anthology/W17-3006
Ramsundar B, Kearnes S, Riley P, Webster D, Konerding D, Pande V (2015) Massively multitask networks for drug discovery. arXiv:1502.02072
Roberts DE (2004) The social and moral cost of mass incarceration in African American communities. Stanf Law Rev 56(5):1271–1306
Ross B, Rist M, Carbonell G, Cabrera B, Kurowsky N, Wojatzki M (2016) Measuring the reliability of hate speech annotations: the case of the European refugee crisis. In: Beißwenger M, Wojatzki M, Zesch T (eds) Proceedings of NLP4CMC III: 3rd workshop on natural language processing for computer-mediated communication, Bochumer Linguistische Arbeitsberichte, vol 17, pp 6–9. Bochum
Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 conference on empirical methods in natural language processing, pp 1631–1642. Association for Computational Linguistics, Stroudsburg, PA
The Guardian (2017) Germany approves plans to fine social media firms up to €50 M
Talat Z (2016) Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter. In: Proceedings of the first workshop on NLP and computational social science, pp 138–142. Association for Computational Linguistics, Austin, Texas. http://aclweb.org/anthology/W16-5618
Talat Z, Davidson T, Warmsley D, Weber I (2017) Understanding abuse: a typology of abusive language detection subtasks. In: Proceedings of the first workshop on abusive language online. Association for Computational Linguistics
Talat Z, Hovy D (2016) Hateful symbols or hateful people? predictive features for hate speech detection on twitter. In: Proceedings of the NAACL student research workshop. Association for Computational Linguistics, San Diego, California
Wulczyn E, Thain N, Dixon L (2017) Ex machina: personal attacks seen at scale. In: Proceedings of the 26th international conference on world wide web, WWW ’17, pp 1391–1399. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland. https://doi.org/10.1145/3038912.3052591
Yu J, Jiang J (2016) Learning sentence embeddings with auxiliary tasks for cross-domain sentiment classification. Association for Computational Linguistics
Metadata
Title
Bridging the Gaps: Multi Task Learning for Domain Transfer of Hate Speech Detection
Authors
Zeerak Talat
James Thorne
Joachim Bingel
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-78583-7_3
