2019 | OriginalPaper | Book chapter

The Thin Line Between Hate and Profanity

Authors: Kosisochukwu Judith Madukwe, Xiaoying Gao

Published in: AI 2019: Advances in Artificial Intelligence

Publisher: Springer International Publishing

Abstract

Hate speech can be defined as language used to demean people within a specific group. Hate speech often contains explicitly profane words; however, the presence of these words does not always mean that a text instance is hateful. In some cases, text instances with profane words are merely offensive language that does not target any specific group, and so cannot be classified as hate speech. In this work, we build on existing studies to find a better demarcation between hate speech and offensive language. Our main contribution is to introduce typed dependencies as new features in our feature set. These features let us consider the relationship between words that are far apart in a text instance, thereby providing more identifying information than single-word features. We evaluate our approach on a dataset with three classes: hate, offensive, and neither. Compared with existing studies, our feature set is much smaller, yet we achieve better accuracy and show comparable results in further analysis. Our detailed analysis also shows instances missed by the lexical features that were correctly predicted by the proposed feature set.
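
To make the idea of typed-dependency features concrete, the following is a minimal sketch and not the authors' implementation. It assumes a dependency parser has already produced (relation, head, dependent) triples; the `Dependency` type, the `typed_dependency_features` helper, the feature-string format, and the hand-written example parse are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Dependency(NamedTuple):
    """One typed dependency, read as relation(head, dependent)."""
    relation: str
    head: str
    dependent: str


def typed_dependency_features(parse: Iterable[Dependency]) -> Counter:
    """Count one feature string per typed dependency in a parsed text."""
    return Counter(
        f"{dep.relation}({dep.head.lower()},{dep.dependent.lower()})"
        for dep in parse
    )


# Hand-written parse of "I really cannot stand those noisy neighbours".
# Assumption: a real pipeline would obtain these triples from a parser.
parse = [
    Dependency("nsubj", "stand", "I"),
    Dependency("advmod", "stand", "really"),
    Dependency("aux", "stand", "can"),
    Dependency("neg", "stand", "not"),
    Dependency("dobj", "stand", "neighbours"),
    Dependency("det", "neighbours", "those"),
    Dependency("amod", "neighbours", "noisy"),
]

features = typed_dependency_features(parse)

# The verb and its object are two words apart in the sentence, so a
# bigram feature would not pair them; the dependency feature does.
print(features["dobj(stand,neighbours)"])
```

In this sketch each feature string pairs a verb or noun with the word it governs, regardless of how many tokens separate them, which is the kind of long-distance relationship the abstract says single-word features cannot capture.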

Title: The Thin Line Between Hate and Profanity
Authors: Kosisochukwu Judith Madukwe, Xiaoying Gao
Publisher: Springer International Publishing
Book: AI 2019: Advances in Artificial Intelligence
Print ISBN: 978-3-030-35287-5
Electronic ISBN: 978-3-030-35288-2
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-35288-2_28