Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages and Conversational Hate Speech

Authors:
Sandip Modha (LDRP-ITR, India)
Thomas Mandl (University of Hildesheim, Germany)
Gautam Kishore Shahi (University of Duisburg-Essen, Germany)
Hiren Madhu (Indian Institute of Science, India)
Shrey Satapara (DA-IICT, India)
Tharindu Ranasinghe (University of Wolverhampton, United Kingdom)
Marcos Zampieri (Rochester Institute of Technology, USA)

FIRE '21: Proceedings of the 13th Annual Meeting of the Forum for Information Retrieval Evaluation, December 2021, Pages 1–3. https://doi.org/10.1145/3503162.3503176

Published: 26 January 2022

ABSTRACT

The HASOC track is dedicated to evaluating technology for identifying offensive language and hate speech. HASOC is building a multilingual data corpus, mainly for English and under-resourced languages (Hindi and Marathi). This paper presents one HASOC subtrack with two tasks. In 2021, we organized the classification task for English, Hindi, and Marathi. Task 1 consists of two classification subtasks: Subtask 1A is a binary classification of tweets into offensive and non-offensive, and Subtask 1B is a fine-grained classification of tweets into hate, profane, and offensive. Task 2 consists of identifying such content in tweets given additional context in the form of the preceding conversation. During the shared task, 65 teams submitted 652 runs. This overview paper briefly presents the task descriptions, the data, and the results obtained from the participants' submissions.

Published in
FIRE '21: Proceedings of the 13th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2021
113 pages
ISBN: 9781450395960
DOI: 10.1145/3503162
Editors: Debasis Ganguly, Surupendu Gangopadhyay, Mandar Mitra, Prasenjit Majumder
Copyright © 2021 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Multilingual Datasets
Under-resourced language
hate speech
social media
Qualifiers
- abstract
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 19 of 64 submissions, 30%