2024 | OriginalPaper | Book chapter

A Deep Learning Framework for Assamese Toxic Comment Detection: Leveraging LSTM and BiLSTM Models with Attention Mechanism

Written by: Mandira Neog, Nomi Baruah

Published in: Advances in Data-Driven Computing and Intelligent Systems

Publisher: Springer Nature Singapore

Abstract

As social media platforms grow in popularity, creating a secure and positive online environment becomes increasingly important. This work aims to protect users by detecting objectionable language in Assamese social media comments, with the ultimate goal of an effective mechanism for detecting toxic comments in Assamese. To address the lack of available datasets, a curated dataset was manually assembled for the experiment. Two deep learning models, LSTM and bidirectional LSTM (BiLSTM), were used to capture the contextual intricacies of user-generated comments. Notably, with an attention mechanism included, the BiLSTM model outperforms the LSTM model, attaining an accuracy of 86.9% in identifying toxic comments. Together, the LSTM and BiLSTM models yield a more robust and efficient approach to recognizing toxic phrases in Assamese, in line with the goal of a secure, respectful, and toxicity-free online environment.
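The abstract does not specify how the attention mechanism is implemented. As a rough illustration only, the following sketch shows one common form, attention pooling, in which per-token hidden states (such as the concatenated forward and backward outputs of a BiLSTM) are scored, normalized with a softmax, and summed into a single context vector. The function names, the `query` scoring vector, and the toy values are assumptions made for this example and are not taken from the chapter.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token hidden states into one context vector.

    hidden_states: list of T vectors (each a list of d floats), standing in
        for the per-token outputs of a recurrent encoder.
    query: list of d floats; a hypothetical scoring vector that would be
        learned during training in a real model.
    Returns (weights, context): T attention weights and a d-dim vector.
    """
    # Dot-product score of each token's hidden state against the query.
    scores = [sum(h_j * q_j for h_j, q_j in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of hidden states, one output dimension at a time.
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)
    ]
    return weights, context


# Toy input: three "tokens" with two-dimensional hidden states (made up).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attention_pool(states, query)
```

In this toy input the third token scores highest against the query, so it receives the largest weight. In a classifier, the context vector would typically be passed to a final dense layer that outputs the toxic/non-toxic decision.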

Metadata
Title
A Deep Learning Framework for Assamese Toxic Comment Detection: Leveraging LSTM and BiLSTM Models with Attention Mechanism
Written by
Mandira Neog
Nomi Baruah
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-99-9521-9_37