Published in: Social Network Analysis and Mining 1/2023

1 December 2023 | Original Article

An ensemble transformer-based model for Arabic sentiment analysis

Authors: Omar Mohamed, Aly M. Kassem, Ali Ashraf, Salma Jamal, Ensaf Hussein Mohamed

Abstract

Sentiment analysis is a common and challenging task in natural language processing (NLP). It is a widely studied area of research that facilitates capturing public opinions about a topic, product, or service. Much research tackles English sentiment analysis; research on Arabic, however, lags behind that on other high-resource languages. Recently, models such as bidirectional encoder representations from transformers (BERT) and generative pre-trained transformer (GPT) have been widely used in many NLP tasks and have significantly improved performance, especially in sentiment analysis. However, Arabic was not a priority in their development. Several models focusing on Arabic, such as ARBERT, MARBERT, and others, have recently begun to pave the way for the latest technologies. We used multiple datasets for training and testing: ASAD (A Twitter-based Benchmark Arabic Sentiment Analysis Dataset), ArSarcasm-v2, and SemEval-2017. We propose an ensemble learning approach that combines the multilingual model (XLM-T) and the monolingual model (MARBERT) to overcome the intricacies of the Arabic language that are difficult to address with a single model. The approach also addresses the problem of imbalanced data using a combination of focal loss and label smoothing. The experiments showed that our ensemble learning approach outperforms the state-of-the-art models on all the used datasets.
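To make the two ingredients named in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not state how focal loss and label smoothing are combined, nor how the two models' outputs are merged, so both choices here are assumptions: the focal factor is applied per class to a smoothed cross-entropy, and the ensemble is a weighted average of class probabilities (soft voting). All function names and default values (`gamma=2.0`, `eps=0.1`, `weight_a=0.5`) are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(target, num_classes, eps):
    """Soften a one-hot target by spreading eps uniformly over all classes."""
    return [
        (1.0 - eps) * (1.0 if k == target else 0.0) + eps / num_classes
        for k in range(num_classes)
    ]


def focal_loss_with_smoothing(logits, target, gamma=2.0, eps=0.1):
    """Assumed combination: focal factor (1 - p)^gamma on a smoothed cross-entropy.

    With gamma=0 and eps=0 this reduces to ordinary cross-entropy.
    """
    probs = softmax(logits)
    soft = smooth_labels(target, len(logits), eps)
    return -sum(
        q * (1.0 - p) ** gamma * math.log(max(p, 1e-12))  # clamp avoids log(0)
        for q, p in zip(soft, probs)
    )


def soft_vote(probs_a, probs_b, weight_a=0.5):
    """Assumed ensemble rule: weighted average of two models' probabilities."""
    mixed = [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]
    label = max(range(len(mixed)), key=mixed.__getitem__)
    return label, mixed


if __name__ == "__main__":
    # Hypothetical logits for one tweet over (negative, neutral, positive).
    multilingual_logits = [1.2, 0.3, -0.5]  # stand-in for an XLM-T output
    monolingual_logits = [0.1, 1.5, -0.2]   # stand-in for a MARBERT output
    label, mixed = soft_vote(softmax(multilingual_logits), softmax(monolingual_logits))
    print("ensemble label:", label, "probabilities:", mixed)
    print("loss:", focal_loss_with_smoothing(multilingual_logits, target=1))
```

The focal factor shrinks the contribution of examples the model already classifies confidently, which is why it is often used for imbalanced classes, while label smoothing keeps the targets from being fully one-hot. In practice the logits would come from the fine-tuned models rather than the hand-written values used here.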

Zurück zum Zitat Ribeiro M, Singh S, Guestrin C (2016) Why Should I Trust You?: explaining the predictions of any classifier. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. Association for Computational Linguistics, San Diego, California, 97–101. https://doi.org/10.18653/v1/N16-3020 Ribeiro M, Singh S, Guestrin C (2016) Why Should I Trust You?: explaining the predictions of any classifier. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. Association for Computational Linguistics, San Diego, California, 97–101. https://​doi.​org/​10.​18653/​v1/​N16-3020
Zurück zum Zitat Robert G, Jörn-Henrik J, Claudio M, Richard Z, Wieland B, Matthias B, Wichmann Felix A (2020) Shortcut learning in deep neural networks. Nature Mach Intell 2(11):665–673CrossRef Robert G, Jörn-Henrik J, Claudio M, Richard Z, Wieland B, Matthias B, Wichmann Felix A (2020) Shortcut learning in deep neural networks. Nature Mach Intell 2(11):665–673CrossRef
Zurück zum Zitat Rosenthal S, Farra N, Nakov P (2017) SemEval-2017 Task 4: sentiment analysis in twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 502–518. https://doi.org/10.18653/v1/S17-2088 Rosenthal S, Farra N, Nakov P (2017) SemEval-2017 Task 4: sentiment analysis in twitter. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 502–518. https://​doi.​org/​10.​18653/​v1/​S17-2088
Zurück zum Zitat Safaya A, Abdullatif M, Yuret D (2020) KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media. In: Proceedings of the fourteenth workshop on semantic evaluation. International Committee for Computational Linguistics, Barcelona (online), 2054–2059. https://doi.org/10.18653/v1/2020.semeval-1.271 Safaya A, Abdullatif M, Yuret D (2020) KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media. In: Proceedings of the fourteenth workshop on semantic evaluation. International Committee for Computational Linguistics, Barcelona (online), 2054–2059. https://​doi.​org/​10.​18653/​v1/​2020.​semeval-1.​271
Zurück zum Zitat Sennrich R, Haddow B, Birch A (2015) Neural machine translation of rare words with subword units Sennrich R, Haddow B, Birch A (2015) Neural machine translation of rare words with subword units
Zurück zum Zitat Shekar BH, Dagnew G (2019) Grid search-based hyperparameter tuning and classification of microarray cancer data. In 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). IEEE, online, 1–8 Shekar BH, Dagnew G (2019) Grid search-based hyperparameter tuning and classification of microarray cancer data. In 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). IEEE, online, 1–8
Zurück zum Zitat Shiha M, Ayvaz S (2017) The effects of emoji in sentiment analysis. Int J Comput Electr Eng (IJCEE) 9(1):360–369CrossRef Shiha M, Ayvaz S (2017) The effects of emoji in sentiment analysis. Int J Comput Electr Eng (IJCEE) 9(1):360–369CrossRef
Zurück zum Zitat Snoek J, Rippel O, Swersky K, Kiros R, Satish N, Sundaram N, Patwary M, Prabhat MR, Adams R (2015) Scalable bayesian optimization using deep neural networks. In International conference on machine learning. PMLR, online, 2171–2180 Snoek J, Rippel O, Swersky K, Kiros R, Satish N, Sundaram N, Patwary M, Prabhat MR, Adams R (2015) Scalable bayesian optimization using deep neural networks. In International conference on machine learning. PMLR, online, 2171–2180
Zurück zum Zitat Soliman T-H, Elmasry MA, Hedar A, Doss MM (2014) Sentiment analysis of Arabic slang comments on facebook. Int J Comput Technol 12(5):3470–3478CrossRef Soliman T-H, Elmasry MA, Hedar A, Doss MM (2014) Sentiment analysis of Arabic slang comments on facebook. Int J Comput Technol 12(5):3470–3478CrossRef
Zurück zum Zitat Song B, Pan C, Wang S, Luo Z (2021) DeepBlueAI at WANLP-EACL2021 task 2: a deep ensemble-based method for sarcasm and sentiment detection in Arabic. In Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 390–394. https://aclanthology.org/2021.wanlp-1.52 Song B, Pan C, Wang S, Luo Z (2021) DeepBlueAI at WANLP-EACL2021 task 2: a deep ensemble-based method for sarcasm and sentiment detection in Arabic. In Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 390–394. https://​aclanthology.​org/​2021.​wanlp-1.​52
Zurück zum Zitat Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, online, p 2818–2826 Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition. IEEE, online, p 2818–2826
Zurück zum Zitat Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Computat Linguist 37(2):267–307CrossRef Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Computat Linguist 37(2):267–307CrossRef
Zurück zum Zitat Tenney I, Wexler J, Bastings J, Bolukbasi T, Coenen A, Gehrmann S, Jiang E, Pushkarna M, Radebaugh C, Reif E, et al (2020) The language interpretability tool: extensible, interactive visualizations and analysis for NLP models. (2020) Tenney I, Wexler J, Bastings J, Bolukbasi T, Coenen A, Gehrmann S, Jiang E, Pushkarna M, Radebaugh C, Reif E, et al (2020) The language interpretability tool: extensible, interactive visualizations and analysis for NLP models. (2020)
Zurück zum Zitat Utlu I, Yücesoy V, Koc A, Cukur T, Senel L-K (2018) Semantic structure and interpretability of word embeddings. IEEE/ACM Trans Audio, Speech Language Process 26(10):1769–1779CrossRef Utlu I, Yücesoy V, Koc A, Cukur T, Senel L-K (2018) Semantic structure and interpretability of word embeddings. IEEE/ACM Trans Audio, Speech Language Process 26(10):1769–1779CrossRef
Zurück zum Zitat Wadhawan A(2021) Arabert and farasa segmentation based approach for sarcasm and sentiment detection in arabic tweets Wadhawan A(2021) Arabert and farasa segmentation based approach for sarcasm and sentiment detection in arabic tweets
Zurück zum Zitat Wang J, Xu J, Wang X (2018) Combination of hyperband and Bayesian optimization for hyperparameter optimization in deep learning Wang J, Xu J, Wang X (2018) Combination of hyperband and Bayesian optimization for hyperparameter optimization in deep learning
Zurück zum Zitat Wu Y, Schuster M, Chen Z, Le Q V, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K et al (2016) Google’s neural machine translation system: Bridging the gap between human and machine translation Wu Y, Schuster M, Chen Z, Le Q V, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K et al (2016) Google’s neural machine translation system: Bridging the gap between human and machine translation
Zurück zum Zitat Xue L, Gao M, Chen Z, Xiong C, Xu R (2021) Robustness evaluation of transformer-based form field extractors via form attacks Xue L, Gao M, Chen Z, Xiong C, Xu R (2021) Robustness evaluation of transformer-based form field extractors via form attacks
Zurück zum Zitat Zheng S, Jayasumana S, Romera-Paredes B, Vineet V, Su Z, Du D, Huang C, Torr P HS (2015) Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE international conference on computer vision. online, p 1529–1537 Zheng S, Jayasumana S, Romera-Paredes B, Vineet V, Su Z, Du D, Huang C, Torr P HS (2015) Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE international conference on computer vision. online, p 1529–1537
Zurück zum Zitat Zhou Z-H, Wu J, Tang W (2002) Ensembling neural networks: many could be better than all. Artific Intell 137(1–2):239–263MathSciNetCrossRef Zhou Z-H, Wu J, Tang W (2002) Ensembling neural networks: many could be better than all. Artific Intell 137(1–2):239–263MathSciNetCrossRef
Metadata
Title
An ensemble transformer-based model for Arabic sentiment analysis
Authors
Omar Mohamed
Aly M. Kassem
Ali Ashraf
Salma Jamal
Ensaf Hussein Mohamed
Publication date
1 December 2023
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2023
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-022-01009-0
