
01-12-2023 | Original Article

An ensemble transformer-based model for Arabic sentiment analysis

Authors: Omar Mohamed, Aly M. Kassem, Ali Ashraf, Salma Jamal, Ensaf Hussein Mohamed

Published in: Social Network Analysis and Mining | Issue 1/2023


Abstract

Sentiment analysis is a common and challenging task in natural language processing (NLP). It is a widely studied research area that helps capture public opinion about a topic, product, or service. Much of this research tackles English sentiment analysis, whereas research on Arabic lags behind that on high-resource languages. Recently, models such as bidirectional encoder representations from transformers (BERT) and generative pre-trained transformer (GPT) have been widely used in many NLP tasks and have significantly improved performance, especially in sentiment analysis. However, Arabic was not a priority in their development. Several models focusing on Arabic, such as ARBERT and MARBERT, have recently begun to pave the way for the latest technologies. We used multiple datasets for training and testing: ASAD (A Twitter-based Benchmark Arabic Sentiment Analysis Dataset), ArSarcasm-v2, and SemEval-2017. We propose an ensemble learning approach that combines a multilingual model (XLM-T) and a monolingual model (MARBERT) to overcome intricacies of the Arabic language that are difficult to address with a single model. The approach also addresses the problem of imbalanced data by combining focal loss and label smoothing. The experiments showed that our ensemble learning approach outperforms the state-of-the-art models on all the datasets used.
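The two ingredients named in the abstract, a loss that combines focal loss with label smoothing and an ensemble of two classifiers, can be illustrated with a minimal sketch. The abstract does not say how the two loss terms are combined or how the models' outputs are merged, so everything below is an assumption: the function names are hypothetical, the loss applies the focal factor (1 - p)^gamma to each term of a cross-entropy against smoothed targets, and the ensemble is plain weighted soft voting over class probabilities. The logits stand in for the outputs of two fine-tuned models such as XLM-T and MARBERT.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(label, num_classes, epsilon=0.1):
    """One-hot target softened by spreading epsilon uniformly over all classes."""
    off = epsilon / num_classes
    return [(1.0 - epsilon) + off if k == label else off
            for k in range(num_classes)]


def focal_loss_with_smoothing(logits, label, gamma=2.0, epsilon=0.1):
    """Assumed combination: focal weighting on every term of the smoothed
    cross-entropy. With gamma=0 and epsilon=0 this is ordinary cross-entropy."""
    probs = softmax(logits)
    targets = smooth_labels(label, len(logits), epsilon)
    return -sum(q * (1.0 - p) ** gamma * math.log(max(p, 1e-12))
                for q, p in zip(targets, probs))


def ensemble_predict(logits_a, logits_b, weight_a=0.5):
    """Soft voting (an assumption): weighted average of the two models'
    class probabilities, then argmax."""
    probs_a, probs_b = softmax(logits_a), softmax(logits_b)
    avg = [weight_a * a + (1.0 - weight_a) * b
           for a, b in zip(probs_a, probs_b)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return best, avg


if __name__ == "__main__":
    # Hypothetical 3-class logits (e.g. negative, neutral, positive).
    model_a = [2.0, 0.0, 0.0]
    model_b = [0.0, 0.0, 1.0]
    print(focal_loss_with_smoothing(model_a, label=0))
    print(ensemble_predict(model_a, model_b))
```

The focal factor down-weights examples the model already classifies confidently, which is why it is often used for imbalanced classes, while smoothing keeps the target from being a hard 0/1 vector. In a real training setup these functions would be replaced by tensor operations in a deep learning framework.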


Literature
Abbes I, Zaghouani W, El-Hardlo O, Ashour F (2020) DAICT: a dialectal Arabic irony corpus extracted from Twitter. In: Proceedings of the 12th Language Resources and Evaluation Conference. European Language Resources Association, Marseille, France, 6265–6271. https://aclanthology.org/2020.lrec-1.768
Abdelali A, Hassan S, Mubarak H, Darwish K, Samih Y (2021) Pre-Training BERT on Arabic Tweets: Practical Considerations. arXiv preprint arXiv:2102.10684
Abdel-Salam Reem (2021) WANLP 2021 Shared-Task: Towards Irony and Sentiment Detection in Arabic Tweets using Multi-headed-LSTM-CNN-GRU and MaRBERT. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 306–311. https://aclanthology.org/2021.wanlp-1.37
Abdul-Mageed M, Elmadany A, Nagoudi E, Moatez B (2021) ARBERT & MARBERT: Deep Bidirectional Transformers for Arabic. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, Online, 7088–7105. https://doi.org/10.18653/v1/2021.acl-long.551
Abo MEM, Raj RG, Qazi A (2019) A review on Arabic sentiment analysis: state-of-the-art, taxonomy and open research challenges. IEEE Access 7(2019):162008–162024
Alamro H, Alshehri M, Alharbi B, Khayyat Z, Kalkatawi M, Jaber I I, Zhang X (2021) Overview of the Arabic Sentiment Analysis 2021 Competition at KAUST
Alayba AM, Palade V, England M, Iqbal R (2017) Arabic language sentiment analysis on health services. In: 2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), 114–118. https://doi.org/10.1109/ASAR.2017.8067771
Alayba AM, Palade V, England M, Iqbal R (2018) A combined CNN and LSTM model for Arabic sentiment analysis. In: International Andreas H, Peter K, Min Tjoa A, Edgar W (eds) Machine Learning and Knowledge Extraction. Springer Publishing, Cham, pp 179–191
Alharbi AI, Lee M (2020) Combining character and word embeddings for affect in Arabic Informal social media microblogs. In: International Elisabeth M, Farid M, Helmut H, Philipp C (eds) Natural language processing and information systems. Springer Publishing, Cham, pp 213–224
Alharbi B, Alamro H, Alshehri M, Khayyat Z, Kalkatawi M, Jaber I I, Zhang X (2020) ASAD: A Twitter-based Benchmark Arabic Sentiment Analysis Dataset
Al-Twairesh N, Al-Negheimish H (2019) Surface and deep features ensemble for sentiment analysis of Arabic tweets. IEEE Access 7(2019):84122–84131
Antoun Wissam, Baly Fady, Hajj Hazem (2020) AraBERT: Transformer-based Model for Arabic Language Understanding. In: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection. European Language Resource Association, Marseille, France, 9–15. https://aclanthology.org/2020.osact-1.2
Arazo E, Ortego D, Albert P, O’Connor N E, McGuinness K (2020) Pseudo-Labeling and Confirmation Bias in Deep Semi-Supervised Learning. In: 2020 International Joint Conference on Neural Networks (IJCNN). IEEE, online, 1–8
Bahdanau Dzmitry, Cho Kyunghyun, Bengio Yoshua (2015) Neural Machine Translation by Jointly Learning to Align and Translate
Barbieri F, Anke LE, Camacho-Collados J (2021) XLM-T: a multilingual language model toolkit for Twitter
Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Associat Computat Linguist 5(7):135–146
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, Grave E, Ott M, Zettlemoyer L, Stoyanov V (2020) Unsupervised Cross-lingual Representation Learning at Scale. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 8440–8451. https://doi.org/10.18653/v1/2020.acl-main.747
Darwish K, Habash N, Abbas M, Al-Khalifa H, Al-Natsheh HT, Bouamor H, Bouzoubaa K, Cavalli-Sforza V, El-Beltagy SR, El-Hajj W et al (2021) A panoramic survey of natural language processing in the Arab world. Commun ACM 64(4):72–81
Darwish K, Mubarak H (2016) Farasa: a new fast and accurate Arabic Word Segmenter. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). European Language Resources Association (ELRA), Portorož, Slovenia, 1070–1074. https://aclanthology.org/L16-1170
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186. https://doi.org/10.18653/v1/N19-1423
DeYoung J, Jain S, Rajani N F, Lehman E, Xiong C, Socher R, Wallace B C (2020) ERASER: A Benchmark to Evaluate Rationalized NLP Models. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 4443–4458. https://doi.org/10.18653/v1/2020.acl-main.408
El Mahdaouy A, El Mekki A, Essefar K, El Mamoun N, Berrada I, Khoumsi A (2021) Deep multi-task model for sarcasm detection and sentiment analysis in Arabic Language. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 334–339. https://aclanthology.org/2021.wanlp-1.42
El-Beltagy S R, El Kalamawy M, Soliman A B (2017) NileTMRG at SemEval-2017 Task 4: Arabic sentiment analysis. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 790–795. https://doi.org/10.18653/v1/S17-2133
Farha Ibrahim Abu, Magdy Walid (2019) Mazajak: An Online Arabic Sentiment Analyser. In: Proceedings of the Fourth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Florence, Italy, 192–198. https://doi.org/10.18653/v1/W19-4621
Farha Ibrahim Abu, Magdy Walid (2020) From Arabic Sentiment Analysis to Sarcasm Detection: The ArSarcasm Dataset. In: Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection. European Language Resource Association, Marseille, France, 32–39. https://aclanthology.org/2020.osact-1.5
Farha Ibrahim Abu, Magdy Walid (2021) Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 21–31. https://aclanthology.org/2021.wanlp-1.3
Farha Ibrahim Abu, Zaghouani Wajdi, Magdy Walid (2021) Overview of the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 296–305. https://aclanthology.org/2021.wanlp-1.36
Gaanoun K, Benelallam I (2021) Sarcasm and sentiment detection in Arabic language a hybrid approach combining embeddings and rule-based features. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), 351–356. https://aclanthology.org/2021.wanlp-1.45
Ganaie MA, Hu M et al. (2021) Ensemble deep learning: A review
González José-Ángel, Pla F, Hurtado L-F (2017) ELiRF-UPV at SemEval-2017 Task 4: Sentiment Analysis using Deep Learning. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 723–727. https://doi.org/10.18653/v1/S17-2121
Goyal N, Du J, Ott M, Anantharaman G, Conneau A (2021) Larger-Scale transformers for multilingual masked language modeling. In: Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021). Association for Computational Linguistics, Online, 29–33. https://doi.org/10.18653/v1/2021.repl4nlp-1.4
Gururangan S, Marasović A, Swayamdipta S, Lo K, Beltagy I, Downey D, Smith N A (2020) Don’t stop pretraining: adapt language models to domains and tasks. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 8342–8360. https://doi.org/10.18653/v1/2020.acl-main.740
Hegazi MO, Al-Dossari Y, Al-Yahy A, Al-Sumari A, Hilal A (2021) Preprocessing Arabic text on social media. Heliyon 7(2):e06191
Heikal M, Torki M, El-Makky N (2018) Sentiment analysis of Arabic tweets using deep learning. Proced Comput Sci 142(2018):114–122
Hinton G, Vinyals O, Dean J et al (2015) Distilling the knowledge in a neural network
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural computat 9(8):1735–1780
Htait A, Fournier S, Bellot P (2017) LSIS at SemEval-2017 Task 4: using adapted sentiment similarity seed words for English and Arabic tweet polarity classification. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 718–722. https://doi.org/10.18653/v1/S17-2120
Jabreel M, Moreno A (2017) SiTAKA at SemEval-2017 Task 4: sentiment analysis in Twitter based on a rich set of features. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, 694–699. https://doi.org/10.18653/v1/S17-2115
Jacovi A, Goldberg Y (2020) Towards faithfully interpretable NLP systems: how should we define and evaluate faithfulness? In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 4198–4205. https://doi.org/10.18653/v1/2020.acl-main.386
James B, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13(2):281–305
Jurek A, Mulvenna MD, Bi Y (2015) Improved lexicon-based sentiment analysis for social media analytics. Sec Informat 4(1):1–13
Kaushik C, Mishra A (2014) A scalable, lexicon based technique for sentiment analysis
Khalil T, Halaby A, Hammad M, El-Beltagy S R (2015) Which configuration works best? an experimental study on supervised Arabic Twitter sentiment analysis. In: 2015 First International Conference on Arabic Computational Linguistics (ACLing). IEEE, online, 86–93
Khan HU, Peacock D (2019) Possible effects of emoticon and emoji on sentiment analysis web services of work organisations. Int J Work Organisat Emot 10(2):130–161
Kokhlikyan N, Miglani V, Martin M, Wang E, Alsallakh B, Reynolds J, Melnikov A, Kliushkina N, Araya C, Yan S et al (2020) Captum: a unified and generic model interpretability library for PyTorch
Kudo Taku (2018) Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Melbourne, Australia, 66–75. https://doi.org/10.18653/v1/P18-1007
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Computat 1(4):541–551
Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision. IEEE, online, 2980–2988
Liu C, Fang F, Lin X, Cai T, Tan X, Liu J, Lu X (2021) Improving sentiment analysis accuracy with emoji embedding. J Safety Sci Resil 2(4):246–252
Mahmoud A-A, Essa Safa Bani, Alsmadi Izzat (2015) Lexicon-based sentiment analysis of Arabic tweets. Int J Soc Network Min 2(2):101–114
go back to reference Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space
go back to reference Mohammad A-S, Bashar T, Mahmoud A-A, Yaser J (2019) Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int J Mach Learn Cybernet 10(8):2163–2175CrossRef Mohammad A-S, Bashar T, Mahmoud A-A, Yaser J (2019) Using long short-term memory deep neural networks for aspect-based sentiment analysis of Arabic reviews. Int J Mach Learn Cybernet 10(8):2163–2175CrossRef
go back to reference Morris J, Lifland E, Yoo J Y, Grigsby J, Jin D, Qi Y (2020) TextAttack: a framework for adversarial attacks, data augmentation, and adversarial training in NLP. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Online, 119–126. https://doi.org/10.18653/v1/2020.emnlp-demos.16 Morris J, Lifland E, Yoo J Y, Grigsby J, Jin D, Qi Y (2020) TextAttack: a framework for adversarial attacks, data augmentation, and adversarial training in NLP. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. Association for Computational Linguistics, Online, 119–126. https://​doi.​org/​10.​18653/​v1/​2020.​emnlp-demos.​16
go back to reference Mubarak H, Hassan S, Chowdhury S A (2022) Emojis as anchors to detect Arabic offensive language and hate speech Mubarak H, Hassan S, Chowdhury S A (2022) Emojis as anchors to detect Arabic offensive language and hate speech
go back to reference Mukhoti J, Kulharia V, Sanyal A, Golodetz S, Torr P HS, Dokania P K (2020) Calibrating deep neural networks using focal loss Mukhoti J, Kulharia V, Sanyal A, Golodetz S, Torr P HS, Dokania P K (2020) Calibrating deep neural networks using focal loss
go back to reference Nabil M, Aly M, Atiya A (2015) ASTD: Arabic sentiment tweets dataset. In Proceedings of the 2015 conference on empirical methods in natural language processing. Association for computational linguistics, Lisbon, Portugal, 2515–2519. https://doi.org/10.18653/v1/D15-1299 Nabil M, Aly M, Atiya A (2015) ASTD: Arabic sentiment tweets dataset. In Proceedings of the 2015 conference on empirical methods in natural language processing. Association for computational linguistics, Lisbon, Portugal, 2515–2519. https://​doi.​org/​10.​18653/​v1/​D15-1299
go back to reference Olsson F (2009) A literature survey of active machine learning in the context of natural language processing. In: SICS Technical Report. Swedish Institute of Computer Science, online, p 1–59 Olsson F (2009) A literature survey of active machine learning in the context of natural language processing. In: SICS Technical Report. Swedish Institute of Computer Science, online, p 1–59
go back to reference Opitz David, Maclin Richard (1999) Popular ensemble methods: an empirical study. J Artific Intell R 11(1999):169–198 Opitz David, Maclin Richard (1999) Popular ensemble methods: an empirical study. J Artific Intell R 11(1999):169–198
Oueslati O, Cambria E, HajHmida MB, Ounelli H (2020) A review of sentiment analysis research in Arabic language. Future Generat Comput Syst 112:408–430
Oussous A, Benjelloun F-Z, Lahcen AA, Belfkih S (2020) ASA: a framework for Arabic sentiment analysis. J Informat Sci 46(4):544–559
Pennington J, Socher R, Manning C (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, Doha, Qatar, p 1532–1543. https://doi.org/10.3115/v1/D14-1162
Rabbimov I, Mporas I, Simaki V, Kobilov S (2020) Investigating the effect of emoji in opinion classification of Uzbek movie review comments. In: International Conference on Speech and Computer. Springer, online, p 435–445
Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I et al (2019) Language models are unsupervised multitask learners. OpenAI blog 1(8):9
Rahaman N, Baratin A, Arpit D, Draxler F, Lin M, Hamprecht F, Bengio Y, Courville A (2019) On the spectral bias of neural networks. In: International Conference on Machine Learning. PMLR, online, p 5301–5310
Ribeiro M, Singh S, Guestrin C (2016) Why Should I Trust You?: explaining the predictions of any classifier. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations. Association for Computational Linguistics, San Diego, California, p 97–101. https://doi.org/10.18653/v1/N16-3020
Geirhos R, Jacobsen J-H, Michaelis C, Zemel R, Brendel W, Bethge M, Wichmann FA (2020) Shortcut learning in deep neural networks. Nature Mach Intell 2(11):665–673
Rosenthal S, Farra N, Nakov P (2017) SemEval-2017 Task 4: sentiment analysis in Twitter. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics, Vancouver, Canada, p 502–518. https://doi.org/10.18653/v1/S17-2088
Safaya A, Abdullatif M, Yuret D (2020) KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media. In: Proceedings of the Fourteenth Workshop on Semantic Evaluation. International Committee for Computational Linguistics, Barcelona (online), p 2054–2059. https://doi.org/10.18653/v1/2020.semeval-1.271
Sennrich R, Haddow B, Birch A (2015) Neural machine translation of rare words with subword units
Shekar BH, Dagnew G (2019) Grid search-based hyperparameter tuning and classification of microarray cancer data. In: 2019 Second International Conference on Advanced Computational and Communication Paradigms (ICACCP). IEEE, online, p 1–8
Shiha M, Ayvaz S (2017) The effects of emoji in sentiment analysis. Int J Comput Electr Eng (IJCEE) 9(1):360–369
Snoek J, Rippel O, Swersky K, Kiros R, Satish N, Sundaram N, Patwary M, Prabhat MR, Adams R (2015) Scalable Bayesian optimization using deep neural networks. In: International Conference on Machine Learning. PMLR, online, p 2171–2180
Soliman T-H, Elmasry MA, Hedar A, Doss MM (2014) Sentiment analysis of Arabic slang comments on Facebook. Int J Comput Technol 12(5):3470–3478
Song B, Pan C, Wang S, Luo Z (2021) DeepBlueAI at WANLP-EACL2021 task 2: a deep ensemble-based method for sarcasm and sentiment detection in Arabic. In: Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual), p 390–394. https://aclanthology.org/2021.wanlp-1.52
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, online, p 2818–2826
Taboada M, Brooke J, Tofiloski M, Voll K, Stede M (2011) Lexicon-based methods for sentiment analysis. Computat Linguist 37(2):267–307
Tenney I, Wexler J, Bastings J, Bolukbasi T, Coenen A, Gehrmann S, Jiang E, Pushkarna M, Radebaugh C, Reif E et al (2020) The language interpretability tool: extensible, interactive visualizations and analysis for NLP models
Utlu I, Yücesoy V, Koc A, Cukur T, Senel L-K (2018) Semantic structure and interpretability of word embeddings. IEEE/ACM Trans Audio, Speech Language Process 26(10):1769–1779
Wadhawan A (2021) AraBERT and Farasa segmentation based approach for sarcasm and sentiment detection in Arabic tweets
Wang J, Xu J, Wang X (2018) Combination of hyperband and Bayesian optimization for hyperparameter optimization in deep learning
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K et al (2016) Google’s neural machine translation system: bridging the gap between human and machine translation
Xue L, Gao M, Chen Z, Xiong C, Xu R (2021) Robustness evaluation of transformer-based form field extractors via form attacks
Zheng S, Jayasumana S, Romera-Paredes B, Vineet V, Su Z, Du D, Huang C, Torr PHS (2015) Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. online, p 1529–1537
Zhou Z-H, Wu J, Tang W (2002) Ensembling neural networks: many could be better than all. Artific Intell 137(1–2):239–263
Metadata
Title
An ensemble transformer-based model for Arabic sentiment analysis
Authors
Omar Mohamed
Aly M. Kassem
Ali Ashraf
Salma Jamal
Ensaf Hussein Mohamed
Publication date
01-12-2023
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2023
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-022-01009-0
