Published in: Social Network Analysis and Mining 1/2024

01.12.2024

Multilingual, monolingual and mono-dialectal transfer learning for Moroccan Arabic sentiment classification

Authors: Naaima Boudad, Rdouan Faizi, Rachid Oulad Haj Thami


Abstract

Transfer learning has recently proven to be very powerful in diverse natural language processing (NLP) tasks such as machine translation, sentiment analysis, and question answering. In this work, we investigate the use of transfer learning (TL) for Dialectal Arabic sentiment classification. Our main objective is to enhance sentiment classification performance and to overcome the low-resource problem of Arabic dialects. To this end, we use Bidirectional Encoder Representations from Transformers (BERT) to transfer the contextual knowledge learned during language-model pre-training to sentiment classification. In particular, we use the multilingual models mBERT and XLM-RoBERTa; the Arabic-specific models AraBERT, MARBERT, QARIB, and CAMeLBERT; and the Moroccan-dialect-specific model DarijaBERT. After carrying out downstream fine-tuning experiments on several Moroccan sentiment analysis (SA) datasets, we found that TL significantly increases the performance of sentiment classification in Moroccan Arabic. Nevertheless, although the Arabic-specific models proved to perform much better than the multilingual and dialectal models, our experiments demonstrate that multilingual models can be more effective on texts characterized by extensive code-switching.
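To make the described setup concrete, below is a minimal sketch of downstream fine-tuning of a BERT-style encoder for sentiment classification using the Hugging Face transformers library. It is an illustration under stated assumptions, not the paper's exact pipeline: the checkpoint identifiers are the public hub IDs commonly associated with the models the abstract lists, and the two-example dataset is a hypothetical stand-in for a labeled Moroccan Arabic corpus.

```python
# Minimal fine-tuning sketch: pretrained BERT-style encoder + fresh
# classification head for binary sentiment classification.
# Checkpoint IDs and the toy dataset are illustrative assumptions only.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Any model compared in the study could be swapped in here, e.g.
# "bert-base-multilingual-cased" (mBERT), "xlm-roberta-base" (XLM-RoBERTa),
# or "SI2M-Lab/DarijaBERT" (commonly cited hub IDs, not confirmed by the paper).
MODEL_ID = "UBC-NLP/MARBERT"

# Hypothetical stand-in for a labeled Moroccan Arabic SA dataset (1 = positive).
train_ds = Dataset.from_dict({
    "text": ["placeholder positive Darija comment",
             "placeholder negative Darija comment"],
    "label": [1, 0],
})

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

def tokenize(batch):
    # Truncate/pad to a fixed length so examples batch cleanly.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

train_ds = train_ds.map(tokenize, batched=True)

# A randomly initialized classification head is placed on top of the
# pretrained encoder; all weights are updated during fine-tuning.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID,
                                                           num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sa-finetune",
                           num_train_epochs=3,
                           per_device_train_batch_size=16,
                           learning_rate=2e-5),
    train_dataset=train_ds,
)
trainer.train()
```

The same head-plus-encoder recipe applies to every checkpoint the study compares; only the pretrained weights (and possibly hyperparameters) change between runs, which is what makes a controlled comparison of multilingual, Arabic-specific, and dialect-specific models possible.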


References
Abdaoui A, Berrimi M, Oussalah M, Moussaoui A (2021) DziriBERT: a pre-trained language model for the Algerian dialect. arXiv preprint arXiv:2109.12346
Abdelali A, Hassan S, Mubarak H, Darwish K, Samih Y (2021) Pre-training BERT on Arabic tweets: practical considerations. arXiv preprint arXiv:2102.10684
Abdelfattah MF, Fakhr MW, Rizka MA (2023) ArSentBERT: fine-tuned bidirectional encoder representations from transformers model for Arabic sentiment classification. Bull Electr Eng Inform 12:1196–1202
Abdul-Mageed M, Elmadany A, Nagoudi EMB (2020) ARBERT & MARBERT: deep bidirectional transformers for Arabic. arXiv preprint arXiv:2101.01785
Alduailej A, Alothaim A (2022) AraXLNet: pre-trained language model for sentiment analysis of Arabic. J Big Data 9:1–21
Almaliki M, Almars AM, Gad I, Atlam E-S (2023) ABMM: Arabic BERT-mini model for hate-speech detection on social media. Electronics 12:1048
Ameri K, Hempel M, Sharif H, Lopez J Jr, Perumalla K (2021) CyBERT: cybersecurity claim classification by fine-tuning the BERT language model. J Cybersecurity Priv 1:615–637
Antit C, Mechti S, Faiz R (2022) TunRoBERTa: a Tunisian robustly optimized BERT approach model for sentiment analysis. Atlantis Press, Netherlands, pp 227–231
Boudad N, Faizi R, Thami ROH, Chiheb R (2017) Sentiment classification of Arabic tweets: a supervised approach. J Mob Multimed 13:233–243
Boudad N, Ezzahid S, Faizi R, Thami ROH (2019) Exploring the use of word embedding and deep learning in Arabic sentiment analysis. In: International conference on advanced intelligent systems for sustainable development, Springer, pp 243–253
Boujou E, Chataoui H, Mekki AE, Benjelloun S, Chairi I, Berrada I (2021) An open access NLP dataset for Arabic dialects: data collection, labeling, and model construction. arXiv preprint arXiv:2102.11000
Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A (2020) Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877–1901
Clark K, Luong M-T, Le QV, Manning CD (2020) ELECTRA: pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555
Conneau A, Khandelwal K, Goyal N, Chaudhary V, Wenzek G, Guzmán F, Grave E, Ott M, Zettlemoyer L, Stoyanov V (2019) Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116
de Vries W, van Cranenburgh A, Bisazza A, Caselli T, van Noord G, Nissim M (2019) BERTje: a Dutch BERT model. arXiv preprint arXiv:1912.09582
Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
Dodge J, Ilharco G, Schwartz R, Farhadi A, Hajishirzi H, Smith N (2020) Fine-tuning pretrained language models: weight initializations, data orders, and early stopping. arXiv preprint arXiv:2002.06305
Elouardighi A, Maghfour M, Hammia H (2017) Collecting and processing Arabic Facebook comments for sentiment analysis. Springer, Berlin, pp 262–274
Garouani M, Kharroubi J (2021) MAC: an open and free Moroccan Arabic corpus for sentiment analysis. In: Proceedings of the international conference on smart city applications, Springer, pp 849–858
Garouani M, Chrita H, Kharroubi J (2021) Sentiment analysis of Moroccan tweets using text mining
Ghaddar A, Wu Y, Rashid A, Bibi K, Rezagholizadeh M, Xing C, Wang Y, Xinyu D, Wang Z, Huai B (2021) JABER: junior Arabic BERT. arXiv preprint arXiv:2112.04329
Inoue G, Alhafni B, Baimukan N, Bouamor H, Habash N (2021) The interplay of variant, size, and task type in Arabic pre-trained language models. arXiv preprint arXiv:2103.06678
Lan W, Chen Y, Xu W, Ritter A (2020) An empirical study of pre-trained transformers for Arabic information extraction. arXiv preprint arXiv:2004.14519
Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L (2019) BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461
Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692
Martin L, Muller B, Suárez PJO, Dupont Y, Romary L, de La Clergerie ÉV, Seddah D, Sagot B (2019) CamemBERT: a tasty French language model. arXiv preprint arXiv:1911.03894
Messaoudi A, Cheikhrouhou A, Haddad H, Ferchichi N, BenHajhmida M, Korched A, Naski M, Ghriss F, Kerkeni A (2021) TunBERT: pretrained contextualized text representation for Tunisian dialect. arXiv preprint arXiv:2111.13138
Mohamed O, Kassem AM, Ashraf A, Jamal S, Mohamed EH (2022) An ensemble transformer-based model for Arabic sentiment analysis. Soc Netw Anal Min 13:11
Oussous A, Benjelloun F-Z, Lahcen AA, Belfkih S (2020) ASA: a framework for Arabic sentiment analysis. J Inf Sci 46:544–559
Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners. OpenAI Blog 1:9
Safaya A, Abdullatif M, Yuret D (2020) KUISAIL at SemEval-2020 task 12: BERT-CNN for offensive speech identification in social media, pp 2054–2059
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30
Metadata
Title
Multilingual, monolingual and mono-dialectal transfer learning for Moroccan Arabic sentiment classification
Authors
Naaima Boudad
Rdouan Faizi
Rachid Oulad Haj Thami
Publication date
01.12.2024
Publisher
Springer Vienna
Published in
Social Network Analysis and Mining / Issue 1/2024
Print ISSN: 1869-5450
Electronic ISSN: 1869-5469
DOI
https://doi.org/10.1007/s13278-023-01159-9
