Published in: Neural Processing Letters 3/2023

23-01-2023

Improving the Polarity of Text through word2vec Embedding for Primary Classical Arabic Sentiment Analysis

Authors: Nour Elhouda Aoumeur, Zhiyong Li, Eissa M. Alshari

Abstract

Over the past decade, sentiment analysis has attracted significant attention from researchers. Despite the large number of studies in this field, sentiment analysis of authors' books (classical Arabic) based on extracted embedding features has not yet been done. Recent feature extraction for Arabic text depends on the frequency of words within the corpus and does not capture the relations between those words. This paper aims to create a new classical Arabic dataset, CASAD, from many art books by collecting sentences from several stories and having them labeled by human experts. Additionally, features are extracted from this dataset with word-embedding techniques such as Word2vec, which can capture the deep relations, that is, the meaning-bearing features, of the formal Arabic language. These features are evaluated with several machine-learning classifiers for classical Arabic: support vector machines (SVM), Logistic Regression (LR), Naive Bayes (NB), K-Nearest Neighbors (KNN), Latent Dirichlet Allocation (LDA), and Classification And Regression Trees (CART). Moreover, statistical methods such as validation and reliability analysis are applied to evaluate the dataset's labels. Finally, our experiments evaluated the classification rate of the feature-extraction matrices for two and three classes using the six machine-learning algorithms with tenfold cross-validation; they showed that Logistic Regression with Word2Vec is the most accurate approach for predicting topic-polarity occurrence.
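
The pipeline the abstract describes (embedding features, a classifier, tenfold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation. The table `TOY_VECTORS`, the English toy tokens, the helper names, and the hyperparameters are all hypothetical stand-ins: a real experiment would use vectors from a Word2Vec model trained on the Arabic corpus and a library classifier. Only the two-class case is shown.

```python
import math
import random

# Hypothetical 2-d embedding table standing in for trained Word2Vec vectors.
TOY_VECTORS = {
    "good": [0.9, 0.1], "great": [0.8, 0.2], "happy": [0.7, 0.0],
    "bad": [-0.9, 0.1], "sad": [-0.8, 0.2], "awful": [-0.7, 0.0],
    "story": [0.0, 0.5], "book": [0.0, 0.4],
}

def sentence_vector(tokens, vectors, dim=2):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Binary logistic regression fitted by batch gradient descent."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds; assumes len(X) >= k."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, b = train_logreg([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, b, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Toy labeled sentences: 1 = positive polarity, 0 = negative polarity.
POSITIVE = [["good", "story"], ["great", "book"], ["happy", "story"],
            ["good", "book"], ["great", "story"], ["happy", "book"],
            ["good", "great"], ["happy", "good"], ["great", "happy"],
            ["good", "great", "story"]]
NEGATIVE = [["bad", "story"], ["sad", "book"], ["awful", "story"],
            ["bad", "book"], ["sad", "story"], ["awful", "book"],
            ["bad", "sad"], ["awful", "bad"], ["sad", "awful"],
            ["bad", "sad", "story"]]

X = [sentence_vector(s, TOY_VECTORS) for s in POSITIVE + NEGATIVE]
y = [1] * len(POSITIVE) + [0] * len(NEGATIVE)
accuracy = cross_val_accuracy(X, y, k=10)
print(f"mean tenfold accuracy: {accuracy:.2f}")
```

Averaging word vectors is only one simple way to turn word embeddings into a sentence feature, and the abstract does not say which aggregation the paper uses. The three-class setting would need a multiclass classifier in place of the binary one sketched here.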

Metadata
Title
Improving the Polarity of Text through word2vec Embedding for Primary Classical Arabic Sentiment Analysis
Authors
Nour Elhouda Aoumeur
Zhiyong Li
Eissa M. Alshari
Publication date
23-01-2023
Publisher
Springer US
Published in
Neural Processing Letters / Issue 3/2023
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-022-11111-1