Published in: The Journal of Supercomputing 6/2021

12-11-2020

An application of MOGW optimization for feature selection in text classification

Authors: Razieh Asgarnezhad, S. Amirhassan Monadjemi, Mohammadreza Soltanaghaei

Abstract

Due to the widespread use of web applications, sentiment classification (SC) has become a topic of interest among text mining experts. The sheer volume of online reviews hinders the application of effective models in companies and in individual decision making. Pre-processing contributes greatly to sentiment classification. Traditional bag-of-words approaches do not capture multiple relationships among words. This study emphasizes the pre-processing stage and data reduction techniques, which can make a substantial difference in sentiment classification efficiency. To classify opinions, a multi-objective grey wolf optimization (MOGW) algorithm is proposed in which the two objectives are to reduce the errors of the Naïve Bayes and K-nearest neighbour classifiers, with a neural network serving as the final classifier. The proposed framework is evaluated on three datasets. It obtains 95.76% precision, 95.75% accuracy, 95.99% recall, and 95.82% F-measure, outperforming its counterparts.
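
As a minimal sketch of what a two-objective formulation implies, assume each candidate feature subset is scored by a pair (Naïve Bayes error, K-nearest neighbour error) and that both errors are to be minimized. Under that assumption, the candidates worth keeping are those not dominated by any other candidate. The names `dominates` and `pareto_front` and the error values below are hypothetical illustrations, not taken from the paper, and the grey wolf position update itself is not shown.

```python
from typing import List, Tuple

# Assumed scoring: (naive_bayes_error, knn_error), both to be minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Objectives]) -> List[int]:
    """Indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical error pairs for four candidate feature subsets.
candidate_errors = [(0.10, 0.20), (0.15, 0.12), (0.12, 0.25), (0.20, 0.30)]
front = pareto_front(candidate_errors)
print(front)
```

Here the first two candidates trade one error against the other, so neither dominates, while the last two are each worse than the first candidate on both objectives. A multi-objective optimizer typically keeps such a non-dominated set rather than a single best solution.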

Metadata
Title
An application of MOGW optimization for feature selection in text classification
Authors
Razieh Asgarnezhad
S. Amirhassan Monadjemi
Mohammadreza Soltanaghaei
Publication date
12-11-2020
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 6/2021
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-020-03490-w
