Soft Computing

11-03-2019 | Focus

WOA + BRNN: An imbalanced big data classification framework using Whale optimization and deep neural network

Authors: Eslam. M. Hassib, Ali. I. El-Desouky, Labib. M. Labib, El-Sayed M. El-kenawy

Published in: Soft Computing | Issue 8/2020

Abstract

Big data now plays a substantial part in information and knowledge analysis, manipulation, and forecasting. Analyzing and extracting knowledge from such large datasets is very challenging because of imbalanced data distributions, which can lead to biased classification results and wrong decisions. Standard classifiers are not capable of handling such datasets, so a new technique is required. This paper proposes a novel classification framework for big data that consists of three phases. The first is the feature selection phase, which uses the Whale optimization algorithm (WOA) to find the best set of features. The second is the preprocessing phase, which uses the SMOTE algorithm and the LSH-SMOTE algorithm to address the class imbalance problem. The third is the WOA + BRNN algorithm, which uses the Whale optimization algorithm to train a deep learning model, the bidirectional recurrent neural network (BRNN), for the first time. The proposed WOA + BRNN algorithm was tested in terms of area under the curve (AUC) on nine highly imbalanced datasets, one of which is a big dataset, against four of the most commonly used machine learning algorithms (Naïve Bayes, AdaBoostM1, decision table, random tree) and against GWO-MLP (a multilayer perceptron trained with the Grey Wolf Optimizer). It was then tested in terms of classification accuracy on four well-known datasets against GWO-MLP, particle swarm optimization (PSO-MLP), genetic algorithm (GA-MLP), ant colony optimization (ACO-MLP), evolution strategy (ES-MLP), and population-based incremental learning (PBIL-MLP). Experimental results showed that the proposed WOA + BRNN algorithm achieved promising accuracy and high local optima avoidance, and outperformed the four machine learning algorithms and GWO-MLP in terms of AUC.
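To make the optimizer behind the first and third phases concrete, the following is a minimal sketch of the standard Whale optimization update rules (shrinking encirclement, random search, and spiral movement), applied to a toy objective. It is not the authors' implementation: the function names `woa_minimize` and `sphere`, the population size, the iteration count, and the bound clipping are all assumptions made for illustration. The sketch also draws the coefficients `A` and `C` once per whale and updates the best solution immediately, which are simplifications. In a setting like the one the abstract describes, the objective would presumably score a candidate feature subset or a vector of network weights, which is likewise an assumption here.

```python
import math
import random


def sphere(x):
    """Toy objective (hypothetical stand-in for a classification loss)."""
    return sum(v * v for v in x)


def woa_minimize(objective, dim, bounds, n_whales=20, n_iter=100, seed=0):
    """Minimal Whale optimization sketch; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the search bounds.
    whales = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_f = objective(best)
    b = 1.0  # constant that defines the shape of the logarithmic spiral

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 toward 0
        for i, x in enumerate(whales):
            A = 2.0 * a * rng.random() - a  # scalar per whale (simplification)
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - xj) for r, xj in zip(ref, x)]
            else:
                # Spiral movement toward the best whale.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(b * l) * math.cos(2.0 * math.pi * l)
                new = [abs(bj - xj) * spiral + bj for bj, xj in zip(best, x)]
            # Clip to the bounds (an assumption; other repairs are possible).
            whales[i] = [min(max(v, lo), hi) for v in new]
            f = objective(whales[i])
            if f < best_f:
                best, best_f = whales[i][:], f
    return best, best_f


if __name__ == "__main__":
    position, value = woa_minimize(sphere, dim=5, bounds=(-10.0, 10.0))
    print("best value found:", value)
```

Because the best solution is only replaced when a strictly better one is found, the returned value can never be worse than the best member of the initial population. The probability split between encircling and spiralling, together with the random-whale branch, is what gives the method its exploratory behaviour.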


Title: WOA + BRNN: An imbalanced big data classification framework using Whale optimization and deep neural network
Authors: Eslam. M. Hassib, Ali. I. El-Desouky, Labib. M. Labib, El-Sayed M. El-kenawy
Publication date: 11-03-2019
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing / Issue 8/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-019-03901-y
