Abstract
This research introduces a hybrid feature selection model for classifying different heart disease (HD) datasets. First, a filter method, specifically ANOVA, is used to select the relevant features from the original feature set. Next, an evolutionary wrapper-based method using whale optimization (WO) is proposed to find the optimal feature subset among the features retained by the filter. WO is applied in three stages. First, the whale algorithm evaluates all features and discards the less significant half. Second, mutual congestion is used to prioritize and rank the remaining features. Third, WO identifies the 10 best features using forward feature selection. Support vector machine, K-nearest neighbor, and Naïve Bayes classifiers are used to evaluate the optimal feature set on coronary disease data. The ANOVA-WO technique is tested on binary and multi-class HD datasets to provide a complete analysis. The results show that the proposed approach achieves better classification accuracy with considerably fewer features than the benchmark schemes.
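The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the wrapper stage here is a plain greedy forward selection scored by a leave-one-out 1-nearest-neighbor classifier, swapped in as a simple stand-in for the whale optimization search, and the function names (`anova_f`, `anova_filter`, `loo_1nn_accuracy`, `forward_select`) and the toy data are hypothetical.

```python
from statistics import mean

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def anova_filter(X, y, keep):
    """Filter stage: indices of the `keep` features with the highest F."""
    scores = [anova_f([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]

def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `feats`."""
    hits = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in feats),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)

def forward_select(X, y, candidates, max_feats):
    """Wrapper stage (assumption: greedy forward search, NOT whale optimization)."""
    chosen, best = [], 0.0
    while len(chosen) < max_feats:
        scored = [(loo_1nn_accuracy(X, y, chosen + [f]), f)
                  for f in candidates if f not in chosen]
        if not scored:
            break
        acc, f = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best:  # stop when no candidate improves accuracy
            break
        chosen.append(f)
        best = acc
    return chosen, best

# Made-up toy data: 6 samples, 4 features; only feature 0 separates the classes.
X = [
    [1.0, 5.0, 0.3, 2.0],
    [1.2, 4.0, 0.1, 2.1],
    [0.9, 6.0, 0.4, 1.9],
    [3.0, 5.5, 0.2, 2.0],
    [3.1, 4.5, 0.3, 2.2],
    [2.9, 5.0, 0.1, 1.8],
]
y = [0, 0, 0, 1, 1, 1]

kept = anova_filter(X, y, keep=len(X[0]) // 2)  # drop the weaker half
selected, accuracy = forward_select(X, y, kept, max_feats=10)
print(kept, selected, accuracy)
```

The filter stage is cheap because it scores each feature independently, while the wrapper stage is costlier because every candidate subset requires a classifier evaluation; running the filter first shrinks the search space the wrapper has to explore.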
Cite this article
Moorthy, U., Gandhi, U.D. A novel optimal feature selection technique for medical data classification using ANOVA based whale optimization. J Ambient Intell Human Comput 12, 3527–3538 (2021). https://doi.org/10.1007/s12652-020-02592-w
DOI: https://doi.org/10.1007/s12652-020-02592-w