
A novel optimal feature selection technique for medical data classification using ANOVA based whale optimization

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

This research introduces a hybrid feature selection model for classifying different heart disease (HD) datasets. First, a filter method, analysis of variance (ANOVA), selects the relevant features from the original feature set. Next, an evolutionary wrapper-based method using whale optimization (WO) is proposed to find the optimal feature subset among the features retained by the filter. WO is applied in three stages. First, the whale algorithm evaluates the full feature set and discards the less significant half of the features. Second, mutual congestion is used to prioritize and rank the remaining features. Third, WO determines the 10 best features using forward feature selection. The support vector machine, K-nearest neighbor, and Naïve Bayes classifiers are used in selecting the optimal feature set for the heart disease data. The ANOVA-WO technique is tested on binary and multi-class HD datasets to provide a complete analysis. The results show that the proposed approach achieves better classification accuracy with considerably fewer features than the benchmark methods.
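The filter stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, which is not shown on this page; it only assumes the standard one-way ANOVA F-statistic (between-class variance divided by within-class variance) computed per feature, followed by keeping the top-scoring features. The function names `anova_f_score` and `select_top_k` and the toy data are hypothetical, chosen for illustration.

```python
from collections import defaultdict


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    # Group the feature values by class label.
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    n_total = len(values)
    n_groups = len(groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(values) / n_total
    ss_between = 0.0  # variation of class means around the grand mean
    ss_within = 0.0   # variation of samples around their own class mean
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)
    if ss_within == 0.0:
        # Perfectly separated (or constant) feature: avoid division by zero.
        return float("inf") if ss_between > 0.0 else 0.0
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))


def select_top_k(rows, labels, k):
    """Return indices of the k features with the largest F-scores, best first."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Toy data, made up for illustration: feature 0 separates the two classes,
# feature 1 has identical class means and carries no class information.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [7.0, 4.0], [8.0, 5.0], [9.0, 3.0]]
labels = [0, 0, 0, 1, 1, 1]
print(select_top_k(rows, labels, k=1))  # prints [0]
```

In a pipeline like the one the abstract outlines, the indices returned by such a filter would define the reduced feature set that a wrapper search (here, whale optimization scored by a classifier) then explores; that wrapper stage is not sketched here.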

Author information

Corresponding author

Correspondence to Usha Devi Gandhi.

About this article

Cite this article

Moorthy, U., Gandhi, U.D. A novel optimal feature selection technique for medical data classification using ANOVA based whale optimization. J Ambient Intell Human Comput 12, 3527–3538 (2021). https://doi.org/10.1007/s12652-020-02592-w

  • DOI: https://doi.org/10.1007/s12652-020-02592-w
