BagMOOV: A novel ensemble for heart disease prediction bootstrap aggregation with multi-objective optimized voting

  • Technical Paper
  • Australasian Physical & Engineering Sciences in Medicine

Abstract

Conventional clinical decision support systems rely on individual classifiers, or on simple combinations of them, and tend to show only moderate performance. This paper presents a novel classifier ensemble framework for the prediction and analysis of heart disease, based on an enhanced bagging approach with a multi-objective weighted voting scheme. The proposed model overcomes these performance limitations by using an ensemble of five heterogeneous classifiers: Naïve Bayes, linear regression, quadratic discriminant analysis, an instance-based learner and support vector machines. Five datasets, obtained from publicly available data repositories, are used for experimentation, evaluation and validation. The effectiveness of the proposed ensemble is investigated by comparing its results with those of several other classifiers. Prediction results are assessed by ten-fold cross-validation and ANOVA statistics. The experimental evaluation shows that the proposed framework handles all types of attributes and achieves a high diagnosis accuracy of 84.16 %, with 93.29 % sensitivity, 96.70 % specificity and 82.15 % F-measure. An F-ratio higher than F-critical and a p value below 0.05 at the 95 % confidence level indicate that the results are highly statistically significant for most of the datasets.
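As a rough illustration of the ideas named in the abstract, the following minimal Python sketch shows a bootstrap replicate, a weighted vote over base-classifier predictions, and the four reported evaluation measures. It is not the authors' implementation: the function names, the example weights and the binary label encoding are assumptions made for illustration, and the multi-objective optimization that the paper uses to derive its voting weights is not reproduced here.

```python
import random
from collections import defaultdict


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement (one bagging replicate)."""
    return [rng.choice(rows) for _ in rows]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the most weight.

    predictions: one predicted label per base classifier.
    weights: one non-negative weight per base classifier. These values are
    hypothetical inputs; how the paper optimizes them is not modelled here.
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(bootstrap_sample(["a", "b", "c", "d"], rng))
    # Three hypothetical classifiers: the first one outweighs the other two.
    print(weighted_vote([1, 0, 0], [0.6, 0.2, 0.3]))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a full bagging ensemble, each base classifier would be trained on its own bootstrap replicate and `weighted_vote` would be applied to their predictions for every test instance. With equal weights the vote reduces to a plain majority vote.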

Notes

  1. <http://archive.ics.uci.edu/ml/datasets.html> [Last accessed: Sep 25 2013].

  2. <http://archive.ics.uci.edu/ml/datasets.html> [Last accessed: Sep 25 2013].

  3. <http://en.wikipedia.org/wiki/Rawalpindi_Institute_of_Cardiology> [Last accessed: Dec 8 2014].

Acknowledgments

We are grateful to the Rawalpindi Institute of Cardiology for supporting the use of the proposed decision support system (DSS), for research purposes only, under the strict supervision of a team of medical experts and the institute's information technology team.

Author information

Corresponding author

Correspondence to Farhan Hassan Khan.

Appendix

See Table 14.

Cite this article

Bashir, S., Qamar, U. & Khan, F.H. BagMOOV: A novel ensemble for heart disease prediction bootstrap aggregation with multi-objective optimized voting. Australas Phys Eng Sci Med 38, 305–323 (2015). https://doi.org/10.1007/s13246-015-0337-6

  • DOI: https://doi.org/10.1007/s13246-015-0337-6
