A Hybrid Data Mining Model to Predict Coronary Artery Disease Cases Using Non-Invasive Clinical Data

  • Transactional Processing Systems
Journal of Medical Systems

Abstract

Coronary artery disease (CAD) is caused by atherosclerosis in the coronary arteries and can result in cardiac arrest and heart attack. Angiography, the method used to diagnose CAD, is costly, time-consuming, highly technical, and invasive. Researchers have therefore sought alternative methods, such as machine learning algorithms, that use non-invasive clinical data to diagnose the disease and assess its severity. In this study, we present a novel hybrid method for CAD diagnosis that includes risk factor identification using correlation-based feature subset (CFS) selection with a particle swarm optimization (PSO) search method and the K-means clustering algorithm. Supervised learning algorithms such as multi-layer perceptron (MLP), multinomial logistic regression (MLR), fuzzy unordered rule induction algorithm (FURIA), and C4.5 are then used to model CAD cases. We tested this approach on clinical data consisting of 26 features and 335 instances collected at the Department of Cardiology, Indira Gandhi Medical College, Shimla, India. MLR achieved the highest prediction accuracy, 88.4 %. We also tested the approach on the benchmark Cleveland heart disease data, where MLR again outperformed the other techniques. The proposed hybrid model improves the accuracy of the classification algorithms by 8.3 % to 11.4 % on the Cleveland data. The proposed method is therefore a promising tool for identifying CAD patients with improved prediction accuracy.
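To make the feature selection step concrete, the following is a minimal sketch of the CFS merit score, which rewards subsets whose features correlate with the class but not with each other. It is not the authors' implementation. Several things here are assumptions for illustration: Pearson correlation is used as the correlation measure, a simple greedy forward search stands in for the PSO search named in the abstract, the data are synthetic, and all function names (`pearson`, `cfs_merit`, `greedy_cfs`, `make_toy_data`) are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, target, subset):
    """CFS merit of a subset: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean absolute feature-class correlation and r_ff the mean
    absolute feature-feature correlation (Pearson here, as an assumption).
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for a, i in enumerate(subset) for j in subset[a + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, target):
    """Greedy forward search over subsets (a stand-in for PSO, not PSO itself)."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        # Try adding each remaining feature and keep the one with the best merit.
        score, feat = max(
            (cfs_merit(columns, target, selected + [f]), f) for f in remaining
        )
        if score <= best:
            break  # no candidate improves the merit, so stop
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


def make_toy_data(seed=0, n=400):
    """Synthetic data: features 0 and 1 are noisy views of the label, feature 2 is noise."""
    rng = random.Random(seed)
    target = [rng.randint(0, 1) for _ in range(n)]
    f0 = [t + rng.gauss(0, 0.5) for t in target]
    f1 = [t + rng.gauss(0, 0.5) for t in target]
    f2 = [rng.gauss(0, 1) for _ in range(n)]
    return [f0, f1, f2], target


if __name__ == "__main__":
    columns, target = make_toy_data()
    selected, merit = greedy_cfs(columns, target)
    print("selected features:", sorted(selected), "merit:", round(merit, 3))
```

In a pipeline like the one described, the selected columns would then be passed to the clustering and classification stages. A PSO search would explore the same merit function over subsets encoded as particle positions rather than adding one feature at a time.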

Author information

Corresponding author

Correspondence to Sangeet Srivastava.

Additional information

This article is part of the Topical Collection on Transactional Processing Systems

About this article

Cite this article

Verma, L., Srivastava, S. & Negi, P.C. A Hybrid Data Mining Model to Predict Coronary Artery Disease Cases Using Non-Invasive Clinical Data. J Med Syst 40, 178 (2016). https://doi.org/10.1007/s10916-016-0536-z

  • DOI: https://doi.org/10.1007/s10916-016-0536-z
