
2024 | OriginalPaper | Book chapter

Ensemble Classification with Lazy Predict on Three Diabetes Datasets: A Comparative Study with Resampling Techniques

Written by: Afshan Hashmi, Md Tabrez Nafis, Sameena Naaz, Imran Hussain

Published in: Proceedings of Third International Conference on Computing and Communication Networks

Publisher: Springer Nature Singapore

Abstract

Diabetes mellitus is a chronic illness that affects millions of people worldwide. Effective diabetes care and the avoidance of complications depend on early prediction and diagnosis. In this study, we present a thorough analysis of diabetes prediction on three distinct datasets: the PIMA India dataset, the NHANES dataset, and Mendeley's diabetes dataset. Lazy Predict enables us to efficiently evaluate a wide range of classifiers on each dataset, providing valuable insights into model performance. The top-performing model on each dataset is selected as the best individual model. Furthermore, ensembles are created by combining the predictions of the top ten models, both without resampling and with resampling techniques. Random forest achieved the highest accuracy of 79% on the PIMA dataset, XGB achieved the highest accuracy of 99% on Mendeley's dataset, and the dummy classifier attained the highest accuracy of 88% on the NHANES dataset. Surprisingly, the ensembles without oversampling consistently outperformed their resampled counterparts and exhibited the highest accuracy overall, challenging the common notion that resampling always leads to improved performance.
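The abstract does not say how the top ten predictions are combined or which resampling method is used. As a rough illustration only, the sketch below assumes hard (majority) voting and random oversampling of the minority class, written with the Python standard library instead of Lazy Predict or any real classifier. The function names `majority_vote`, `random_oversample`, and `accuracy`, along with the toy labels and predictions, are hypothetical and are not taken from the chapter.

```python
import random
from collections import Counter


def majority_vote(model_predictions):
    """Hard-voting ensemble (assumed scheme): most common label per sample.

    model_predictions holds one list of predicted labels per model.
    An odd number of models avoids ties for binary labels.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]


def random_oversample(features, labels, seed=0):
    """Random oversampling (assumed method): duplicate randomly chosen
    samples of each smaller class until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for label, count in counts.items():
        pool = [x for x, y in zip(features, labels) if y == label]
        for _ in range(target - count):
            out_x.append(rng.choice(pool))
            out_y.append(label)
    return out_x, out_y


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: four negative and two positive cases (imbalanced).
y_true = [0, 0, 0, 0, 1, 1]
features = [[1], [2], [3], [4], [5], [6]]

# Hypothetical predictions from three already-trained models.
preds = [
    [0, 0, 0, 1, 1, 0],  # model A
    [0, 0, 1, 0, 1, 1],  # model B
    [0, 1, 0, 0, 0, 1],  # model C
]

ensemble = majority_vote(preds)
ensemble_accuracy = accuracy(y_true, ensemble)

# Balance the classes before (re)training models on the resampled data.
balanced_x, balanced_y = random_oversample(features, y_true)
balanced_counts = Counter(balanced_y)
```

In this toy case each model makes errors on different samples, so the vote corrects them. In a study like the one described, the same models would be trained once on the original training split and once on the oversampled split, with both ensembles scored on an untouched test set.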

Metadata
Title
Ensemble Classification with Lazy Predict on Three Diabetes Datasets: A Comparative Study with Resampling Techniques
Written by
Afshan Hashmi
Md Tabrez Nafis
Sameena Naaz
Imran Hussain
Copyright year
2024
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-97-0892-5_37