Abstract
Predicting the outcome of kidney transplantation is clinically important: it supports selecting the best available kidney donor and the most suitable immunosuppressive treatment for each patient. Predicting survival before treatment could simplify patients' decision making and improve survival by changing clinical practice. This paper proposes a novel prediction method based on data mining techniques to predict five-year graft survival after transplantation. The method consists of three stages: a data preparation stage (DPS), a feature selection stage (FSS), and a prediction stage (PS). It combines information gain with naïve Bayes and k-nearest neighbor. First, information gain selects the essential features; naïve Bayes then narrows these to the most essential ones. Together, the two form a hybrid feature selection method that chooses the minimum number of features that yields the highest accuracy. Finally, k-nearest neighbor classifies graft survival. The proposed method has been evaluated against recent techniques. Experimental results show that it outperforms them, attaining the highest accuracy and F-measure with the lowest error. The method can also be applied to other transplant datasets.
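The feature selection and prediction stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how information gain and naïve Bayes are combined, so the sketch assumes a filter-then-wrapper scheme (rank features by information gain, then keep the shortest ranked prefix on which naïve Bayes scores best). The toy records, the function names, leave-one-out evaluation, add-one smoothing, Hamming distance, and k = 3 are all assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    # Class entropy minus the entropy remaining after splitting on the feature.
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def nb_predict(train_X, train_y, row, feats):
    # Categorical naive Bayes with add-one smoothing (an assumed choice).
    best_label, best_score = None, -math.inf
    for label, n_c in Counter(train_y).items():
        score = math.log(n_c / len(train_y))
        for f in feats:
            n_values = len({r[f] for r in train_X})
            match = sum(1 for r, t in zip(train_X, train_y)
                        if t == label and r[f] == row[f])
            score += math.log((match + 1) / (n_c + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def knn_predict(train_X, train_y, row, feats, k=3):
    # k-nearest neighbor using Hamming distance over the selected features.
    dists = sorted((sum(r[f] != row[f] for f in feats), i)
                   for i, r in enumerate(train_X))
    votes = Counter(train_y[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(predict, X, y, feats):
    # Leave-one-out accuracy (an assumed evaluation protocol).
    hits = 0
    for i in range(len(X)):
        hits += predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats) == y[i]
    return hits / len(X)

def hybrid_select(X, y):
    # Filter step: keep features with positive information gain, best first.
    gains = {f: information_gain([r[f] for r in X], y) for f in range(len(X[0]))}
    ranked = sorted((f for f in gains if gains[f] > 0.0), key=lambda f: -gains[f])
    # Wrapper step: the strict '>' keeps the smallest prefix with top accuracy.
    best_feats, best_acc = [], -1.0
    for k in range(1, len(ranked) + 1):
        acc = loo_accuracy(nb_predict, X, y, ranked[:k])
        if acc > best_acc:
            best_feats, best_acc = ranked[:k], acc
    return best_feats, best_acc

# Hypothetical toy records (not from the paper): three categorical features each.
X = [("a", "x", "c"), ("a", "x", "c"), ("a", "x", "c"), ("a", "y", "c"),
     ("b", "x", "c"), ("b", "y", "c"), ("b", "y", "c"), ("b", "y", "c")]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = graft survived, 0 = graft failed

selected, nb_acc = hybrid_select(X, y)
knn_acc = loo_accuracy(knn_predict, X, y, selected)
print("selected features:", selected, "kNN accuracy:", knn_acc)
```

On this toy data the constant third feature is dropped by the filter step, and the weakly informative second feature is dropped by the wrapper step because adding it does not raise accuracy. Real transplant data would also need the data preparation stage (handling missing values and encoding), which the sketch omits.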
About this article
Cite this article
Atallah, D.M., Badawy, M., El-Sayed, A. et al. Predicting kidney transplantation outcome based on hybrid feature selection and KNN classifier. Multimed Tools Appl 78, 20383–20407 (2019). https://doi.org/10.1007/s11042-019-7370-5
DOI: https://doi.org/10.1007/s11042-019-7370-5