Abstract
This paper extends the existing literature on empirical research in the field of credit risk default for Small and Medium Enterprises (SMEs). We propose a non-parametric approach based on Random Survival Forests (RSF) and compare its performance with that of a standard logit model. To the authors’ knowledge, no studies in the area of credit risk default for SMEs have used a variety of statistical methodologies to test the reliability of their predictions and to compare their performance against one another. In sample, we find that our non-parametric model performs much better than the classical logit model. Out of sample, the evidence is just the opposite: the logit model performs better than the RSF model. We explain this evidence by showing how error in the estimates of default probabilities can affect classification error when the estimates are used in a classification rule.
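The last point of the abstract can be illustrated with a toy simulation. The sketch below is not the paper's data or method: the uniform distribution of true default probabilities, the 0.5 cut-off, the noise level, and all function names are assumptions chosen for illustration. It suggests that an estimator with a larger probability error can still yield a lower classification error, as long as its estimates stay on the correct side of the cut-off.

```python
import random


def mean_squared_error(p_true, p_hat):
    """Average squared gap between true and estimated default probabilities."""
    return sum((p - q) ** 2 for p, q in zip(p_true, p_hat)) / len(p_true)


def expected_error_rate(p_true, p_hat, threshold=0.5):
    """Expected 0/1 error of the rule 'predict default if p_hat > threshold'."""
    total = 0.0
    for p, q in zip(p_true, p_hat):
        # Predicting default is wrong with probability 1 - p,
        # predicting non-default is wrong with probability p.
        total += (1.0 - p) if q > threshold else p
    return total / len(p_true)


rng = random.Random(0)  # fixed seed so the run is reproducible

# Hypothetical true default probabilities (assumption: uniform on [0, 1]).
p_true = [rng.uniform(0.0, 1.0) for _ in range(10_000)]

# Estimator A: pushes every estimate to 0 or 1. The probability error is
# large, but no estimate ever crosses the cut-off in the wrong direction.
p_extreme = [1.0 if p > 0.5 else 0.0 for p in p_true]

# Estimator B: small Gaussian noise around the truth. The probability error
# is small, but estimates close to the cut-off can flip sides.
p_noisy = [p + rng.gauss(0.0, 0.1) for p in p_true]

for name, p_hat in [("truth", p_true), ("extreme", p_extreme), ("noisy", p_noisy)]:
    mse = mean_squared_error(p_true, p_hat)
    err = expected_error_rate(p_true, p_hat)
    print(f"{name:8s} MSE={mse:.4f} expected error rate={err:.4f}")
```

In this setup the extreme estimator makes exactly the same decisions as the true probabilities, so its classification error equals the best achievable one despite its larger squared error, while the noisy estimator can only match or exceed that error.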
Cite this article
Fantazzini, D., Figini, S. Random Survival Forests Models for SME Credit Risk Measurement. Methodol Comput Appl Probab 11, 29–45 (2009). https://doi.org/10.1007/s11009-008-9078-2
DOI: https://doi.org/10.1007/s11009-008-9078-2