Random Survival Forests Models for SME Credit Risk Measurement

Methodology and Computing in Applied Probability

Abstract

This paper extends the empirical literature on credit default risk for Small and Medium Enterprises (SMEs). We propose a non-parametric approach based on Random Survival Forests (RSF) and compare its performance with that of a standard logit model. To the authors’ knowledge, no previous study of SME credit default risk has used a variety of statistical methodologies to test the reliability of their predictions and to compare their performance against one another. In-sample, our non-parametric model performs much better than the classical logit model. Out-of-sample, the evidence is reversed: the logit model performs better than the RSF model. We explain this evidence by showing how error in the estimates of default probabilities can affect classification error when the estimates are used in a classification rule.
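The last point of the abstract, that probability estimation error and classification error need not move together, can be illustrated with a toy sketch. This is not the authors' analysis: the 0.5 cut-off, the four "true" default probabilities, and both sets of estimates are hypothetical numbers chosen only to make the effect visible.

```python
# Toy illustration (all numbers hypothetical): a large error in estimated
# default probabilities need not change classifications, while a small
# error can flip them if it crosses the decision threshold.

THRESHOLD = 0.5  # assumed cut-off of the classification rule


def mse(true_p, est_p):
    """Mean squared error of the probability estimates."""
    return sum((t - e) ** 2 for t, e in zip(true_p, est_p)) / len(true_p)


def decision_disagreement(true_p, est_p, threshold=THRESHOLD):
    """Share of firms whose classification differs from the one
    obtained by applying the same rule to the true probabilities."""
    flips = sum((t > threshold) != (e > threshold)
                for t, e in zip(true_p, est_p))
    return flips / len(true_p)


# Hypothetical true default probabilities for four firms.
true_p = [0.40, 0.45, 0.55, 0.60]

# Estimates A: far from the truth, but always on the correct side of 0.5.
est_a = [0.10, 0.15, 0.85, 0.90]

# Estimates B: close to the truth, but always on the wrong side of 0.5.
est_b = [0.52, 0.52, 0.48, 0.48]

mse_a, mse_b = mse(true_p, est_a), mse(true_p, est_b)
flip_a = decision_disagreement(true_p, est_a)
flip_b = decision_disagreement(true_p, est_b)

print(f"A: MSE={mse_a:.4f}, decisions changed={flip_a:.0%}")
print(f"B: MSE={mse_b:.4f}, decisions changed={flip_b:.0%}")
```

In this sketch, estimates A have the larger probability error yet leave every classification unchanged, while estimates B have the smaller probability error yet flip every classification. A model can therefore rank well on one criterion and poorly on the other.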

Author information

Corresponding author

Correspondence to Silvia Figini.

About this article

Cite this article

Fantazzini, D., Figini, S. Random Survival Forests Models for SME Credit Risk Measurement. Methodol Comput Appl Probab 11, 29–45 (2009). https://doi.org/10.1007/s11009-008-9078-2

  • DOI: https://doi.org/10.1007/s11009-008-9078-2
