Comparison procedure of predicting the time to default in behavioural scoring
Introduction
Credit scoring is a process of determining how likely applicants are to default on their repayments (Hand & Henley, 1997). It includes various decision models and tools that assist credit managers in making credit decisions. The aim is to assess the risk of default associated with a credit product or decision. Credit scoring models can be divided into two types: (i) credit scoring in targeting and application, which deals with new applicants, and (ii) credit scoring in managing existing accounts, called behaviour scoring. Behaviour scoring models are used for making ongoing decisions on open accounts. They describe the behaviour of existing customers by means of behavioural scoring variables and predict the future purchasing behaviour or credit status of those customers (Hsieh, 2004). They are based on dynamic account performance information and are used to manage limits during the lifetime of an account, to increase account usage, to set risk-based collections and recovery strategies, to retain future profitable customers, in retention and incentive programs, to predict accounts that are likely to close, become inactive or settle early, to identify short-life accounts, to predict future response and help to improve targeting of offers, to pre-screen mailing lists of new customers, to gain market share and long-term profitable accounts, to optimize telemarketing operations, to predict fraudulent activity, and to manage recovery debt (Hand & Henley, 1997; McNab & Wynn, 2000). Behaviour scoring models take into account the current and recent behavioural data of a client (consumer, corporate, and SME) and predict future risk. They can also be built to predict other dimensions of future behaviour, such as revenue, propensity, usage, attrition, contractibility, and fraudulent activity. In this way they can influence decisions across the whole credit cycle (McNab & Wynn, 2000).
With the inclusion of the dynamic component, credit scoring modelling is becoming increasingly important in default prediction for several reasons (Baesens, Gastel, Stepanova, Poel, & Vanthienen, 2005): (i) the ability to compute the client’s profitability over a lifetime; (ii) the ability to estimate the default levels over time, which enables debt provisioning; (iii) the ability to decide upon the term of the loan; and (iv) the ability to incorporate changes in economic conditions. If a bank can predict default times for different periods (30, 60, 90 days, etc.) for each client, it can take action to prevent undesirable behaviour and thereby protect itself in a timely manner from borrowers with high default risk (Lim & Sohn, 2007). A model that predicts ‘when’ a default occurs may provide financial companies with an estimate of the default levels over time (Noh, Roh, & Han, 2005).
In 2005 the Bank for International Settlements produced a revised framework of the International convergence of capital measurement and capital standards (Basel 2). According to this new accord, banks are allowed to determine their own calculation of risk parameters, such as probability of default, loss given default, exposure at default, and maturity, in order to calculate risk-weighted assets. Probability of default is derived from application credit scoring for new clients, and from behavioural scoring for existing clients. This makes credit scoring modelling more important than ever before. The fifth quantitative impact study (BIS, 2006) shows that banks in the EU are expected to reduce capital requirements by up to 26.6% under the new accord if they use the internal ratings-based (IRB) approach. This is a strong motivation for banks to develop their own credit risk models: the better their models, the greater the advantage they gain. The new accord influences consumer credit modelling in three ways (Thomas, Oliver, & Hand, 2005): (i) minimum requirements that a bank has to fulfil in order to use the IRB approach; (ii) validation of risk parameters; and (iii) models for credit risk portfolio.
The paper deals with personal open-end accounts where clients can make different kinds of payments to and withdrawals from the accounts, as well as use revolving credit. The main purpose of the paper is to model time to default using survival analysis and neural networks, and to establish a procedure for selecting one model over another according to model accuracy and its efficiency for the bank. To this end, models predicting default in 30, 60, 90, 120, 150, and 180 days were created using a dataset collected from a Croatian bank. Separate neural network models were designed for each period of default using the radial basis function algorithm. Survival analysis, which deals with censored observations, is used to estimate hazard functions for each period of default. On the basis of the results of all developed models, a selection procedure is suggested and the best model is discussed.
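The six prediction horizons named above can be illustrated with a minimal Python sketch that derives one binary target per horizon from an account's observed time to default. The function name `horizon_labels`, the use of `None` for an account with no observed default, and the cumulative "default by day h" reading of each horizon are assumptions made for illustration; the paper's own target definition is not given in this excerpt.

```python
# Horizons (in days) taken from the text; everything else is illustrative.
HORIZONS = (30, 60, 90, 120, 150, 180)


def horizon_labels(days_to_default, horizons=HORIZONS):
    """Return {horizon: 1 or 0} for one account.

    days_to_default: days from the observation point to default,
    or None if no default was observed (assumed convention).
    A label is 1 when the account defaulted on or before that horizon.
    """
    return {
        h: int(days_to_default is not None and days_to_default <= h)
        for h in horizons
    }


if __name__ == "__main__":
    print(horizon_labels(75))    # account defaulting on day 75
    print(horizon_labels(None))  # account with no observed default
```

Under this reading, a separate classifier (for example, one neural network per horizon) would be trained on each column of labels, while a survival model would use the time to default directly.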
The structure of the paper is as follows. A description of variables and data is given in the next section. The modelling and results section describes the methodology used, including survival analysis and neural networks, followed by the results of both methods and their comparison. Finally, some important features of credit scoring modelling are discussed.
Section snippets
Review of previous research
Researchers and practitioners in behavioural credit scoring modelling use different methods and develop different scoring models in order to produce the best model possible. They often develop several models, compare them and then decide which one to use. Classical survival analysis has been used frequently in behaviour scoring modelling for more than two decades. One of the first reported studies was by Narain (1999), who used survival analysis in credit scoring. Banasik, Crook, and Thomas
Variables and data
The data sample was collected randomly at a Croatian bank and covers a period of 12 months, from January 1 to December 31. The observation point is set in the middle of the period, on June 30. The period preceding this point is the performance period, and the characteristics of the performance in this period are used in developing scoring models. On the basis of the client’s performance in the 6 months after the observation point, a client is defined as ‘good’ or ‘bad’ (Thomas, Ho, &
Methodology and modelling
Survival analysis is used to describe the time until an open-end account defaults, the survival time T. The survival time is assumed to be a random variable and, for the purpose of credit scoring modelling, we are primarily interested in measuring the risk that an open-end account defaults before the prescribed time t. To do this, we use the distribution function of the random variable T, i.e. F(t) = P{T ⩽ t}, or the survivor function S(t) = 1 − F(t). If we have the model for one of them (S(t) or F(t
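To make the relation between S(t) and F(t) concrete, the following minimal Python sketch estimates the survivor function from right-censored account data with the product-limit (Kaplan-Meier) estimator and then reports F(t) = 1 − S(t). The estimator is used here purely for illustration and is not necessarily the one used in the paper; the function name `kaplan_meier` and the toy durations are assumptions.

```python
from collections import Counter


def kaplan_meier(durations, defaulted):
    """Product-limit estimate of the survivor function S(t).

    durations: observed time (days) for each account.
    defaulted: True if the account defaulted at that time,
               False if the observation was censored.
    Returns a list of (t, S(t)) at each distinct default time.
    """
    events = Counter(t for t, d in zip(durations, defaulted) if d)
    exits = Counter(durations)      # defaults and censorings per time
    at_risk = len(durations)        # accounts still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove accounts that left at time t
    return curve


if __name__ == "__main__":
    # Toy data (hypothetical): six accounts, three defaults, three censored.
    days = [30, 60, 60, 90, 180, 180]
    event = [True, True, False, True, False, False]
    for t, s in kaplan_meier(days, event):
        print(f"t={t:3d}  S(t)={s:.4f}  F(t)={1 - s:.4f}")
```

Censored accounts contribute to the at-risk count up to the time they leave observation but never reduce the survival estimate themselves, which is how survival analysis uses incomplete observations.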
Model comparison
The preceding section described two models that can be used to predict time to default for a given open-end account. Both were satisfactory from the modelling point of view. It is not uncommon to find that multiple scores or classifiers are available. However, choosing one over another, or combining them in a decision process, is not an easy task (Hand, 2005). A further question is whether anything can be gained by combining them.
In order to
Conclusion and discussion
Recent research in credit scoring emphasizes the importance of not only distinguishing ‘good’ customers from ‘bad’ ones, but also predicting the time when a customer defaults or pays off early. By predicting the time to default, a bank can take action to prevent undesirable behaviour and thereby protect itself in a timely manner from borrowers with high default risk.
The paper investigates survival analysis and neural networks in predicting the time of default, and
References (26)
An integrated data mining and behavioral scoring model for analyzing bank customers. Expert Systems with Applications (2004).
et al. Cluster-based dynamic scoring model. Expert Systems with Applications (2007).
et al. Prognostic personal credit risk model considering censored information. Expert Systems with Applications (2005).
European generic scoring models using survival analysis. Journal of the Operational Research Society (2006).
et al. Modelling the purchase propensity: Analysis of a revolving store card. Journal of the Operational Research Society (2005).
Andreeva, G., Ansell, J., & Crook, J. N. (2004). Credit scoring in the context of the European integration: Assessing...
et al. Neural network survival analysis for personal loan data. Journal of the Operational Research Society (2005).
et al. Not if but when will borrowers default. Journal of the Operational Research Society (1999).
et al. Managing credit risk (1998).
et al. A comparison of neural network and linear scoring models in credit union environment. European Journal of Operational Research (1996).
Regression modelling strategies.
New uses of statistics in retail banking. The final version of this paper appeared in Journal of Mathematical and Management Sciences.
Good practice in retail credit scorecard assessment. Journal of the Operational Research Society.
Cited by (34)
Predicting loss given default of unsecured consumer loans with time-varying survival scores
2023, Pacific Basin Finance Journal
A conservative approach for online credit scoring
2021, Expert Systems with Applications
Citation Excerpt: Despite their popularity, credit scoring models can only provide an estimate of the lifetime probability of default for a loan but cannot identify the existence of cures and/or other competing transitions and their relationship to loan-level and macro covariates, and do not provide insight on the timing of default, the cure from the default, the time since default, and time to collateral repossession (Lessmann et al., 2015; Chamboko & Bravo, 2020). Survival models incorporating time-varying covariates such as macroeconomic conditions which affect performance on loan payment over time and the ability to forecast event occurrence (default, recovery, prepayment, foreclosure) in the next instant of time, given that the event has not occurred until that time, have proven to overperform traditional methods in empirical studies (see, e.g., Noh, Roh, & Han, 2005; Sarlija, Bensic, & Zekic-Susac, 2009; Tong, Mues, & Thomas, 2012; Bellotti & Crook, 2013; Castro, 2013; Chamboko & Bravo, 2016, 2019a, 2019b). A handful of studies have also used the same to model foreclosure on mortgages (Gerardi, Shapiro, & Willen, 2007) and cure from delinquency to current (Chamboko & Bravo, 2016, 2019a, 2019b; Ha, 2010; Ho Ha & Krishnan, 2012).
Applications of Artificial Intelligence in commercial banks – A research agenda for behavioral finance
2020, Journal of Behavioral and Experimental Finance
Citation Excerpt: In the context of credit scoring, only a single paper was found in which a statistical model outperformed an AI algorithm (Xiong et al., 2013). Also, in the context of predicting the time until a borrower defaults on a loan, the Cox model outperforms neural networks (Sarlija et al., 2009). In the context of credit scoring, the hybrid model proposed by Li et al. (2016) outperforms the logistic regression model.
A Novel behavioral scoring model for estimating probability of default over time in peer-to-peer lending
2018, Electronic Commerce Research and Applications
A framework for data transformation in Credit Behavioral Scoring applications based on Model Driven Development
2017, Expert Systems with Applications
Predicting creditworthiness in retail banking with limited scoring data
2016, Knowledge-Based Systems
Citation Excerpt: CCNN can avoid Multilayer Perceptrons Neural Network's drawbacks, such as the design and specification of the number of hidden layers and the number of units in these layers [19,27]. Various scoring models’ evaluation criteria including receiver operating characteristic (ROC) curves and Gini coefficients are widely used and serve to assess the predictive capabilities of scoring models [2,4,11,18,20,46]. World-wide evolution of thought and practice in credit scoring can be substantially attributed to increasingly rigorous models of personal and corporate finance, increasingly powerful and discriminating statistical techniques and enormously more potent and economic processing capacity.