A hybrid approach based on the combination of variable selection using decision trees and case-based reasoning using the Mahalanobis distance: For bankruptcy prediction
Introduction
Diagnosing a firm’s credit risk level for possible bankruptcy has long been a major problem for both scholars and practitioners. Korea illustrates the importance of credit risk management well: during the foreign currency crisis of the 1990s, Korean industry as a whole went through business restructuring that resulted in the bankruptcy of large numbers of uncompetitive firms nationwide. The total balance of bank loans to Korean firms was estimated at US$279 billion as of January 2006, and 285 firms, on average, went bankrupt each month in 2005 (Bank of Korea, 2006).
Improving the accuracy with which the default risk of firms is predicted may yield considerable savings for an economy. The sources of savings include lower credit analysis costs, better monitoring, and a higher debt collection rate. Traditionally, banks have used internally developed credit risk scoring systems in which both quantitative and qualitative aspects are evaluated (Treacy & Carey, 2000). To assess the credit risk level quantitatively, more systematic models were developed using statistical and machine learning techniques. Multiple discriminant analysis was, to the best of our knowledge, the pioneering statistical approach to discovering the factors that influence bankruptcy. The logistic regression model became popular because of its relatively relaxed assumptions. In the 1990s, a number of studies attempted to apply artificial intelligence (AI) techniques to credit risk management (Dimitras, Zanakis, & Zopounidis, 1996).
The purpose of this study is to propose a case-based reasoning approach that incorporates the covariance structure of the variables and variable weights when locating the nearest neighbors. The prediction for a given firm is made by applying a voting algorithm to the bankruptcy status of its k nearest neighbors. According to our literature review, most previous studies did not investigate the effect of the input variable selection process on the stability of model performance. We therefore evaluate the prediction accuracy of our model under two AI-based input variable selection strategies, as well as two other strategies that combine stepwise regression with expert opinions.
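The retrieval-and-voting step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the two-variable toy data, the function names, the pooled sample covariance, and the plain majority vote are all hypothetical choices, and the variable weights mentioned above are omitted because this excerpt does not say how they enter the distance.

```python
from collections import Counter


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of numeric rows."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    p = len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [x - m for x, m in zip(r, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented copy of a | b
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(x, y, cov):
    """Squared Mahalanobis distance (x - y)^T cov^{-1} (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    return sum(di * zi for di, zi in zip(d, solve(cov, d)))


def knn_predict(query, cases, labels, k=3):
    """Majority vote over the k stored cases closest to the query."""
    cov = covariance_matrix(cases)
    ranked = sorted(range(len(cases)),
                    key=lambda i: mahalanobis_sq(query, cases[i], cov))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical case base: two made-up financial ratios per firm.
cases = [[0.10, 0.9], [0.15, 1.1], [0.12, 1.0],
         [0.60, 2.0], [0.65, 2.2], [0.70, 1.9]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = bankrupt, 0 = healthy (assumed coding)

print(knn_predict([0.62, 2.1], cases, labels, k=3))
```

An odd k avoids tied votes in the two-class setting; with an even k, the tie-breaking rule would need to be chosen explicitly.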
Section snippets
Literature review
Bankruptcy prediction assesses whether a firm will go bankrupt. In statistics and artificial intelligence, bankruptcy prediction techniques often estimate the probability of bankruptcy and predict bankruptcy when the estimated probability exceeds a threshold value. Discriminant analysis, logistic regression, neural networks, and decision trees adopt this probabilistic approach. Linear or non-linear equations that capture
Variable selection method
The techniques most widely used to select a model's input variables when many candidates are available are stepwise regression and selection by field experts. In credit management problems, it is common to have more than 100 variables. Even after applying a stepwise selection technique, we often end up with a few dozen significant variables. This is why field expert selection is used jointly with statistical selection techniques. Financial
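As a rough illustration of tree-based variable selection, the sketch below ranks candidate variables by the Gini impurity reduction of their best single split, which is the criterion a CART-style decision tree uses to choose a split. It is an assumption-laden toy, not the selection procedure used in the study (which this excerpt does not show): the variable names, the data, and the idea of keeping only the top-ranked variables are all hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini reduction achievable by one threshold on one variable."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini(labels)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def rank_variables(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: best_split_gain([r[j] for r in rows], labels)
             for j, name in enumerate(names)}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: three made-up variables for six firms.
names = ["debt_ratio", "current_ratio", "x3_noise"]
rows = [[0.90, 0.8, 5], [0.80, 1.4, 3], [0.85, 0.9, 8],
        [0.30, 1.2, 4], [0.40, 1.6, 7], [0.35, 1.8, 6]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = bankrupt, 0 = healthy (assumed coding)

for name, gain in rank_variables(rows, labels, names):
    print(name, round(gain, 3))
```

A full decision tree would repeat this search recursively within each child node; the variables that appear in the grown tree are then natural candidates for the downstream model.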
Data and input variable selection methods
For the experiment, we used yearly financial data on 1000 Korean manufacturing firms with an asset size of US$1 million to US$7 million in fiscal years 2000–2002. The numbers of bankrupt and healthy (non-bankrupt) firms are balanced at 500:500 to allow learning to occur within AI techniques such as neural networks and decision trees. One hundred and thirty financial variables were available in total. To reduce the number of variables
Conclusion and discussion
The current research considered the corporate bankruptcy prediction problem and suggested a CBR method. Earlier CBR studies used the Euclidean distance to measure the proximity between two records. The Euclidean distance is ideal when all variables are statistically independent of each other, so that the correlation coefficients of all pairs are zero. This rarely happens in real-world data analysis. Thus, we introduce the Mahalanobis distance which incorporates the
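For reference, the two distances contrasted in this section can be written in their standard textbook forms as follows. The symbols are assumptions introduced here, not notation taken from the paper: x and y are the vectors of financial variables for two firms, and Σ is the covariance matrix of those variables.

```latex
% x, y: p-dimensional variable vectors for two firms (notation assumed here)
% \Sigma: covariance matrix of the variables (notation assumed here)
d_E(x, y) = \sqrt{(x - y)^{\top} (x - y)}
\qquad
d_M(x, y) = \sqrt{(x - y)^{\top} \Sigma^{-1} (x - y)}
```

When the variables are uncorrelated with unit variance, Σ is the identity matrix and the two distances coincide; otherwise the Mahalanobis distance rescales and decorrelates the variables before measuring proximity.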
References (35)
- et al. Credit scoring and rejected instances reassigning through evolutionary computation techniques. Expert Systems with Applications (2003)
- et al. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research (1996)
- et al. A survey of business failures with an emphasis on prediction methods and industrial applications. European Journal of Operational Research (1996)
- Bankruptcy support system: Taking advantage of information retrieval and case-based reasoning. Expert Systems with Applications (2000)
- et al. Forecasting with neural networks. Information and Management (1993)
- et al. Fuzzy indexing and retrieval in case-based system. Expert Systems with Applications (1995)
- et al. Bankruptcy prediction using case-based reasoning, neural networks, and discriminant analysis. Expert Systems with Applications (1997)
- et al. A comparison of supervised and unsupervised neural networks in predicting bankruptcy of Korean firms. Expert Systems with Applications (2005)
- et al. A case-based reasoning with the feature weights derived by analytic hierarchy process for bankruptcy prediction. Expert Systems with Applications (2002)
- A threshold-varying artificial neural network approach for classification and its application to bankruptcy prediction problem. Computers and Operations Research (2005)
- Credit risk rating at large US banks. Journal of Banking and Finance
- Bankruptcy prediction with neural logic networks by means of grammar-guided genetic programming. Expert Systems with Applications
- Artificial neural networks in bankruptcy prediction: General framework and cross-validation analysis. European Journal of Operational Research
- Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance
- Bankruptcy prediction of financially stressed firms: An examination of the predictive accuracy of artificially neural networks. International Journal of Intelligent Systems in Accounting, Finance and Management
- Bankruptcy prediction for credit risk using neural networks: A survey and new results. IEEE Transactions on Neural Networks