An improved boosting based on feature selection for corporate bankruptcy prediction

doi:10.1016/j.eswa.2013.09.033

Expert Systems with Applications

Volume 41, Issue 5, April 2014, Pages 2353-2361


Highlights

• No single method has proved best overall for predicting corporate bankruptcy.
• A new and improved Boosting method, FS-Boosting, is proposed to predict corporate bankruptcy.
• Two datasets are selected to demonstrate the effectiveness and feasibility of FS-Boosting.
• Experimental results reveal that FS-Boosting could be used as an alternative method.

Abstract

With the recent financial crisis and the European debt crisis, corporate bankruptcy prediction has become an increasingly important issue for financial institutions. Many statistical and intelligent methods have been proposed; however, no single method has proved best overall for predicting corporate bankruptcy. Recent studies suggest that ensemble learning methods may be applicable to corporate bankruptcy prediction. In this paper, a new and improved Boosting method, FS-Boosting, is proposed to predict corporate bankruptcy. By injecting a feature selection strategy into Boosting, FS-Boosting can achieve better performance because its base learners gain accuracy and diversity. For testing and illustration, two real-world bankruptcy datasets were selected to demonstrate the effectiveness and feasibility of FS-Boosting. Experimental results reveal that FS-Boosting could be used as an alternative method for corporate bankruptcy prediction.

Introduction

Predicting corporate bankruptcy is an important management science problem whose main goal is to differentiate corporations with a probability of distress from healthy corporations. Moreover, incorrect decision-making may lead financial institutions into financial difficulty or distress and impose social costs on owners or shareholders, managers, government, etc. As a result, how to predict corporate bankruptcy has become a hot topic for both industrial application and academic research (Li et al., 2012, Olson et al., 2012, Zhou et al., 2014).

As there are no mature theories of corporate bankruptcy, studies of corporate bankruptcy have largely been based on iterative trial-and-error processes of selecting features and predictive models (Li and Sun, 2009, Zhou et al., 2014). With the development of statistics and artificial intelligence (AI), a number of statistical and intelligent methods have been proposed for corporate bankruptcy prediction. The statistical methods applied in corporate bankruptcy prediction mainly include Linear Discriminant Analysis (LDA), Multivariate Discriminant Analysis (MDA), Quadratic Discriminant Analysis (QDA), Logistic Regression Analysis (LRA), and Factor Analysis (FA) (Li and Sun, 2009, Zmijewski, 1984). However, the problem with applying these statistical techniques to corporate bankruptcy prediction is that some assumptions, such as the multivariate normality assumption for independent variables, are frequently violated in practice, which makes these techniques theoretically invalid for finite samples (Shin & Lee, 2002). In recent years, many studies have demonstrated that intelligent techniques such as Artificial Neural Networks (ANN), Decision Tree (DT), Case-Based Reasoning (CBR), and Support Vector Machine (SVM) can be used as alternative methods for corporate bankruptcy prediction (Olson et al., 2012, Tsai and Wu, 2008). In contrast with statistical techniques, intelligent techniques do not assume particular data distributions and automatically extract knowledge from training samples (Wang, Ma, Huang, & Xu, 2012).

However, there is still no overall best intelligent method for predicting corporate bankruptcy. Prediction performance depends on the details of the problem, the data structure, the characteristics used, the extent to which it is possible to segregate the classes by using those characteristics, and the objective of the classification (Duéñez-Guzmán & Vose, 2013). Recently, integrating multiple predictors into an aggregated output, i.e. ensemble methods, has been demonstrated to be an efficient strategy for achieving high prediction performance, especially when the component predictors have different structures that lead to independent prediction errors (Breiman, 1996, Polikar, 2006). Moreover, the latest studies have shown that such ensemble techniques perform better than a single intelligent technique in financial distress prediction (Deligianni and Kotsiantis, 2012, Sun and Li, 2012). In this paper, a novel and improved Boosting method, FS-Boosting, is proposed to predict corporate bankruptcy. By injecting a feature selection strategy into Boosting, FS-Boosting can achieve better performance because its base learners gain accuracy and diversity. For testing and illustration, two real-world bankruptcy datasets were selected to demonstrate the effectiveness and feasibility of FS-Boosting. Among eight methods, FS-Boosting achieves the best prediction accuracy on the two datasets. Experimental results reveal that FS-Boosting can be used as an alternative method for corporate bankruptcy prediction.
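The idea of injecting feature selection into Boosting can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the FS-Boosting procedure evaluated in this paper: it assumes an AdaBoost-style loop with decision stumps as base learners, and it assumes that features are re-ranked in every round by their weighted correlation with the class label, so that each base learner sees only the top k features. The function names (fs_boost_fit, fs_boost_predict), the ranking criterion, and the toy data are all hypothetical.

```python
import math

def weighted_corr(xs, ys, w):
    """Absolute weighted Pearson correlation (weights w are assumed to sum to 1)."""
    mx = sum(wi * x for wi, x in zip(w, xs))
    my = sum(wi * y for wi, y in zip(w, ys))
    cov = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    vx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    vy = sum(wi * (y - my) ** 2 for wi, y in zip(w, ys))
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / math.sqrt(vx * vy)

def fit_stump(X, y, w, features):
    """Lowest weighted-error decision stump restricted to the given feature indices."""
    best = None  # (weighted error, feature index, threshold, polarity)
    for j in features:
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] <= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def stump_predict(stump, row):
    _, j, t, pol = stump
    return pol if row[j] <= t else -pol

def fs_boost_fit(X, y, rounds=5, k=1):
    """AdaBoost-style loop; labels must be +1 / -1. Assumed design, see lead-in."""
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        # Feature selection step (assumption): rank features on the
        # currently weighted sample and keep the k best for this round.
        scores = [weighted_corr([row[j] for row in X], y, w) for j in range(d)]
        top = sorted(range(d), key=lambda j: scores[j], reverse=True)[:k]
        stump = fit_stump(X, y, w, top)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Standard AdaBoost reweighting: misclassified samples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def fs_boost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Hypothetical toy data: column 0 separates the classes, column 1 is noise.
X_demo = [[0.1, 5], [0.2, 1], [0.3, 4], [0.7, 2], [0.8, 5], [0.9, 1]]
y_demo = [-1, -1, -1, 1, 1, 1]
model = fs_boost_fit(X_demo, y_demo, rounds=5, k=1)
predictions = [fs_boost_predict(model, row) for row in X_demo]
```

Because the ranking is recomputed on the reweighted sample, different rounds may select different feature subsets on less trivial data, which is one plausible way for base learners to gain diversity.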

The remainder of the paper is organized as follows. Section 2 reviews related work on corporate bankruptcy prediction. Section 3 proposes an improved Boosting method, FS-Boosting, for corporate bankruptcy prediction. Section 4 presents the details of the experimental design. Sections 5 and 6 summarize and discuss the empirical results. Based on the results and observations of these experiments, Section 7 draws conclusions and outlines future research directions.

Section snippets

Related work

Many techniques have been proposed by prior research. In this study, we classified these techniques into statistical techniques and intelligent techniques.

Feature selection

Feature selection has been an active research area in the machine learning and data mining communities (Liu & Motoda, 1998). The main idea of feature selection is to choose a subset of input variables by eliminating features with little or no predictive information. Feature selection reduces the dimensionality of the feature space and removes redundant, irrelevant, or noisy data. It has immediate effects for applications: speeding up an algorithm, improving the data quality and thereof the
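As a minimal sketch of the idea described above, the following Python function drops features that carry little information about the label (irrelevant) and features that nearly duplicate one already kept (redundant). It is a generic filter-style example, not the feature selection method used in this paper; the function name select_features, the two thresholds, and the toy data are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_features(X, y, min_relevance=0.3, max_redundancy=0.9):
    """Return sorted indices of features that are relevant and not redundant.

    Both thresholds are illustrative defaults, not values from the paper.
    """
    cols = [[row[j] for row in X] for j in range(len(X[0]))]
    relevance = [abs(pearson(col, y)) for col in cols]
    ranked = sorted(range(len(cols)), key=lambda j: relevance[j], reverse=True)
    kept = []
    for j in ranked:
        if relevance[j] < min_relevance:
            continue  # irrelevant: little or no predictive information
        if any(abs(pearson(cols[j], cols[i])) > max_redundancy for i in kept):
            continue  # redundant: nearly duplicates a feature already kept
        kept.append(j)
    return sorted(kept)

# Hypothetical toy data: column 1 is twice column 0 (redundant),
# column 2 is constant and column 3 is weakly related to the label.
X_demo = [
    [1.0, 2.0, 7.0, 3.0],
    [2.0, 4.0, 7.0, 1.0],
    [3.0, 6.0, 7.0, 4.0],
    [6.0, 12.0, 7.0, 2.0],
    [7.0, 14.0, 7.0, 3.0],
    [8.0, 16.0, 7.0, 2.0],
]
y_demo = [0, 0, 0, 1, 1, 1]
selected = select_features(X_demo, y_demo)
reduced = [[row[j] for j in selected] for row in X_demo]
```

On this toy input only one of the two duplicated columns survives, so the reduced matrix has a single column instead of four.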

Real world bankruptcy dataset

Two real-world datasets were used to test the performance of the proposed method. The first dataset was collected by Pietruszkiewicz (2008). It contains 240 companies, including 112 failed companies. The dataset covers the period from 1997 to 2001, before bankruptcy took place. A total of 30 financial variables were used for prediction. Detailed explanations of these financial variables are listed in Table 1.

The second dataset was selected from the CD-ROM database (Shmueli, Patel, & Bruce, 2011

Experimental results

In this paper, an improved Boosting, FS-Boosting, is proposed to predict bankruptcy and reduce the loss of financial institutions. Table 4 summarizes the performance indicators of different methods on the two bankruptcy datasets, where the values following “±” are standard deviations.

First, we consider the results on the bankruptcy I dataset. As shown in Table 4, FS-Boosting achieves the highest average accuracy, 81.50%. Two other ensemble methods, i.e. Bagging and Boosting, also get the higher

Discussion

To ensure that the observed differences did not arise by chance, we tested the significance of the above results by means of the paired t-test. The null hypothesis is “Model A’s mean Average Accuracy / Type I Error / Type II Error = Model B’s mean Average Accuracy / Type I Error / Type II Error”. The alternative hypothesis is “Model A’s mean Average Accuracy / Type I Error / Type II Error ≠ Model B’s mean Average Accuracy / Type I Error / Type II Error”. The column ‘improvement’ gives the relative
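The paired t-test described above can be illustrated with a short Python sketch that computes the test statistic from per-run scores of two models. The accuracy values below are invented purely for illustration and are not results from this paper; the function name paired_t_statistic and the choice of ten paired runs are likewise assumptions. The sketch compares the statistic with the standard two-sided 5% critical value of the t distribution with 9 degrees of freedom (2.262) instead of computing a p-value.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for a paired t-test on matched scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical accuracies of two models on the same ten runs (made-up numbers).
acc_model_a = [0.82, 0.80, 0.83, 0.79, 0.81, 0.84, 0.80, 0.82, 0.81, 0.83]
acc_model_b = [0.78, 0.79, 0.80, 0.77, 0.80, 0.80, 0.78, 0.79, 0.79, 0.80]

t_stat, df = paired_t_statistic(acc_model_a, acc_model_b)

# Two-sided test at the 5% level: critical value for 9 degrees of freedom.
T_CRITICAL_DF9 = 2.262
reject_null = df == 9 and abs(t_stat) > T_CRITICAL_DF9
```

The same function applies unchanged to Type I or Type II error rates, since the test only needs matched pairs of measurements from the two models.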

Conclusions and future directions

Owing to the recent financial crisis and the European debt crisis, bankruptcy prediction has become an increasingly important issue for financial institutions. Meanwhile, ensemble learning is a powerful machine learning paradigm that has exhibited apparent advantages in many applications. In this paper, an improved Boosting method, FS-Boosting, is proposed to predict bankruptcy and reduce the losses of financial institutions. FS-Boosting integrates the advantages of Boosting and feature selection to enhance the

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (Nos. 71071045, 71131002, 71101042), Specialized Research Fund for the Doctoral Program of Higher Education (20110111120014), the China Postdoctoral Science Foundation (2011M501041, 2013T60611), Special Fund of AnHui Province Key Research Institute of Humanities and Social Sciences at Universities (SK2013B400).

References (48)

E. Alfaro et al. Bankruptcy forecasting: An empirical comparison of AdaBoost and neural networks. Decision Support Systems (2008)
A.L. Blum et al. Selection of relevant features and examples in machine learning. Artificial Intelligence (1997)
M. Dash et al. Feature selection for classification. Intelligent Data Analysis (1997)
Z. Fu et al. Diversification for better classification trees. Computers & Operations Research (2006)
R. Kohavi et al. Wrappers for feature subset selection. Artificial Intelligence (1997)
H. Li et al. Gaussian case-based reasoning for business failure prediction with empirical data in China. Information Sciences (2009)
J.H. Min et al. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications (2005)
L. Nanni et al. An experimental comparison of ensemble of classifiers for bankruptcy prediction and credit scoring. Expert Systems with Applications (2009)
D.L. Olson et al. Comparative analysis of data mining methods for bankruptcy prediction. Decision Support Systems (2012)
F. Sánchez-Lasheras et al. A hybrid device for the solution of sampling bias problems in the forecasting of firms’ bankruptcy. Expert Systems with Applications (2012)
K. Shin et al. A case-based approach using inductive indexing for corporate bond rating. Decision Support Systems (2001)
K.S. Shin et al. A genetic algorithm application in bankruptcy prediction modeling. Expert Systems with Applications (2002)
J. Sun et al. Financial distress prediction using support vector machines: Ensemble vs. individual. Applied Soft Computing (2012)
T.C. Tang et al. Neural networks analysis in business failure prediction of Chinese importers: A between-countries approach. Expert Systems with Applications (2005)
C.-F. Tsai et al. Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Systems with Applications (2008)
G. Wang et al. A comparative assessment of ensemble learning for credit scoring. Expert Systems with Applications (2011)
G. Wang et al. Two credit scoring models based on dual strategy ensemble trees. Knowledge-Based Systems (2012)
R.C. West. A factor-analytic approach to bank condition. Journal of Banking & Finance (1985)
Alfaro-Cid, E., Castillo, P. A., Esparcia, A., Sharman, K., Merelo, J. J., Prieto, A., Mora, A. M., & Laredo, J. L. J....
E.I. Altman. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance (1968)
W.H. Beaver. Financial ratios as predictors of failure. Journal of Accounting Research (1966)
L. Breiman. Bagging predictors. Machine Learning (1996)
L. Breiman. Random forests. Machine Learning (2001)
P. Buta. Mining for financial knowledge with CBR. AI Expert (1994)
