Published in: Vietnam Journal of Computer Science 2/2018

Open Access 28.05.2018 | Regular Paper

Three local search-based methods for feature selection in credit scoring

Authors: Dalila Boughaci, Abdullah Ash-shuayree Alkhawaldeh

Abstract

Credit scoring is a crucial problem in both finance and banking. In this paper, we tackle credit scoring as a classification problem and study three local search-based methods for feature selection. Feature selection is a technique that can be applied before the data classification task. It keeps only the relevant variables and eliminates the redundant ones, which enhances the classification accuracy. We study the local search method (LS), the stochastic local search method (SLS) and the variable neighborhood search method (VNS) for feature selection. We then combine these methods with the support vector machine (SVM) classifier to find the model that best describes a dataset with a known class variable. The proposed methods (LS+SVM, SLS+SVM and VNS+SVM) are evaluated on both the German and Australian credit datasets and compared with some well-known classifiers. The numerical results are promising and show good performance in favor of our methods.

Notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Introduction

Credit scoring (CS) is an important process for banks as they have to be able to distinguish between good and bad applicants in terms of their creditworthiness. CS is the process of evaluating the creditworthiness of applicants to decide if the credit will be granted or not [33]. The evaluation process is usually based on some variables related to applicants such as historical payments, guarantees, default rates, etc.
Several CS models have been proposed in the literature [40]. Among them are linear regression statistical methods [20], which analyze the data and check whether credit can be granted to a given applicant. Discriminant analysis and logistic regression are among the most broadly established statistical techniques used to classify applicants as "good" or "bad" [44]. Decision trees [42], CART (Classification and Regression Trees) [4] and Bayesian networks [16, 26] are also used to classify data in credit scoring models.
More sophisticated methods based on computational intelligence have also been studied for developing credit scoring models. Examples include neural networks [15, 38], the k-nearest neighbor classifier [21], support vector machines (SVM) [3, 24], ensemble classifiers [2], genetic programming [1] and evolution strategies [31]. In [27], the authors propose a quantification method for credit scoring that uses a categorical canonical correlation analysis to determine the relationship between categorical variables. In [22], the authors propose a feature selection method based on a quadratic unconstrained binary optimization (QUBO) algorithm. In [6], the authors propose a cooperative agent-based classification system for CS.
Meta-heuristics, on the other hand, are computational techniques that have been used successfully to solve optimization problems in many areas. Meta-heuristic approaches can be divided into two main categories: population-based methods and single solution-oriented methods [8]. Population-based methods, also called evolutionary methods, maintain and evolve a population of solutions, while single solution-oriented methods work on a single current solution. Among the evolutionary approaches for optimization problems, we mention the well-known genetic algorithms [17], evolutionary computation [34] and harmony search [36, 45]. Among the single solution-oriented methods, we cite stochastic local search (SLS) [25], simulated annealing (SA) [28], tabu search (TS) [18] and variable neighborhood search (VNS) [23, 32].
In this paper, we are interested in feature selection for credit scoring (CS). We tackle CS as a classification problem where three single solution-oriented meta-heuristic methods are studied for feature selection. Feature selection is a technique that eliminates the redundant variables and keeps only the relevant ones. This can reduce the size of the dataset and simplify the data analysis. Feature selection has been applied in data classification to enhance classifier performance and to reduce data noise [37, 41].
We propose to study local search, stochastic local search and variable neighborhood search for feature selection in credit scoring. The proposed feature selection is then combined with a support vector machine to classify the input data. The three variants of the proposed approach (LS+SVM, SLS+SVM and VNS+SVM) are implemented and evaluated on two well-known datasets: the Australian and German credit datasets.
The rest of this paper is organized as follows: Sect. 1 gives background on the concepts used in this study. Sect. 2 presents the proposed methods for credit scoring. Sect. 3 gives some experimental results. Finally, Sect. 4 concludes and gives some perspectives.

1 Background

The aim of this section is to explain the credit scoring problem and to give an overview of feature selection and of the basic support vector machine concepts used in this study.

1.1 Problem definition and formulation

Credit scoring is an important issue in computational finance. CS is a set of decision models that help lenders decide whether to grant credit to an applicant. Based on such models, lenders can decide whether an applicant is eligible for credit or not [33]. To build CS models, we often exploit information about the applicant, such as age, number of previous loans, default rates, etc. These pieces of information are called variables, attributes or features. CS models allow lenders to distinguish between "good" and "bad" applicants. They can also give an estimate of the probability of default.
More precisely, the CS problem can be stated as follows [22, 33]:
Let us consider a set of variables where each variable may be numerical or categorical. For instance, the variable bank balance is a numerical feature that can be represented as an integer or a real number. Other examples of numerical variables in CS are the applicant age, the interest rate and so on. Categorical variables are also called qualitative variables. Examples are a credit history or a geographic region code. The qualitative variables may also include "missing" (not specified) values.
An instance or a sample is defined as one observation for each variable together with an outcome that represents the class label.
Classification is the problem of discovering the class of an observation. We have as input a set of independent variables and as output the class label. The objective is then to maximize the classification accuracy rate.
In CS, the variables are the set of features that describe the applicant's profile and financial data. The credit data can be divided into two main classes: "good" applicants (where the class label is Y=1) who paid their loan back, and "bad" applicants (Y=0) who defaulted on their loans.
The credit data is then a set of applicants to be classified into two classes: "bad" (Y=0) or "good" (Y=1).
According to [22], the CS problem can be formulated as follows:
  • The credit data can be organized as a matrix D of m rows and n columns where n is the number of features and m is the number of past applicants.
    $$\begin{aligned} \mathbf {D} = \begin{bmatrix} a_{11}&a_{12}&\cdots&a_{1n} \\ a_{21}&a_{22}&\cdots&a_{2n} \\ \vdots&\vdots&\ddots&\vdots \\ a_{m1}&a_{m2}&\cdots&a_{mn} \end{bmatrix} \end{aligned}$$
    For example, the first row \(a_{11}, a_{12}, \ldots a_{1n}\) represents the specific data values of the first applicant where \(a_{1i}\) is the value of feature i of the applicant number 1.
  • The creditworthiness of new applicant can be determined by using the data on past applicants recorded in the matrix D.
  • The decision is then represented as a vector Y with m elements where each element \(y_{i}\) has two possible values 0 or 1. An element \(y_{i}\) receives the value 1 when the applicant i is accepted, 0 otherwise.
    $$\begin{aligned} \mathbf {Y} = \begin{bmatrix} y_1\\ y_2\\ y_3\\ \vdots \\ y_m \end{bmatrix} \end{aligned}$$
  • The classification then is the problem of determining the decision vector Y that indicates the accepted applicants (\(y_{i}\)=1) or the rejected ones (\(y_{i}\)=0).
Classification plays an important role in CS. However, before launching the classification task, pre-processing is needed. Data preparation is an important step: preparing the data properly and accurately yields efficient models that help the creditor make a correct decision. In this study, we are interested in feature selection for CS. The aim is to select from the original set of n variables a subset of K variables to be used in the decision-making. Feature selection is a pre-processing step that can be applied before the classification task. The details of this technique are given in the next subsection.

1.2 Feature selection

Feature selection, also called attribute selection or variable selection, is the process of removing redundant attributes and attributes that are deemed irrelevant to the data mining task. It is an important step that may be applied before classification. This process can improve the classification performance and accelerate the search process [10, 12, 35–37, 41].
Several methods have been studied for feature selection. They fall into two main families: the wrapper methods [29] and the filter methods [30].
  • The wrapper methods use a data mining (machine-learning) algorithm to search for the optimal set of attributes: the algorithm selects the set of attributes that yields high classification accuracy. However, the wrapper methods are time consuming compared to filter methods because the machine-learning algorithm is run iteratively while searching for the best set of attributes.
  • The filter methods eliminate and filter out the undesirable attributes before the classification task starts. They usually rely on heuristics instead of the machine-learning algorithms used by the wrapper methods [29].

1.3 Support vector machine

In this study, we are interested in a supervised learning technique that finds the model that best describes a dataset with a known class variable. The support vector machine is one of the best-known machine-learning techniques. It was proposed by Vladimir Vapnik for classification and regression [11, 16, 24].
The SVM classification method learns from a training dataset and attempts to generalize and make correct predictions on novel data.
Consider a test sample (novel data) to be classified. The problem is to predict which of the considered classes the test sample belongs to. The training data is a set of examples of the form \(\{x_{i}, y_{i}\}\) (\(i=1,\ldots ,l\)). Each \(x_{i}\) is called an input vector and has a number of features. Each \(y_{i} \in \{0, 1\}\) is a response variable, also called a label. The input vectors are paired with their corresponding labels so that the classifier can learn to predict the correct class variable.
As shown in Fig. 1, in two-class supervised learning and when data are linearly separable, SVM can separate the data points into two distinct classes where, in class "good", \(y=1\) and in class "bad", \(y=0\). h(x) is the decision function.
Support vector machines are also called kernel methods, where the kernel is a similarity measure between examples. Common kernel functions are the linear, polynomial, Laplacian, sigmoid and Gaussian kernels, the last also called radial basis function (RBF) [16, 24]. LIBSVM, an open-source library for support vector machines, is available online [13, 14, 43].
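To make the notion of a kernel as a similarity measure concrete, the following minimal sketch implements four of the kernels named above in plain Python. The parametrization (gamma, coef0, degree) follows the form commonly documented for LIBSVM; the function names and default values are illustrative assumptions, and the paper does not state which kernel or parameter values were used.

```python
import math


def linear_kernel(u, v):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, gamma=1.0, coef0=0.0, degree=3):
    """(gamma * <u, v> + coef0) ** degree; the defaults are illustrative only."""
    return (gamma * linear_kernel(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * ||u - v||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, coef0=0.0):
    """tanh(gamma * <u, v> + coef0)."""
    return math.tanh(gamma * linear_kernel(u, v) + coef0)
```

With the RBF kernel, identical vectors have similarity 1 and the value decays toward 0 as the vectors move apart, which is the sense in which a kernel measures similarity.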

2 Proposed approaches for feature selection

In this section, we propose three local search-based methods for feature selection. We start with the feature vector solution and the accuracy measure, and then give details on the three local search methods for feature selection. The feature selection is then combined with a support vector machine to classify the data.

2.1 The feature vector solution representation

The aim of the feature selection is to search for an optimal set of variables or features to be used with the SVM classifier in the classification task.
A solution can be represented as a binary vector of length n, where n is the number of variables in the dataset. More precisely, a solution is a set of selected variables, encoded as follows: a variable that is selected receives the value 1, and a variable that is not selected receives the value 0. For example, Fig. 2 represents a solution vector for a dataset of nine variables where the third, the fourth, the fifth and the sixth variables are selected (bits with value 1).
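The encoding can be illustrated with a short sketch that builds the nine-bit vector of the example and uses it to reduce one data row. The helper names `make_mask` and `apply_mask` and the applicant values are hypothetical and only serve the illustration.

```python
def make_mask(n, selected):
    """Build a 0/1 feature vector of length n; `selected` holds 1-based variable positions."""
    chosen = set(selected)
    return [1 if position in chosen else 0 for position in range(1, n + 1)]


def apply_mask(row, mask):
    """Keep only the values of `row` whose mask bit is 1."""
    return [value for value, bit in zip(row, mask) if bit == 1]


# Nine variables, with the third to the sixth selected (as in the example above).
mask = make_mask(9, [3, 4, 5, 6])

# Made-up applicant values, one per variable.
row = [10, 20, 30, 40, 50, 60, 70, 80, 90]
reduced = apply_mask(row, mask)
```

Applying the same mask to every row of the data matrix D gives the reduced dataset that is passed to the classifier.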

2.2 Accuracy measure

We use the classification accuracy, also called the fitness, to measure the quality of a solution. The accuracy of a learning scheme on a dataset is measured in the standard way, by cross-validation. The classification accuracy is computed as the ratio of the number of correctly classified instances to the total number of instances, using formula (1):
$$\begin{aligned} {{\text{ Fitness }}}= \text{ Accuracy } = \frac{tp +tn}{tp+fn+fp+tn} \end{aligned}$$
(1)
where
  • tp is the number of true positives and tn is the number of true negatives,
  • fp is the number of false positives and fn is the number of false negatives.
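Formula (1) can be checked on a small example. The sketch below counts tp, tn, fp and fn from two label lists and then applies the formula; the function names and the sample labels are illustrative assumptions, and labels are assumed to be 0 or 1 as in the problem formulation.

```python
def confusion_counts(y_true, y_pred):
    """Count tp, tn, fp, fn for binary labels (1 = "good", 0 = "bad")."""
    tp = tn = fp = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 0:
            tn += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        else:  # truth == 1 and predicted == 0
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Formula (1): correctly classified instances over all instances."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Made-up labels for five applicants.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
fitness = accuracy(*confusion_counts(y_true, y_pred))  # 3 correct out of 5
```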

2.3 Feature selection step

In this work, we study three local search-based meta-heuristics for feature selection. The first is a local search (LS), the second is a stochastic local search (SLS) and the third is a variable neighborhood search (VNS). The local search-based feature selection searches for the best set of variables. Then, the support vector machine (SVM) classifies the input data in the reduced dataset, which corresponds to the subset of selected variables represented by the feature vector solution generated by the local search method. As already mentioned, the SVM is a machine-learning classifier that finds an optimal separating hyper-plane. It uses a linear hyper-plane to create a classifier with a maximum margin [26]. In the following, we give details on LS, SLS and VNS for feature selection.

2.3.1 Local search method

The local search method (LS) is a hill-climbing technique [25]. LS starts with a random solution x and tries to find better solutions in the current neighborhood. A neighboring solution \(x'\) of the solution x is obtained by modifying one bit. Neighborhood solutions are generated by randomly adding or deleting a feature from the solution vector. For example, if the number of variables n equals 7 and the current solution vector is x = 1111111, then the possible neighbor solutions are {0111111, 1011111, 1101111, 1110111, 1111011, 1111101, 1111110}.
Among the neighbors, we select for the next iteration the one with the best accuracy. The process is repeated for a maximum number of iterations fixed empirically. The LS method is sketched in Algorithm 1.
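The description above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' Algorithm 1: the `fitness` argument is a hypothetical callable standing in for the cross-validated SVM accuracy of a feature vector, the default iteration budget is arbitrary, and remembering the best solution seen so far is an added assumption.

```python
import random


def one_flip_neighbors(solution):
    """All vectors that differ from `solution` in exactly one bit."""
    neighbors = []
    for i in range(len(solution)):
        neighbor = list(solution)
        neighbor[i] = 1 - neighbor[i]  # add or delete feature i
        neighbors.append(neighbor)
    return neighbors


def local_search(n, fitness, max_iterations=50, rng=None):
    """Move to the best one-flip neighbor at each iteration (n >= 1)."""
    rng = rng or random.Random()
    current = [rng.randint(0, 1) for _ in range(n)]  # random start
    best, best_fit = list(current), fitness(current)
    for _ in range(max_iterations):
        scored = [(fitness(nb), nb) for nb in one_flip_neighbors(current)]
        current_fit, current = max(scored, key=lambda pair: pair[0])
        if current_fit > best_fit:  # assumption: keep the best solution seen
            best, best_fit = list(current), current_fit
    return best, best_fit
```

For a quick trial without a classifier, `fitness` can be any function that maps a 0/1 list to a number, for example the fraction of bits that agree with a fixed target vector.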

2.3.2 Stochastic local search method

The stochastic local search (SLS) used here is inspired by the one used in [7]. SLS is a local search meta-heuristic that has already been studied for several optimization problems, such as satisfiability and the optimal winner determination problem (WDP) in combinatorial auctions [8, 9]. SLS starts with an initial solution generated randomly. Then, it performs a certain number of local steps that combine diversification and intensification strategies to locate good solutions.
  • Step 1: The diversification phase selects a random neighbor solution.
  • Step 2: The intensification phase selects a best neighbor solution according to the accuracy measure.
The diversification phase is applied with a fixed probability \(wp>0\) and the intensification phase with probability \(1-wp\). The value of wp is fixed empirically. The process is repeated until a certain number of iterations, called \(max\_iterations\), is reached. The SLS+SVM method for classification is sketched in Algorithm 2. The proposed SLS+SVM for feature selection starts with a random initial solution and then tries to find a good solution in the whole neighborhood in an iterative manner. An SVM classifier is built for each candidate solution constructed by the SLS method, and the solution is evaluated by cross-validation. The SLS process selects attributes that lead to good prediction accuracy. The objective is to find the optimal subset of attributes by searching over combinations of variables from the dataset.
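The two phases can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' Algorithm 2: the one-bit-flip neighborhood is borrowed from the LS description, `fitness` is a hypothetical callable standing in for the cross-validated SVM accuracy, and the default values of `wp` and `max_iterations` are placeholders, since the paper only says they are fixed empirically.

```python
import random


def one_flip_neighbors(solution):
    """All vectors that differ from `solution` in exactly one bit."""
    neighbors = []
    for i in range(len(solution)):
        neighbor = list(solution)
        neighbor[i] = 1 - neighbor[i]
        neighbors.append(neighbor)
    return neighbors


def stochastic_local_search(n, fitness, wp=0.3, max_iterations=100, rng=None):
    """With probability wp take a random neighbor, otherwise the best one."""
    rng = rng or random.Random()
    current = [rng.randint(0, 1) for _ in range(n)]  # random initial solution
    best, best_fit = list(current), fitness(current)
    for _ in range(max_iterations):
        neighbors = one_flip_neighbors(current)
        if rng.random() < wp:
            # Step 1 (diversification): jump to a random neighbor.
            current = rng.choice(neighbors)
        else:
            # Step 2 (intensification): move to the best-scoring neighbor.
            current = max(neighbors, key=fitness)
        current_fit = fitness(current)
        if current_fit > best_fit:
            best, best_fit = list(current), current_fit
    return best, best_fit
```

Setting `wp` to 0 removes the random moves, so the search behaves like the best-neighbor local search; larger values trade accuracy of individual moves for wider exploration.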

2.3.3 Variable neighborhood search method

The variable neighborhood search (VNS) is a local search meta-heuristic proposed in 1997 by Mladenovic and Hansen. Several variants of VNS have been proposed since then, but the basic idea is a systematic change of neighborhood combined with a local search [23, 32]. In this work, we use four neighborhood structures: N1, N2, N3 and N4. At each iteration, we select one of the four structures at random to create neighbor solutions.
  • N1: the neighbor solution \(x'\) of the solution x is obtained by modifying one bit, as done in the local search method. For example, if the number of features n equals 12 and x = 111111111111 is the current solution vector, then the possible neighbors in N1 are: {011111111111, 101111111111, 110111111111, 111011111111, 111101111111, 111110111111, 111111011111, 111111101111, 111111110111, 111111111011, 111111111101, 111111111110}.
  • N2: the neighbor solution \(x'\) of the solution x is obtained by modifying two bits simultaneously. For example, if x = 111111111111 is the current solution vector, then the possible neighbors in N2 are: {001111111111, 100111111111, 110011111111, 111001111111, 111100111111, 111110011111, 111111001111, 111111100111, 111111110011, 111111111001, 111111111100}.
  • N3: the neighbor solution \(x'\) of the solution x is obtained by modifying three bits simultaneously. For example, if we take the same x = 111111111111 as the current solution vector, then the possible neighbors in N3 are: {000111111111, 100011111111, 110001111111, 111000111111, 111100011111, 111110001111, 111111000111, 111111100011, 111111110001, 111111111000}.
  • N4: the neighbor solution \(x'\) of the solution x is obtained by modifying one randomly chosen bit.
Like SLS, VNS starts with a random initial solution and then tries to find a good solution in the whole neighborhood in an iterative manner. The SVM classifier is called for each candidate solution constructed by the VNS method to evaluate its accuracy rate. The process is repeated until a certain number of iterations, called \(max\_iterations\), is reached.

Table 1
An instance of the german.data-numeric dataset

| Variable number | Value | Scaled value | Variable number | Value | Scaled value |
|---|---|---|---|---|---|
| 1 | 1.000000 | −1 | 13 | 1.000000 | −1 |
| 2 | 6.000000 | −0.941176 | 14 | 2.000000 | 1 |
| 3 | 4.000000 | 1 | 15 | 1.000000 | −1 |
| 4 | 12.000000 | −0.89011 | 16 | 0.000000 | −1 |
| 5 | 5.000000 | 1 | 17 | 0.000000 | −1 |
| 6 | 5.000000 | 1 | 18 | 1.000000 | 1 |
| 7 | 3.000000 | 0.333333 | 19 | 0.000000 | −1 |
| 8 | 4.000000 | 1 | 20 | 0.000000 | −1 |
| 9 | 1.000000 | −1 | 21 | 1.000000 | 1 |
| 10 | 67.000000 | 0.714286 | 22 | 0.000000 | −1 |
| 11 | 3.000000 | 1 | 23 | 0.000000 | −1 |
| 12 | 2.000000 | −0.333333 | 24 | 1.000000 | 1 |
| Class | | 0 | | | |

The solution with the best accuracy rate is selected. The VNS algorithm for feature selection is sketched in Algorithm 3.
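The four neighborhood structures and the random switch between them can be sketched as follows. This is a minimal sketch and not the authors' Algorithm 3. It assumes that N2 and N3 flip adjacent bits, which is inferred from the listed examples, that the search moves to the best neighbor of the chosen structure, and that the best solution seen is remembered; `fitness` is a hypothetical callable standing in for the cross-validated SVM accuracy.

```python
import random


def flip_window(solution, start, width):
    """Flip `width` consecutive bits of `solution`, beginning at index `start`."""
    neighbor = list(solution)
    for i in range(start, start + width):
        neighbor[i] = 1 - neighbor[i]
    return neighbor


def window_neighbors(solution, width):
    """N1, N2, N3: flip every run of 1, 2 or 3 adjacent bits."""
    last_start = len(solution) - width
    return [flip_window(solution, s, width) for s in range(last_start + 1)]


def neighbors(solution, k, rng):
    """Neighbors of `solution` in structure N_k, for k in 1..4."""
    if k in (1, 2, 3):
        return window_neighbors(solution, k)
    # N4: a single neighbor obtained by flipping one randomly chosen bit.
    return [flip_window(solution, rng.randrange(len(solution)), 1)]


def variable_neighborhood_search(n, fitness, max_iterations=100, rng=None):
    """Pick one of the four structures at random in each iteration (n >= 3)."""
    rng = rng or random.Random()
    current = [rng.randint(0, 1) for _ in range(n)]
    best, best_fit = list(current), fitness(current)
    for _ in range(max_iterations):
        k = rng.choice([1, 2, 3, 4])
        scored = [(fitness(nb), nb) for nb in neighbors(current, k, rng)]
        current_fit, current = max(scored, key=lambda pair: pair[0])
        if current_fit > best_fit:
            best, best_fit = list(current), current_fit
    return best, best_fit
```

For n = 12 and x = 111111111111, `window_neighbors` yields 12, 11 and 10 neighbors for widths 1, 2 and 3, which matches the sizes of the N1, N2 and N3 example sets.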

3 Experiments

All experiments were run on an Intel Core(TM) i5-2217U CPU@1.70 GHz with 6 GB of RAM under Windows 8—64 bits, processor x64.

3.1 The dataset normalization

Dataset normalization, also called feature scaling, is a mandatory preprocessing step before starting the classification task. It prevents variables with larger numeric ranges from dominating those with smaller numeric ranges. The feature values are linearly scaled to the range \([-1,+1]\) or [0, 1]; formula (2) gives the scaling to \([-1,+1]\), where X denotes the original value and \(X'\) denotes the scaled value. \(MAX_{a}\) is the upper bound of the feature a, and \(MIN_{a}\) is the lower bound of the feature a.

Table 2
An instance of the Australian dataset with scaled values

| Variable number | Value | Scaled value |
|---|---|---|
| A1 | 1 | 1 |
| A2 | 22.08 | −0.749474 |
| A3 | 11.46 | −0.181429 |
| A4 | 2 | |
| A5 | 4 | 0.538462 |
| A6 | 4 | −0.25 |
| A7 | 1.585 | −0.888772 |
| A8 | 0 | −1 |
| A9 | 0 | −1 |
| A10 | 0 | −1 |
| A11 | 1 | 1 |
| A12 | 2 | |
| A13 | 100 | −0.9 |
| A14 | 1213 | −0.97576 |
| Class | | 0 |

Table 3
The results of 50 runs of LS+SVM on the German dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 77.200 | 17 | 2 | 77.200 | 12 |
| 3 | 77.100 | 5 | 4 | 77.400 | 12 |
| 5 | 77.400 | 12 | 6 | 77.400 | 10 |
| 7 | 77.400 | 14 | 8 | 77.000 | 9 |
| 9 | 77.600 | 11 | 10 | 77.600 | 10 |
| 11 | 77.100 | 14 | 12 | 77.200 | 10 |
| 13 | 77.100 | 12 | 14 | 77.400 | 10 |
| 15 | 77.300 | 10 | 16 | 77.300 | 9 |
| 17 | 77.400 | 7 | 18 | 77.400 | 14 |
| 19 | 77.100 | 11 | 20 | 77.100 | 13 |
| 21 | 77.100 | 14 | 22 | 77.300 | 9 |
| 23 | 77.300 | 12 | 24 | 77.500 | 12 |
| 25 | 77.500 | 11 | 26 | 77.300 | 15 |
| 27 | 77.100 | 12 | 28 | 77.200 | 12 |
| 29 | 77.300 | 12 | 30 | 77.300 | 15 |
| 31 | 77.500 | 11 | 32 | 77.000 | 11 |
| 33 | 77.400 | 13 | 34 | 77.500 | 11 |
| 35 | 77.400 | 13 | 36 | 77.200 | 16 |
| 37 | 77.700 | 13 | 38 | 77.200 | 15 |
| 39 | 77.200 | 14 | 40 | 77.300 | 10 |
| 41 | 77.400 | 14 | 42 | 77.600 | 13 |
| 43 | 77.500 | 16 | 44 | 77.700 | 15 |
| 45 | 77.000 | 9 | 46 | 77.300 | 10 |
| 47 | 77.200 | 9 | 48 | 77.200 | 12 |
| 49 | 77.200 | 12 | 50 | 77.500 | 13 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 77.00 | 77.20 | 77.30 | 77.31 | 77.40 | 77.70 |
| Number of selected variables | 5.0 | 10.0 | 12.0 | 11.9 | 14.0 | 17.0 |

The best accuracy is 77.700% (runs 37 and 44).

In our study, we scaled the different feature values to the range \([-1, +1]\).
$$\begin{aligned} X' = \frac{X - MIN_{a}}{MAX_{a} - MIN_{a}} \times 2 - 1. \end{aligned}$$
(2)
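Formula (2) can be sketched in a few lines of Python. The function names are illustrative assumptions, as is the convention of mapping a constant feature to the lower bound. The bounds 4 and 72 in the example are assumed for illustration and are not stated in the text; with them, the raw value 6 maps to about −0.941176, the scaled value listed for variable 2 in Table 1.

```python
def scale_value(x, min_a, max_a, lower=-1.0, upper=1.0):
    """Linearly map x from [min_a, max_a] to [lower, upper]; formula (2) when the range is [-1, +1]."""
    if max_a == min_a:
        return lower  # constant feature: assumed convention, not from the paper
    return (x - min_a) / (max_a - min_a) * (upper - lower) + lower


def scale_columns(rows):
    """Scale every column of a list of rows to [-1, +1] using its own min and max."""
    columns = list(zip(*rows))
    bounds = [(min(column), max(column)) for column in columns]
    return [
        [scale_value(value, low, high) for value, (low, high) in zip(row, bounds)]
        for row in rows
    ]


# Assumed bounds MIN_a = 4 and MAX_a = 72 for illustration.
example = scale_value(6, 4, 72)
```

Passing `lower=0.0` gives the alternative [0, 1] scaling mentioned in the text.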

3.2 The dataset description

To evaluate the performance of the proposed methods for credit scoring, we considered both the German and Australian credit datasets from the UCI (University of California at Irvine) Machine Learning Repository. The two credit datasets are described as follows:
1. The German credit dataset was proposed by Professor Hans Hofmann from Universität Hamburg. The dataset consists of 1000 instances. There are two classes: class 1 (worthy, 700 instances) and class 0 (unworthy, 300 instances). UCI provides two versions of the German dataset:
  • The original dataset german.data, which contains categorical/symbolic variables. It has 20 variables, of which 7 are numerical and 13 are categorical.
  • The "german.data-numeric" dataset, provided by Strathclyde University for use with algorithms that cannot cope with categorical variables. It has 24 numerical attributes. In our experiments, we worked on the "german.data-numeric" version.
An example of an instance of "german.data-numeric" before and after the scaling process is given in Table 1. Each instance describes the profile of a given applicant.
2. The Australian Credit Approval dataset was proposed by Quinlan [39]. It concerns credit card applications. The dataset consists of 690 instances of loan applicants. There are two classes: class 1 (worthy, 307 instances) and class 0 (unworthy, 383 instances). The number of variables is equal to 14. There are 6 numerical and 8 categorical variables. An example of an instance of "Australian" is given in Table 2.

3.3 Numerical results

Due to the non-deterministic nature of the proposed methods, 50 runs have been considered for each dataset and for each method. In the following, we give the results obtained with LS+SVM, SLS+SVM and VNS+SVM methods. We give the accuracy rate for each run for each method on each dataset.
We compute some summary statistics on accuracy and the number of selected variables. We give the minimum (Min), the mean, the median, the first quartile (first Qu.), the third quartile (third Qu.) and the maximum (Max). We give also the best solution found with the best accuracy for each dataset. The results are given in Tables 3, 4, 5, 6, 7, 8.
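The six summary statistics can be computed with Python's standard library, as in the sketch below. The paper does not state which quartile interpolation rule was used, so the `inclusive` method chosen here is an assumption, and the function name `summarize` is illustrative. The sample values are the accuracies of the first six runs in Table 3.

```python
import statistics


def summarize(values):
    """Return min, first quartile, median, mean, third quartile and max."""
    first_qu, median, third_qu = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "first_qu": first_qu,
        "median": median,
        "mean": statistics.mean(values),
        "third_qu": third_qu,
        "max": max(values),
    }


# Accuracies (%) of runs 1 to 6 of LS+SVM on the German dataset (Table 3).
summary = summarize([77.2, 77.2, 77.1, 77.4, 77.4, 77.4])
```

Applying `summarize` to all 50 accuracies of a table, and again to the 50 counts of selected variables, gives the two summary rows reported under each table, up to the choice of quartile rule.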

Table 4
The results of 50 runs of LS+SVM on the Australian dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 86.376 | 9 | 2 | 86.086 | 7 |
| 3 | 85.942 | 13 | 4 | 86.086 | 7 |
| 5 | 86.086 | 10 | 6 | 85.797 | 9 |
| 7 | 86.086 | 7 | 8 | 85.942 | 5 |
| 9 | 86.086 | 11 | 10 | 86.086 | 7 |
| 11 | 86.086 | 4 | 12 | 86.086 | 9 |
| 13 | 86.086 | 6 | 14 | 86.376 | 11 |
| 15 | 86.231 | 7 | 16 | 86.086 | 8 |
| 17 | 86.231 | 6 | 18 | 86.231 | 7 |
| 19 | 86.231 | 7 | 20 | 86.231 | 8 |
| 21 | 86.086 | 9 | 22 | 86.231 | 6 |
| 23 | 86.086 | 7 | 24 | 86.231 | 6 |
| 25 | 85.942 | 7 | 26 | 86.086 | 8 |
| 27 | 86.086 | 7 | 28 | 86.086 | 6 |
| 29 | 86.086 | 8 | 30 | 86.231 | 9 |
| 31 | 85.942 | 7 | 32 | 86.231 | 8 |
| 33 | 86.086 | 8 | 34 | 86.231 | 7 |
| 35 | 86.231 | 8 | 36 | 86.231 | 6 |
| 37 | 86.231 | 8 | 38 | 86.231 | 10 |
| 39 | 85.942 | 6 | 40 | 86.231 | 9 |
| 41 | 86.231 | 8 | 42 | 86.086 | 3 |
| 43 | 86.086 | 9 | 44 | 86.086 | 10 |
| 45 | 86.231 | 8 | 46 | 86.086 | 8 |
| 47 | 86.086 | 7 | 48 | 86.086 | 8 |
| 49 | 86.231 | 7 | 50 | 86.231 | 10 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 85.80 | 86.09 | 86.09 | 86.13 | 86.23 | 86.38 |
| Number of selected variables | 3.000 | 7.000 | 8.000 | 7.735 | 9.000 | 13.000 |

The best accuracy is 86.376% (runs 1 and 14).

Table 5
The results of 50 runs of SLS+SVM on the German dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 77.300 | 15 | 2 | 77.400 | 9 |
| 3 | 77.700 | 12 | 4 | 77.700 | 12 |
| 5 | 77.500 | 17 | 6 | 77.500 | 9 |
| 7 | 77.600 | 13 | 8 | 77.600 | 10 |
| 9 | 77.400 | 11 | 10 | 77.000 | 16 |
| 11 | 77.200 | 10 | 12 | 77.200 | 12 |
| 13 | 77.400 | 15 | 14 | 77.300 | 9 |
| 15 | 77.300 | 10 | 16 | 77.900 | 12 |
| 17 | 77.400 | 11 | 18 | 77.300 | 15 |
| 19 | 77.700 | 16 | 20 | 77.200 | 9 |
| 21 | 77.200 | 13 | 22 | 77.200 | 10 |
| 23 | 77.400 | 12 | 24 | 77.300 | 15 |
| 25 | 77.300 | 9 | 26 | 77.500 | 13 |
| 27 | 77.200 | 10 | 28 | 77.800 | 13 |
| 29 | 77.400 | 10 | 30 | 77.100 | 17 |
| 31 | 77.200 | 15 | 32 | 77.300 | 11 |
| 33 | 77.200 | 11 | 34 | 77.800 | 10 |
| 35 | 77.500 | 14 | 36 | 77.600 | 15 |
| 37 | 77.500 | 8 | 38 | 77.600 | 10 |
| 39 | 77.200 | 10 | 40 | 77.400 | 10 |
| 41 | 77.200 | 13 | 42 | 77.600 | 13 |
| 43 | 77.500 | 11 | 44 | 77.100 | 10 |
| 45 | 77.200 | 11 | 46 | 77.500 | 15 |
| 47 | 77.300 | 16 | 48 | 77.600 | 10 |
| 49 | 77.400 | 10 | 50 | 77.400 | 12 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 77.0 | 77.2 | 77.4 | 77.4 | 77.5 | 77.9 |
| Number of selected variables | 8.00 | 10.00 | 11.50 | 12.00 | 13.75 | 17.00 |

The best accuracy is 77.900% (run 16).

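Table 6
The results of 50 runs of SLS+SVM on the Australian dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 86.086 | 4 | 2 | 86.086 | 9 |
| 3 | 86.086 | 7 | 4 | 86.086 | 9 |
| 5 | 86.231 | 6 | 6 | 86.231 | 9 |
| 7 | 86.231 | 11 | 8 | 86.086 | 6 |
| 9 | 86.086 | 11 | 10 | 86.086 | 7 |
| 11 | 86.376 | 10 | 12 | 86.086 | 8 |
| 13 | 86.231 | 9 | 14 | 86.376 | 5 |
| 15 | 86.086 | 6 | 16 | 85.942 | 9 |
| 17 | 86.231 | 9 | 18 | 86.086 | 7 |
| 19 | 86.086 | 5 | 20 | 86.086 | 6 |
| 21 | 86.086 | 7 | 22 | 86.231 | 7 |
| 23 | 86.231 | 9 | 24 | 86.086 | 11 |
| 25 | 86.376 | 9 | 26 | 86.231 | 10 |
| 27 | 86.086 | 8 | 28 | 86.086 | 7 |
| 29 | 86.231 | 5 | 30 | 86.086 | 6 |
| 31 | 85.942 | 7 | 32 | 86.231 | 7 |
| 33 | 86.376 | 5 | 34 | 86.086 | 5 |
| 35 | 86.231 | 6 | 36 | 86.231 | 8 |
| 37 | 86.086 | 9 | 38 | 86.231 | 7 |
| 39 | 85.942 | 9 | 40 | 86.231 | 9 |
| 41 | 86.231 | 4 | 42 | 86.231 | 8 |
| 43 | 86.376 | 9 | 44 | 86.231 | 5 |
| 45 | 86.086 | 9 | 46 | 86.086 | 9 |
| 47 | 86.086 | 11 | 48 | 86.086 | 9 |
| 49 | 86.086 | 9 | 50 | 86.086 | 8 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 85.94 | 86.09 | 86.09 | 86.16 | 86.23 | 86.38 |
| Number of selected variables | 4.0 | 6.0 | 8.0 | 7.7 | 9.0 | 11.0 |

The best accuracy is 86.376% (runs 11, 14, 25, 33 and 43).
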
Table 7
The results of 50 runs of VNS+SVM on the German dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 77.400 | 14 | 2 | 77.700 | 14 |
| 3 | 77.300 | 11 | 4 | 77.600 | 12 |
| 5 | 77.500 | 13 | 6 | 77.300 | 13 |
| 7 | 77.400 | 12 | 8 | 77.300 | 12 |
| 9 | 78.000 | 16 | 10 | 77.600 | 10 |
| 11 | 77.300 | 15 | 12 | 77.800 | 9 |
| 13 | 77.200 | 11 | 14 | 77.500 | 10 |
| 15 | 77.400 | 11 | 16 | 77.400 | 13 |
| 17 | 77.300 | 13 | 18 | 77.200 | 15 |
| 19 | 77.500 | 14 | 20 | 77.500 | 9 |
| 21 | 77.400 | 13 | 22 | 77.300 | 14 |
| 23 | 77.800 | 14 | 24 | 77.800 | 14 |
| 25 | 77.200 | 12 | 26 | 77.500 | 8 |
| 27 | 77.300 | 12 | 28 | 77.300 | 7 |
| 29 | 77.800 | 13 | 30 | 77.200 | 11 |
| 31 | 77.200 | 13 | 32 | 77.400 | 12 |
| 33 | 77.300 | 9 | 34 | 77.600 | 10 |
| 35 | 77.900 | 11 | 36 | 77.500 | 11 |
| 37 | 77.400 | 14 | 38 | 77.400 | 15 |
| 39 | 77.400 | 9 | 40 | 77.400 | 15 |
| 41 | 77.300 | 13 | 42 | 77.200 | 14 |
| 43 | 77.500 | 14 | 44 | 77.800 | 13 |
| 45 | 77.600 | 10 | 46 | 77.700 | 15 |
| 47 | 77.500 | 13 | 48 | 77.600 | 12 |
| 49 | 77.300 | 13 | 50 | 77.400 | 17 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 77.20 | 77.30 | 77.40 | 77.46 | 77.60 | 78.00 |
| Number of selected variables | 7.00 | 11.00 | 13.00 | 12.36 | 14.00 | 17.00 |

The best accuracy is 78.000% (run 9).

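Table 8
The results of 50 runs of VNS+SVM on the Australian dataset

| Run number | Accuracy (%) | Number of selected variables | Run number | Accuracy (%) | Number of selected variables |
|---|---|---|---|---|---|
| 1 | 86.811 | 8 | 2 | 86.376 | 7 |
| 3 | 86.376 | 9 | 4 | 86.521 | 8 |
| 5 | 86.521 | 10 | 6 | 86.376 | 5 |
| 7 | 86.521 | 6 | 8 | 86.521 | 7 |
| 9 | 86.521 | 10 | 10 | 86.667 | 7 |
| 11 | 86.376 | 9 | 12 | 86.521 | 8 |
| 13 | 86.521 | 9 | 14 | 86.232 | 8 |
| 15 | 86.376 | 5 | 16 | 86.521 | 6 |
| 17 | 86.667 | 4 | 18 | 86.521 | 4 |
| 19 | 86.521 | 8 | 20 | 86.521 | 8 |
| 21 | 86.667 | 7 | 22 | 86.521 | 9 |
| 23 | 86.521 | 5 | 24 | 86.667 | 8 |
| 25 | 86.376 | 6 | 26 | 86.811 | 8 |
| 27 | 86.232 | 9 | 28 | 86.521 | 12 |
| 29 | 86.376 | 4 | 30 | 86.521 | 7 |
| 31 | 86.376 | 7 | 32 | 86.521 | 7 |
| 33 | 86.376 | 9 | 34 | 86.667 | 8 |
| 35 | 86.376 | 6 | 36 | 86.521 | 6 |
| 37 | 86.811 | 9 | 38 | 86.667 | 9 |
| 39 | 86.376 | 8 | 40 | 86.521 | 8 |
| 41 | 86.521 | 8 | 42 | 86.376 | 7 |
| 43 | 86.376 | 7 | 44 | 86.521 | 6 |
| 45 | 86.521 | 9 | 46 | 86.376 | 6 |
| 47 | 86.376 | 6 | 48 | 86.521 | 11 |
| 49 | 86.376 | 9 | 50 | 86.376 | 3 |

| Summary | Min. | First Qu. | Median | Mean | Third Qu. | Max. |
|---|---|---|---|---|---|---|
| Accuracy (%) | 86.23 | 86.38 | 86.52 | 86.50 | 86.52 | 86.81 |
| Number of selected variables | 3.0 | 6.0 | 8.0 | 7.4 | 9.0 | 12.0 |

The best accuracy is 86.811% (runs 1, 26 and 37).
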
From Tables 3, 4, 5, 6, 7, 8 we observe that the obtained results can have the same number of selected variables but different accuracy on different runs. The local search-based feature selection methods do not lead to the same solution when applied to the same problem. This is due to the non-deterministic nature of these methods. In addition, some variables have a significant effect on the solution quality which leads to improvements in accuracy when such variables are selected. We can conclude that the generated solutions are not unique.
We can obtain solutions with the same number of variables but with different accuracy rate because the selected variables are not always the same. For example: Table 8 shows that the solutions with eight selected variables found in run 1, run 4 and run 39 are not the same in spite of the same number of selected variables. The accuracy rates are 86.811, 86.521 and 86.376, respectively.
For instance, the following two solutions have 8 selected variables. The solution "0 1 1 1 1 1 1 0 0 1 0 0 1 0" has an accuracy rate equal to 86.811%. The selected variables are A2, A3, A4, A5, A6, A7, A10 and A13. The solution "1 1 1 0 0 1 0 1 1 0 1 0 0 1", however, has an accuracy rate equal to 86.521%. The selected variables are A1, A2, A3, A6, A8, A9, A11 and A14. This means that for the Australian dataset, the set of variables {A2, A3, A4, A5, A6, A7, A10, A13} is more significant than the set of variables {A1, A2, A3, A6, A8, A9, A11, A14}.
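The two bit strings quoted above can be decoded mechanically into variable names, which also makes it easy to see how much the two solutions overlap. The helper name `decode` is a hypothetical illustration; the "A" prefix follows the naming of the Australian variables in Table 2.

```python
def decode(mask_text, prefix="A"):
    """Turn a space-separated 0/1 string into the names of the selected variables."""
    bits = [int(bit) for bit in mask_text.split()]
    return [f"{prefix}{position}" for position, bit in enumerate(bits, start=1) if bit == 1]


first = decode("0 1 1 1 1 1 1 0 0 1 0 0 1 0")   # accuracy 86.811%
second = decode("1 1 1 0 0 1 0 1 1 0 1 0 0 1")  # accuracy 86.521%

# Variables selected by both solutions.
shared = [name for name in first if name in second]
```

Both solutions select eight variables but share only A2, A3 and A6, which illustrates why the same subset size can lead to different accuracy rates.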
According to the numerical results, the three methods succeed in finding good results for the two considered datasets. However, we see a slight performance advantage for the variable neighborhood search (VNS), which is able to find better solutions than LS and SLS. Hence, we can conclude that the VNS method with the four different neighborhood structures is effective for feature selection and classification.
The advantage of VNS is due to its good combination of intensification and diversification, which allows it to explore the search space effectively and locate good solutions.
In addition to the numerical results given in Tables 3, 4, 5, 6, 7, 8, we draw the boxplots given in Figs. 3 and 4 to better visualize the distribution of the classification accuracy values.
The boxplots in Figs. 3 and 4 show the distribution of classification accuracy over the 50 runs for each algorithm and for both the Australian and German datasets. They show clearly that, in general, VNS is able to produce good solutions. The results are promising and demonstrate the benefit of the proposed technique in feature selection. To further demonstrate the effectiveness of the proposed technique in credit scoring, we give further comparisons in the next subsection.

3.4 A comparison with a pure SVM

In this section, we compare the three proposed methods LS+SVM, SLS+SVM and VNS+SVM with a pure SVM on both the German and Australian datasets. The aim is to show the impact of feature selection on the classification task.
Table 9 gives the results obtained with the SVM, LS+SVM, SLS+SVM and VNS+SVM methods. For each method, we give the best accuracy rate and the number of significant variables in the best solution found.
As Table 9 shows, the three methods outperform the pure SVM. They find good results for the two considered datasets. SLS and LS are comparable, and both improve the accuracy rate of SVM.
Further, the VNS+SVM method is more effective than both LS+SVM and SLS+SVM on the Australian and German datasets. Fig. 5 compares a pure SVM with our approach in terms of accuracy rate, and Fig. 6 compares them in terms of the number of selected variables. Both figures show clearly the performance of our approach relative to SVM. We note that:
  • LS returns 13 significant selected variables for the German dataset: A2, A4, A10, A12, A13, A15, A17, A18, A19, A20, A22, A23 and A24. The accuracy rate is equal to 77.70%. The significant variables returned by LS for the Australian dataset are A2, A3, A4, A6, A7, A9, A10, A13 and A14. The accuracy rate is equal to 86.38% and the number of selected variables is 9.
  • SLS returns 12 significant selected variables for the German dataset which are: A1, A3, A10, A13, A15, A16, A17, A18, A19, A20, A22, A23 and A24. The accuracy rate is equal to 77.90%. The significant variables found by SLS for the Australian dataset are A2, A4, A6, A7, A9, A10, A11, A13 and A14. The accuracy rate is equal to 86.38% and the number of selected variables is 9.
  • VNS returns 16 significant selected variables for the German dataset: A2, A4, A5, A6, A9, A11, A12, A13, A14, A15, A16, A17, A18, A19, A22 and A24. The accuracy rate is equal to 78%. For the Australian dataset, VNS returns 8 significant variables: A2, A3, A4, A5, A6, A7, A10 and A13. The best accuracy rate is equal to 86.81%.
In this section, we compared the three proposed methods with a pure SVM to measure the effectiveness of the additional feature selection step. As shown in Table 9, the proposed methods perform better than the pure SVM on both the German and Australian datasets.
Further, we remark that the models used by SVM contain fewer features than the original datasets. This suggests that some features in the datasets are redundant and can be eliminated to enhance the classification accuracy.
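The size of these effects can be read off Table 9 with a short calculation. The sketch below copies the table values into a dictionary. The names `table9` and `compare_to_baseline` are ours and serve only as illustration.

```python
# Values copied from Table 9: (best accuracy %, number of attributes used).
table9 = {
    "SVM":     {"Australian": (85.50, 14), "German": (75.60, 24)},
    "LS+SVM":  {"Australian": (86.38, 9),  "German": (77.70, 13)},
    "SLS+SVM": {"Australian": (86.38, 9),  "German": (77.90, 12)},
    "VNS+SVM": {"Australian": (86.81, 8),  "German": (78.00, 16)},
}


def compare_to_baseline(method, dataset):
    """Return (accuracy gain in points, % of attributes removed) versus pure SVM."""
    base_accuracy, base_count = table9["SVM"][dataset]
    accuracy, count = table9[method][dataset]
    gain = round(accuracy - base_accuracy, 2)
    removed = round(100 * (base_count - count) / base_count, 1)
    return gain, removed


for method in ("LS+SVM", "SLS+SVM", "VNS+SVM"):
    for dataset in ("Australian", "German"):
        gain, removed = compare_to_baseline(method, dataset)
        print(f"{method:8s} {dataset:10s} +{gain:.2f} points, {removed:.1f}% fewer attributes")
```

For VNS+SVM this gives a gain of 1.31 points with 42.9% fewer attributes on the Australian dataset, and 2.40 points with 33.3% fewer attributes on the German dataset.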

3.5 Further comparison

To show the performance of the proposed approaches in credit scoring, we evaluated them against some well-known classifiers available in the WEKA data mining software package [43].
We compared our approaches with the following popular classifiers: the rule-learning scheme (PART), ZeroR, JRip, BayesNet, NaiveBayes, adaBoost, attributeSelectedClassifier, Bagging, RandomForest, RandomTree and J48. These eleven classifiers from WEKA [43] were used with the default parameters originally set in WEKA.
We also add a comparison with two well-known filtering methods: correlation-based feature selection with best-first search (CFS) and the information gain ranking filter (IGRF). CFS is a correlation-based feature selection method that can be used to select a set of variables. However, CFS is unable to select all relevant variables when there are strong dependencies between variables. The IGRF ranking filter selects a set of variables from the original dataset using scores or weights [19, 37]. We combined these two feature selection methods (CFS and IGRF) with SVM to classify the data.
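To make the idea of an information gain ranking concrete, the following sketch scores discrete features by the reduction in class entropy they achieve. It uses the standard textbook definition. The paper does not detail its IGRF configuration, so the function names and the four-row toy data below are purely illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: four applicants, two discrete features.
labels = ["good", "good", "bad", "bad"]
features = {
    "informative": ["yes", "yes", "no", "no"],  # separates the classes perfectly
    "irrelevant": ["x", "y", "x", "y"],         # carries no class information
}

# Rank features from highest to lowest information gain.
ranking = sorted(features, key=lambda name: information_gain(features[name], labels), reverse=True)
print(ranking)
```

A ranking filter of this kind scores each variable independently, which is why it cannot account for interactions between variables the way a wrapper search over subsets can.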
Table 10 compares the three proposed methods (LS+SVM, SLS+SVM and VNS+SVM), the classifiers from WEKA, CFS+SVM and IGRF+SVM on the two considered datasets, Australian and German. The comparison is in terms of the average classification accuracy rates.
As shown in Table 10, the three proposed approaches (LS+SVM, SLS+SVM and VNS+SVM) are comparable to the well-known classifiers, with a slight advantage in favor of our VNS+SVM method. VNS+SVM gives a higher average classification accuracy than PART, JRip, BayesNet, NaiveBayes, adaBoost, attributeSelectedClassifier, Bagging, RandomForest, RandomTree and J48 on both the Australian and German datasets.
Further, we remark that the OneR and VNS+SVM classifiers are comparable on the Australian dataset, whereas VNS+SVM is better than OneR on the German dataset. OneR gives an average accuracy of 86.6% on the Australian dataset but fails on the German dataset, where its average accuracy is 60.8%. The VNS+SVM method succeeds in finding good results for both datasets. For the Australian dataset, VNS+SVM gives an average accuracy of 86.50% when VNS is used as a feature selection method within the SVM classifier. For the German dataset, VNS+SVM gives the best average accuracy of all considered classifiers, 77.46%.
When we compare the feature selection methods (CFS, IGRF and our three local search methods), we can see that our approaches provide good results compared with both the CFS and IGRF methods. For example, on the German dataset, SVM with CFS gives an average accuracy of 72.70%, while SVM with IGRF gives an average accuracy of 75.6%. The results are better when we use our proposed approaches, in particular VNS with SVM. As already noted, VNS+SVM gives the best average accuracy for the German dataset, 77.46%. This performance is also confirmed on the Australian dataset, with an average accuracy of 86.50%.
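The ranking discussed above can be reproduced directly from the values in Table 10. The sketch below copies them into a dictionary and reports the best method per dataset, as well as the method with the highest mean over both datasets. The names `table10` and `mean_accuracy` are ours.

```python
# Values copied from Table 10: (Australian %, German %) average accuracy.
table10 = {
    "PART": (84.1, 69.6),
    "ZeroR": (30.8, 49.0),
    "OneR": (86.6, 60.8),
    "JRip": (85.7, 69.4),
    "BayesNet": (86.1, 74.6),
    "NaiveBayes": (79.2, 74.3),
    "adaBoost": (84.4, 66.1),
    "attributeSelectedClassifier": (83.7, 69.3),
    "Bagging": (85.5, 73.2),
    "RandomForest": (86.2, 75.1),
    "RandomTree": (76.9, 66.7),
    "J48": (86.1, 68.7),
    "CFS+SVM": (73.19, 72.70),
    "IGRF+SVM": (85.5, 75.6),
    "LS+SVM": (86.13, 77.31),
    "SLS+SVM": (86.16, 77.40),
    "VNS+SVM": (86.50, 77.46),
}


def mean_accuracy(method):
    """Unweighted mean of the two per-dataset average accuracies."""
    australian, german = table10[method]
    return (australian + german) / 2


best_australian = max(table10, key=lambda m: table10[m][0])
best_german = max(table10, key=lambda m: table10[m][1])
best_overall = max(table10, key=mean_accuracy)

print("Best on Australian:", best_australian)
print("Best on German:", best_german)
print("Best mean over both:", best_overall, round(mean_accuracy(best_overall), 2))
```

OneR leads on the Australian dataset by 0.1 points, while VNS+SVM leads on the German dataset and has the highest mean over both, which matches the discussion above.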
Table 9
SVM vs. LS+SVM vs. SLS+SVM vs. VNS+SVM

Method     Measure                                                            Australian   German
SVM        Accuracy %                                                         85.50        75.6
SVM        Number of attributes                                               14           24
LS+SVM     Max accuracy %                                                     86.38        77.70
LS+SVM     Number of significant selected attributes (best solution found)    9            13
SLS+SVM    Max accuracy %                                                     86.38        77.90
SLS+SVM    Number of significant selected attributes (best solution found)    9            12
VNS+SVM    Max accuracy %                                                     86.81        78.00
VNS+SVM    Number of significant selected attributes (best solution found)    8            16
Bold values represent the best result
Table 10
A comparison according to the average classification accuracy rates

Method                         Australian   German
PART                           84.1         69.6
ZeroR                          30.8         49
OneR                           86.6         60.8
JRip                           85.7         69.4
BayesNet                       86.1         74.6
NaiveBayes                     79.2         74.3
adaBoost                       84.4         66.1
attributeSelectedClassifier    83.7         69.3
Bagging                        85.5         73.2
RandomForest                   86.2         75.1
RandomTree                     76.9         66.7
J48                            86.1         68.7
CFS+SVM                        73.19        72.70
IGRF+SVM                       85.5         75.6
LS+SVM                         86.13        77.31
SLS+SVM                        86.16        77.40
VNS+SVM                        86.50        77.46
Bold values represent the best result
In conclusion, the three proposed approaches (LS+SVM, SLS+SVM and VNS+SVM) are comparable. However, the most promising results are obtained when SVM is combined with the VNS-based feature selection method. This improvement is observed on the two considered datasets, which supports the suitability of VNS+SVM as a classifier for credit scoring.

4 Conclusion

This paper studied three local search-based feature selection methods combined with an SVM model for credit scoring. The proposed model finds the best set of features (also called significant attributes or variables) by removing irrelevant variables and keeping only the appropriate ones. This set of features is then used with the SVM classifier to classify the data. We studied three variants of local search-based feature selection: local search hill climbing, stochastic local search and variable neighborhood search. The three variants combined with SVM were evaluated on the two well-known German and Australian credit scoring datasets. The proposed methods achieve good accuracy with fewer features. The proposed VNS+SVM method performs better on both the German and Australian datasets than SVM, LS+SVM, SLS+SVM and other well-known classifiers. We plan to improve our work by optimizing the SVM parameters. Further, it would be worthwhile to study the impact of feature selection-based methods on other machine-learning techniques.

Acknowledgements

The authors would like to thank the developers of the Library for Support Vector Machines (LIBSVM) for providing the open source code. The authors would also like to thank the developers of the Waikato Environment for Knowledge Analysis (WEKA).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Abdou, H.A.: Genetic programming for credit scoring: the case of Egyptian public sector banks. Expert Syst. Appl. 36, 11402–11417 (2009)
2. Abellán, J., Mantas, C.J.: Improving experimental studies about ensembles of classifiers for bankruptcy prediction and credit scoring. Expert Syst. Appl. 41, 3825–3830 (2014)
3. Bellotti, T., Crook, J.: Support vector machines for credit scoring and discovery of significant features. Expert Syst. Appl. 36, 3302–3308 (2009)
4. Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth, Belmont (1984)
5. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2, 121–167 (1998)
6. Boughaci, D., Alkhawaldeh, A.A.K.: A cooperative classification system for credit scoring. In: Proceedings of AUEIRC 2017. Springer (2017) (to appear)
7. Boughaci, D., Benhamou, B., Drias, H.: A memetic algorithm for the optimal winner determination problem. Soft Comput. 13, 905–917 (2009)
8. Boughaci, D.: Meta-heuristic approaches for the winner determination problem in combinatorial auction. In: Yang, X.S. (ed.) Artificial Intelligence, Evolutionary Computing and Metaheuristics, Studies in Computational Intelligence, vol. 427, pp. 775–791. Springer, Berlin, Heidelberg (2013)
10. Caruana, R., Freitag, D.: Greedy attribute selection. In: Proceedings of the eleventh international conference on machine learning (ICML 1994, New Brunswick, New Jersey). Morgan Kaufmann, pp. 28–36 (1994)
11. Campbell, C., Ying, Y.: Learning with Support Vector Machines. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan and Claypool, San Rafael (2011)
12. Chakraborty, B.: Genetic algorithm with fuzzy fitness function for feature selection. In: Proceedings of the IEEE international symposium on industrial electronics, vol. 1, pp. 315–319 (2002)
15. Desai, V.S., Crook, J.N., Overstreet, G.A.: A comparison of neural networks and linear scoring models in the credit union environment. Eur. J. Oper. Res. 95, 24–37 (1996)
16. Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian network classifiers. Mach. Learn. 29, 131–163 (1997)
17. Goldberg, D.E., Korb, B., Deb, K.: Messy genetic algorithms: motivation, analysis, and first results. Complex Syst. 5(3), 493–530 (1989)
19. Hall, M.A.: Correlation-based feature selection for machine learning. Methodology 21i195i20, 15 (1999). April
20. Hand, D.J., Henley, W.E.: Statistical classification methods in consumer credit scoring. J. R. Stat. Soc. Ser. A (Stat. Soc.) 160, 523–541 (1997)
21. Henley, W.E., Hand, D.J.: A k-nearest neighbour classifier for assessing consumer credit risk. Statistician 45, 77–95 (1996)
23. Hansen, P., Mladenovic, N.: Variable neighbourhood search: principles and applications. Eur. J. Oper. Res. 130, 449–467 (2001)
24. Hertz, J.A., Krogh, A., Palmer, R.G.: Introduction to the Theory of Neural Computation. Addison-Wesley Publishing Company Inc, Redwood City (1991)
25. Hoos, H., Stützle, T.: Stochastic Local Search: Foundations and Applications. Morgan Kaufmann, San Francisco (2005)
26. John, G.H., Langley, P.: Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence. Morgan Kaufmann, San Mateo, pp. 338–345 (1995)
27. Ju, Y., Sohn, S.Y.: Technology credit scoring based on a quantification method. Sustainability 9(6), 1057 (2017). (Multidisciplinary Digital Publishing Institute)
29. Kohavi, R., John, G.: Wrappers for feature subset selection. Artificial intelligence, Special issue on relevance, 273–324 (1996)
30. Lanzi, P.L.: Fast feature selection with genetic algorithms: a filter approach. In: IEEE international conference on evolutionary computation, vol. 25, pp. 537–540 (1997)
31. Li, J., Wei, L., Li, G., Xu, W.: An evolution strategy-based multiple kernels multi-criteria programming approach: the case of credit decision making. Decis. Support Syst. 51, 292–298 (2011)
32.
33. Miller, M.: Research confirms value of credit scoring. Natl. Underwrit. 107(42), 30 (2003)
34. Moscato, P.: On evolution, search, optimization, genetic algorithms and martial arts: towards memetic algorithms. In: Caltech concurrent computation program, C3P Report 826 (1989)
37. Phyu, T.N.: Survey of classification techniques in data mining. In: Proceedings of the international multi conference of engineers and computer scientists, vol. I, IMECS 2009, March 18–20, 2009, Hong Kong (2009)
38. Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo (1992)
39. Quinlan, J.R.: Simplifying decision trees. Int. J. Man Mach. Stud. 27, 221–234 (1987)
40. Sousa, M.R., Gama, J., Brandão, E.: A new dynamic modeling framework for credit risk assessment. Expert Syst. Appl. 45, 341–351 (2016)
41. Tan, K.C., Teoh, E.J., Goh, K.C., Yu, Q.: A hybrid evolutionary algorithm for attribute selection in data mining. Expert Syst. Appl. 36, 8616–8630 (2009)
42. Vapnik, V.: Statistical Learning Theory. John Wiley and Sons, New York (1998)
44. Wiginton, J.C.: A note on the comparison of logistic and discriminant models of consumer credit behavior. J. Financ. Quant. Anal. 15, 757–770 (1980)
45. Yang, X.-S.: Harmony search as a metaheuristic algorithm. In: Geem, Z.W. (ed.) Music-Inspired Harmony Search Algorithm: Theory and Applications, Studies in Computational Intelligence. Springer, Berlin (2009)
Metadata
Title: Three local search-based methods for feature selection in credit scoring
Authors: Dalila Boughaci, Abdullah Ash-shuayree Alkhawaldeh
Publication date: 28.05.2018
Publisher: Springer Berlin Heidelberg
Published in: Vietnam Journal of Computer Science, Issue 2/2018
Print ISSN: 2196-8888
Electronic ISSN: 2196-8896
DOI: https://doi.org/10.1007/s40595-018-0107-y
