Classifier ensemble construction with rotation forest to improve medical diagnosis performance of machine learning algorithms

https://doi.org/10.1016/j.cmpb.2011.03.018

Abstract

Improving the accuracy of machine learning algorithms is vital in designing high-performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms to evaluate their classification performance on Parkinson's, diabetes and heart disease datasets from the literature.

In the experiments, the feature dimension of the three datasets is first reduced using the correlation-based feature selection (CFS) algorithm. Second, the classification performance of 30 machine learning algorithms is calculated for the three datasets. Third, 30 classifier ensembles are constructed with the RF algorithm to assess the performance of the respective classifiers on the same disease data. All experiments are carried out with a leave-one-out validation strategy, and the performance of the 60 algorithms is evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC).

Base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases.

RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms in the design of advanced CADx systems.

Introduction

Machine learning algorithms have been successfully applied to the design of CADx systems. These algorithms are first trained with diagnosed samples, i.e. with previous diagnoses made by medical experts. In the test phase, the algorithms are then used to assist medical experts in diagnosing future samples [1]. In this respect, the success of an analysis strategy can be defined as the ability of the algorithm to predict the correct status (normal or disease) of unseen data.

The performance of CADx systems might be enhanced with more accurate machine learning algorithms. The predictive ability of such analysis methods can be improved mainly with two strategies: (i) application of feature selection methods to the dataset [2], and (ii) construction of classifier ensembles [3].

The accuracy of classification strategies can be negatively affected by the use of too many features. This may lead to overfitting, in which noisy or irrelevant features decrease classification accuracy because of the finite size of the training sample [4]. In general, there are two widely used feature selection strategies: (i) filter approaches and (ii) wrappers. Wrapper methods find feature subsets based on the performance of a preselected classification algorithm on a training dataset. In contrast, filters rely on properties of the features themselves to select the best feature subset. While selecting a subset of features, both approaches use a search procedure such as individual ranking, forward search or backward search [5]. In this context, CFS is a multivariate filter approach that evaluates the strength of features to return the most relevant variables [6]. In the literature, CFS has been used for feature selection in various medical diagnosis applications [7], [8], [9], [10].
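As a rough illustration of how a multivariate filter such as CFS can score a candidate subset, the sketch below computes the merit heuristic commonly associated with CFS: merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The function name and the example correlation values are illustrative assumptions and are not taken from the study.

```python
import math

def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a k-feature subset under the commonly cited CFS heuristic."""
    if k < 1:
        raise ValueError("k must be at least 1")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator

# Hypothetical values: 4 features, each moderately correlated with the class.
low_redundancy = cfs_merit(4, 0.4, 0.1)   # features weakly correlated with each other
high_redundancy = cfs_merit(4, 0.4, 0.8)  # features strongly correlated with each other
```

With equal feature-class correlation, the subset whose features are less correlated with one another receives the higher merit, which is why such a filter tends to discard redundant variables.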

A powerful technique in machine learning for increasing the accuracy of conventional base classifiers is to construct classifier ensembles. An ensemble classifier consists of base classifiers that learn a target function by combining their predictions [11]. Ensemble learning approaches seen in the machine learning literature include composite classifier systems, mixture of experts, consensus aggregation, dynamic classifier selection, classifier fusion and committees of neural networks [12]. In the machine learning literature, there are various CADx applications that use classifier ensembles (particularly the RF algorithm) to improve the accuracy of conventional classifiers [13], [14], [15], [16], [17], [18].
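To make the idea of combining predictions concrete, the following minimal sketch uses majority voting, one common combination rule. It is not claimed to be the rule used by the RF ensembles in this study; the function names and the toy predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most base classifiers for one sample."""
    counts = Counter(predictions)
    # most_common(1) gives the most frequent label; ties go to the label seen first.
    return counts.most_common(1)[0][0]

def ensemble_predict(per_classifier_predictions):
    """Combine per-classifier prediction lists (one list per classifier), sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_predictions)]

# Hypothetical outputs of three base classifiers on four samples.
votes = [
    ["disease", "normal", "normal", "disease"],
    ["disease", "disease", "normal", "normal"],
    ["normal", "disease", "normal", "disease"],
]
combined = ensemble_predict(votes)
```

Each base classifier in this toy example disagrees with the other two on one sample, and the vote follows the two that agree.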

Besides the accuracy of the base classifiers, the performance of an ensemble algorithm is affected by the diversity of the classifiers forming the ensemble. Diverse classifiers make different errors on different samples, so combining such classifiers might lead to more accurate decisions [19].
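One simple way to quantify diversity between two classifiers is the fraction of samples on which their predictions differ. The sketch below uses this pairwise disagreement measure as an illustration only; the study does not state which diversity measure, if any, it relies on, and the prediction vectors are hypothetical.

```python
def disagreement(preds_a, preds_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and of equal length")
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)

# Hypothetical predictions; the true labels would be [1, 1, 0, 0, 1, 0].
clf_a = [1, 0, 0, 0, 1, 0]  # wrong on the second sample
clf_b = [1, 1, 0, 1, 1, 0]  # wrong on the fourth sample
clf_c = [1, 0, 0, 0, 1, 0]  # makes exactly the same error as clf_a

diverse_pair = disagreement(clf_a, clf_b)
identical_pair = disagreement(clf_a, clf_c)
```

Here clf_a and clf_b err on different samples and so disagree on two of six samples, whereas clf_a and clf_c never disagree and would add nothing to each other in an ensemble.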

This paper presents an evaluation study that can help in designing CADx systems with increased performance. The strategy is based on a two-step approach to constructing classifiers with enhanced accuracy. In the first step, the feature dimensions of three benchmark datasets are reduced using the CFS algorithm. In the second step, 30 base classifiers and the corresponding RF classifier ensembles are used in the diagnosis of Parkinson's, heart and diabetes diseases to evaluate the resulting accuracies of the algorithms. All the experiments are validated with a leave-one-out (10-fold cross validation) scheme.
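The leave-one-out idea can be sketched as follows: each sample is held out once, a classifier is trained on the remaining samples, and the held-out sample is predicted. The 1-nearest-neighbor classifier and the toy data below are assumptions chosen to keep the example self-contained; they are not the classifiers or datasets of the study.

```python
def nearest_neighbor_predict(train_X, train_y, x):
    """Predict the label of x as the label of its closest training sample (1-NN)."""
    distances = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[distances.index(min(distances))]

def leave_one_out_accuracy(X, y, predict):
    """Hold out each sample once, train on the rest, return the fraction predicted correctly."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Toy two-class data (hypothetical, not from the benchmark datasets).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
y = [0, 0, 0, 1, 1, 1]
loo_acc = leave_one_out_accuracy(X, y, nearest_neighbor_predict)
```

Because every sample serves as test data exactly once, the estimate uses all available data, at the cost of training one model per sample.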

Section snippets

Overview

In this section, we explain the technique used to create classifiers with improved accuracy. First, our feature selection strategy, i.e. CFS, is introduced. Following the CFS explanation, the RF ensemble classification scheme is explained in detail. Next, the datasets used to evaluate classifier performance are briefly introduced. Section 2 ends with an explanation of the evaluation metrics used throughout the experiments.
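The three metrics named in the abstract (ACC, KE and AUC) can be sketched for a binary problem as below. This is a minimal sketch that assumes KE denotes Cohen's kappa statistic and computes AUC as the probability that a positive sample is scored above a negative one; the function names, scores and the 0.5 threshold are illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and truth, corrected for chance agreement."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = accuracy(y_true, y_pred)
    expected = sum(
        (y_true.count(label) / n) * (y_pred.count(label) / n) for label in labels
    )
    # Undefined (division by zero) when expected agreement equals 1.
    return (observed - expected) / (1 - expected)

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for three positive and three negative samples.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

Accuracy and kappa depend on the thresholded labels, while AUC depends only on the ranking of the scores, so the three metrics can rank classifiers differently.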

Variable selection with CFS algorithm

In a classification problem, goodness of features from correlation point of

The benchmarking data with the application of CFS algorithm

We utilized three medical datasets, i.e. diabetes, heart and Parkinson's, from the UCI machine learning repository for benchmarking purposes.

The diabetes dataset contains 768 samples, each described by the 8 features listed in Table 1. The dataset has two classes, negative for diabetes and positive for diabetes, with 500 and 268 samples, respectively.
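The class counts above imply a useful reference point: a trivial classifier that always predicts the majority class. The short calculation below derives that baseline from the stated counts; the variable names are illustrative.

```python
negative, positive = 500, 268   # class counts stated for the diabetes dataset
total = negative + positive     # 768 samples
baseline_acc = max(negative, positive) / total
print(f"{baseline_acc:.2%}")    # prints 65.10%
```

The 72.15% average accuracy reported for the base classifiers on the diabetes data can be read against this majority-class baseline of roughly 65.10%.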

With the application of CFS algorithm to diabetes dataset, the features with IDs of {2,6,7,8} are retained while

Machine learning algorithms and their abbreviations used in the study

To compare the performance of widely used machine learning algorithms with that of their corresponding RF classifier ensembles, we selected 30 algorithms from the Weka data mining software. In selecting the algorithms, we attempted to maintain diversity among them. For ease of evaluation, all figures and tables use the ID numbers of the algorithms in place of their names. The ID numbers and respective names of the algorithms are given in Table 4. We used default

Experimental results

In this section, the results of the experiments for the diabetes, Cleveland heart and Parkinson's datasets are given in Tables 5, 6 and 7, respectively. In the tables, ‘e’ denotes the measures of the RF classifier ensemble corresponding to a base classifier, ‘Diff’ means ‘Difference’ and ‘AVG’ stands for ‘average’.
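As a small worked example of the ‘Diff’ idea, the sketch below applies it to the average accuracies reported in the abstract. It assumes ‘Diff’ is computed as the ensemble value minus the base value; the tables themselves report per-classifier figures that are not reproduced here.

```python
# Average accuracies (%) as reported in the abstract.
base_avg = {"diabetes": 72.15, "heart": 77.52, "Parkinson's": 84.43}
ensemble_avg = {"diabetes": 74.47, "heart": 80.49, "Parkinson's": 87.13}

# Assumed definition: Diff = ensemble accuracy minus base accuracy, in percentage points.
diff = {name: round(ensemble_avg[name] - base_avg[name], 2) for name in base_avg}
```

Under this assumption the average gains are 2.32, 2.97 and 2.70 percentage points for the diabetes, heart and Parkinson's datasets, respectively.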

When Table 5 is examined with the three metrics (ACC, KE and AUC) simultaneously, the performance of 24 out of 30 base classifiers is seen to be improved by the use of the corresponding RF classifier ensemble

Conclusion and remarks

Machine learning applications, particularly CADx systems, need classifiers with enhanced accuracy. Such applications generally require a two-step approach: (i) a relevant feature selection algorithm to find the most powerful features and (ii) a high-accuracy classifier to obtain the highest classification performance.

In our study, we did not evaluate the effect of the feature selection algorithm on classifier performance. Instead, we used a simple CFS algorithm to decrease the feature size

References (30)

  • K. Michalak et al.
  • A. Mendiburu et al. Parallel and multi-objective EDAs to create multivariate calibration models for quantitative chemical applications
  • I. Skrypnyk. Comparison of feature selection strategies for hearing impairments diagnostics
  • A.G. Karegowda et al. Cascading GA; CFS for feature subset selection in medical data mining
  • Y. Cheng-San et al. A hybrid approach for selecting gene subsets using gene expression data
