Differential diagnosis of CT focal liver lesions using texture features, feature selection and ensemble driven classifiers

doi:10.1016/j.artmed.2007.05.002

Artificial Intelligence in Medicine

Volume 41, Issue 1, September 2007, Pages 25-37

https://doi.org/10.1016/j.artmed.2007.05.002

Summary

Objectives

The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed.

Materials and methods

A number of regions of interest (ROIs) corresponding to C1–C4 were defined by experienced radiologists in non-enhanced liver CT images. For each ROI, five distinct sets of texture features were extracted using first order statistics, spatial gray level dependence matrix, gray level difference method, Laws’ texture energy measures, and fractal dimension measurements. Two different ECs were constructed and compared. The first consisted of five multilayer perceptron neural networks (NNs), each using as input one of the computed texture feature sets or its reduced version after genetic algorithm-based feature selection. The second EC comprised five different primary classifiers, namely one multilayer perceptron NN, one probabilistic NN, and three k-nearest neighbor classifiers, each fed with the combination of the five texture feature sets or their reduced versions. The final decision of each EC was obtained using appropriate voting schemes, while bootstrap re-sampling was used to estimate the generalization ability of the CAD architectures from the relatively small data set available.
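As an illustration of the simplest of the five feature families, the following minimal sketch computes a few first order statistics from the gray levels of a single ROI. The choice of descriptors (mean, variance, skewness, kurtosis), the list-of-rows input format, and the function name `first_order_statistics` are assumptions made for illustration; the exact feature set used in the study is not specified in this summary.

```python
from math import sqrt


def first_order_statistics(roi):
    """Return mean, variance, skewness and kurtosis of the gray levels in a 2-D ROI.

    `roi` is a list of rows of gray-level values (hypothetical input format).
    """
    pixels = [float(p) for row in roi for p in row]
    if not pixels:
        raise ValueError("ROI must contain at least one pixel")
    n = len(pixels)
    mean = sum(pixels) / n
    # Population variance of the gray levels inside the ROI.
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = sqrt(variance)
    if std == 0.0:
        # A uniform ROI has no spread, so the shape descriptors are set to zero.
        skewness = kurtosis = 0.0
    else:
        skewness = sum(((p - mean) / std) ** 3 for p in pixels) / n
        kurtosis = sum(((p - mean) / std) ** 4 for p in pixels) / n
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Toy 2x2 ROI, not data from the study.
features = first_order_statistics([[1, 2], [3, 4]])
```

Such descriptors depend only on the gray-level histogram of the ROI and ignore the spatial arrangement of pixels, which is why they are typically complemented by second-order and fractal features.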

Results

The best mean classification accuracy (84.96%) is achieved by the second EC using a fused feature set, and the weighted voting scheme. The fused feature set was obtained after appropriate feature selection applied to specific subsets of the original feature set.

Conclusions

The comparative assessment of the various CAD architectures shows that combining three types of classifiers with a voting scheme, fed with identical feature sets obtained after appropriate feature selection and fusion, may result in an accurate system able to assist differential diagnosis of focal liver lesions from non-enhanced CT images.

Introduction

One of the most common and robust imaging techniques for the detection of hepatic lesions is computed tomography (CT) [1]. Although the quality of CT images has improved significantly in recent years, in some cases it remains difficult, even for experienced doctors, to make a fully accurate diagnosis. In these cases, the diagnosis has to be confirmed either by administration of contrast agents, which is associated with renal toxicity and allergic reactions, or by invasive procedures (biopsies). In recent years, along with developments in image processing and artificial intelligence, computer-aided diagnosis (CAD) systems aiming at the characterization of liver tissue have attracted much attention, since they can provide diagnostic assistance to clinicians and contribute to reducing the number of required biopsies.

Various approaches, most of them using ultrasound B-scan and CT images, have been proposed based on different image characteristics, such as texture features estimated from first- and second-order gray level statistics and fractal dimension estimators, combined with various classifiers [2], [3], [4], [5]. Texture analysis of liver CT images based on spatial gray level dependence matrix (SGLDM), gray level run length method (GLRLM), and gray level difference method (GLDM) has been proposed by Mir et al. [6], in order to discriminate normal from malignant hepatic tissue. Chen et al. [7] have applied SGLDM texture features to a probabilistic neural network (P-NN) for the characterization of hepatic tissue (hepatoma and hemangioma) from CT images. Additionally, SGLDM-based texture features fed to a system of three sequentially placed neural networks (NNs) have been used by Gletsos et al. [8] for the classification of hepatic tissue into four categories. Although a lot of effort has been devoted to liver tissue characterization, the developed systems are mostly limited to two or three classes of liver tissue and/or do not benefit from the interaction of different texture characterization methods or the combination of different classifiers.
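To make the SGLDM idea concrete, the sketch below builds a normalized, symmetric gray level co-occurrence matrix for one pixel displacement and derives three commonly used descriptors from it. The displacement `(dx, dy)`, the number of gray levels, the particular descriptors, and the function names are illustrative assumptions and are not taken from the cited works.

```python
from math import log2


def cooccurrence_matrix(roi, levels, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix of a 2-D ROI.

    `roi` holds integer gray levels in [0, levels); (dx, dy) is the displacement.
    """
    rows, cols = len(roi), len(roi[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = roi[r][c], roi[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("ROI is too small for the requested displacement")
    return [[v / total for v in row] for row in counts]


def sgldm_features(p):
    """Angular second moment, contrast and entropy of a co-occurrence matrix."""
    levels = len(p)
    asm = sum(v * v for row in p for v in row)
    contrast = sum(
        (i - j) ** 2 * p[i][j] for i in range(levels) for j in range(levels)
    )
    entropy = -sum(v * log2(v) for row in p for v in row if v > 0)
    return {"asm": asm, "contrast": contrast, "entropy": entropy}


# Toy ROI with two gray levels, not data from the study.
matrix = cooccurrence_matrix([[0, 0], [1, 1]], levels=2)
texture = sgldm_features(matrix)
```

In practice the matrix is usually computed for several displacements and directions, and the resulting descriptors are concatenated or averaged into one feature vector per ROI.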

In order to select the most robust characteristics from an initial high-dimensional feature set, which might be derived from different feature extraction techniques, feature selection methods can be applied. Deterministic or stochastic feature selection methods decrease the feature extraction costs of the classification system and may also enhance its performance [9]. In recent years, an increasing number of researchers have been using genetic algorithms (GAs) for dimensionality reduction. The use of GAs for feature selection was first introduced in 1989 [10], and since then, GAs have been successfully applied to a broad spectrum of dimensionality reduction studies [11], [12], [13]. According to Ref. [8], the use of GAs results in more robust feature vectors than deterministic feature selection techniques in problems related to liver tissue classification from CT images.
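The general mechanism of GA-based feature selection can be sketched as follows: each chromosome is a binary mask over the features, and a fitness function scores each mask. This is a generic sketch, not the configuration used in the cited studies; the operators (tournament selection, single-point crossover, bit-flip mutation, elitism), all parameter values, and the toy fitness function are assumptions. In a real system the fitness would typically be the validation accuracy of a classifier trained on the selected features.

```python
import random


def ga_select_features(n_features, fitness, pop_size=20, generations=30,
                       crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a binary feature mask that maximizes `fitness(mask)`."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)
    ]
    best = max(population, key=fitness)

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_population = [best[:]]  # elitism: the best mask always survives
        while len(next_population) < pop_size:
            parent1, parent2 = tournament(), tournament()
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)
                child = parent1[:cut] + parent2[cut:]
            else:
                child = parent1[:]
            # Bit-flip mutation, applied independently to every gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best


def toy_fitness(mask):
    """Hypothetical fitness: features 0 and 2 are useful, every feature has a cost."""
    return 2 * (mask[0] + mask[2]) - sum(mask)


selected = ga_select_features(6, toy_fitness, seed=3)
```

Because of elitism, the fitness of the returned mask never falls below that of the best mask in the initial random population; the penalty term in the toy fitness plays the role of the pressure toward smaller feature sets.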

In the last decade, the use of multiple classifier systems has been proposed in order to optimize the performance of CAD systems. A set of classifiers whose individual predictions are fused through a combining strategy, usually a voting scheme, to classify new examples constitutes an ensemble of classifiers (EC) [14]. The attraction that this topic exerts on machine learning and diagnostic decision support research is based on the premise that ECs are often much more accurate than any individual classifier of the set [15], [16]. Early diagnosis of melanoma has been facilitated by combining three types of classifiers, namely linear discriminant analysis (LDA), k-nearest neighbor (k-NN), and a decision tree, with a voting scheme [17]. A multiple classification system based on a committee of NNs, trained by the Levenberg–Marquardt algorithm, along with a voting scheme across the NN outputs has been used by Jerebko et al. [18] for the detection of colonic polyps in CT colonography data. Furthermore, a novel system for diabetes diagnosis has been proposed [19], which is based on fractal characteristics of retinal images and applies a voting scheme across the outputs of an EC consisting of a back-propagation trained NN, a radial basis function NN, and a GA-based classifier. The use of texture features and shape parameters along with a multi-classifier modular architecture composed of a self-organizing map (SOM) and/or k-NN classifiers has been recently proposed by Christodoulou et al. [15], aiming at the characterization of carotid plaques for the identification of individuals with asymptomatic carotid stenosis at risk of stroke.
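The two kinds of voting scheme mentioned in this paper, plurality and weighted voting, can be sketched as below. How the weights are obtained is not stated in this section, so using something like each classifier's validation accuracy as its weight is an assumption, as are the function names and the tie-breaking rule (the label proposed by the earliest classifier wins).

```python
from collections import Counter, defaultdict


def plurality_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    counts = Counter(predictions)
    top = max(counts.values())
    # Ties are resolved in favor of the earliest classifier in the list.
    for label in predictions:
        if counts[label] == top:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest sum of classifier weights."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per classifier is required")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    top = max(scores.values())
    for label in predictions:
        if scores[label] == top:
            return label


# Hypothetical outputs of five primary classifiers for one ROI.
outputs = ["C3", "C1", "C3", "C4", "C1"]
# Hypothetical weights, e.g. validation accuracies of the five classifiers.
weights = [0.5, 0.9, 0.4, 0.5, 0.8]

plurality_decision = plurality_vote(outputs)
weighted_decision = weighted_vote(outputs, weights)
```

In this toy case the two schemes disagree: plurality voting sees a two-to-two tie and falls back on the first classifier, whereas weighted voting favors the class backed by the more reliable classifiers.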

Although previous studies [20], [21] have shown that CAD systems based on various texture features and ECs can enhance the diagnosis efficiency of CT focal liver lesions, the evaluation of the proposed methods on small-sized samples constitutes a significant drawback for these studies. To address this drawback, re-sampling methods like cross-validation, jack-knife, and bootstrap can be applied [22], [23], [24]. In Ref. [18], cross-validation has been applied for sensitivity estimation of a colonic polyp detection system, while in Ref. [25] the bootstrap method has been used for the development of a diagnosis system able to differentiate benign and malignant tumors from breast ultrasound images. The bootstrap method, which was introduced by Efron [22] as an approach to calculate confidence intervals for parameters where standard methods cannot be applied, is based on re-sampling with replacement. The comparative assessment of various re-sampling methods has shown that the bootstrap method provides less biased and more consistent results than the jack-knife method [26].
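Re-sampling with replacement can be sketched in a few lines. The version below draws as many indices as there are samples for the training set and uses the samples that were never drawn (the out-of-bag samples) for testing. This split is a common convention and an assumption here; it is not necessarily how the training, validation, and testing sets of this study were formed, and the function names are hypothetical.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap training set and its out-of-bag test set (as indices)."""
    # Sampling with replacement: the same index may appear several times.
    train = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(train)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return train, out_of_bag


def bootstrap_groups(n_samples, n_groups=50, seed=0):
    """Generate `n_groups` independent bootstrap splits."""
    rng = random.Random(seed)
    return [bootstrap_split(n_samples, rng) for _ in range(n_groups)]


# 30 is an arbitrary data set size chosen for the example.
groups = bootstrap_groups(30, n_groups=50, seed=1)
```

Repeating the training and evaluation over all groups yields a distribution of accuracies rather than a single figure, which is what makes the estimate usable on a small data set.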

The principal aim of the present paper is to assess the potential of ECs in the development of a CAD system able to discriminate four hepatic tissue types (normal liver, hepatic cyst, hemangioma, and hepatocellular carcinoma) from non-enhanced CT images. Furthermore, the use of a variety of texture features as input to the CAD system is examined while the application of feature selection based on a GA is investigated aiming at improving the resulting classification performance. In order to overcome problems with small data sets and biased classification performances, the bootstrap method is applied. In this framework, five different CAD architectures based on the above design concepts are comparatively assessed.

The rest of the paper is organized as follows: in Section 2, the generic system design concepts are presented, including description of the data used, the methodology of feature extraction and selection, the ECs and the applied voting schemes, as well as the five alternative architectures of the CAD system. In Section 3, the experimental results of the five CAD system architectures are presented and compared, followed by conclusions presented in Section 4.

Methodology

The generic design of a CAD system aiming at the classification of CT liver tissue into one of the four classes: normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4), is presented in Fig. 1. Firstly, regions of interest (ROIs) drawn by an experienced radiologist on CT images were driven to a feature extraction module, where five different texture feature sets were obtained using first order statistics (FOS), spatial gray level dependence matrices (SGLDM),

Results and discussion

The classification accuracies achieved by CAD1, …, CAD5 using the 50 groups of sets (training, validation and testing set), obtained through bootstrap re-sampling, were estimated. For each architecture, the results are presented in terms of mean values and standard deviations of the classification accuracy of the primary classifiers, as well as the total CAD system performance with use of either plurality or weighted voting scheme.
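The reporting described above, a mean and a standard deviation of the classification accuracy over the re-sampled groups, amounts to the following computation. Whether the study used the sample or the population standard deviation is not stated here; the sample version is assumed, and the function names and numbers are purely illustrative.

```python
from statistics import mean, stdev


def accuracy(y_true, y_pred):
    """Fraction of ROIs whose predicted class matches the reference class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize(accuracies):
    """Mean and sample standard deviation of accuracies over re-sampled groups."""
    return mean(accuracies), stdev(accuracies)


# Toy values, not results from the study.
single = accuracy(["C1", "C2", "C3", "C4"], ["C1", "C2", "C4", "C4"])
mean_acc, std_acc = summarize([0.8, 0.9])
```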

The results for CAD1 and CAD2 are presented in Table 4. It can

Conclusion

The aim of the present paper was to define a CAD system architecture able to accurately classify hepatic tissue from non-enhanced CT images as normal liver, hepatic cyst, hemangioma, and hepatocellular carcinoma.

The system design was based on the use of texture features, feature selection techniques, and ECs. For each CT liver ROI, five types of texture feature sets, based on first order statistics, spatial gray level dependence matrices, gray level difference matrices, Laws’ texture energy

References (40)

H.M. Taylor et al. Hepatic imaging: an overview. Radiol Clin North Am (1998)
H. Sujana et al. Application of artificial neural networks for the classification of liver lesions by image texture parameters. Ultrasound Med Biol (1996)
W. Siedlecki et al. A note on genetic algorithms for large scale feature selection. Pattern Recog Lett (1989)
H. Handels et al. Feature selection for optimized skin tumor recognition using genetic algorithms. Artif Intell Med (1999)
S.M. Yamany et al. Application of neural networks and genetic algorithms in the classification of endothelial cells. Pattern Recog Lett (1997)
A. Sboner et al. A multiple classifier system for early melanoma diagnosis. Artif Intell Med (2003)
A.K. Jerebko et al. Multiple neural network classification scheme for detection of colonic polyps in CT colonography data sets. Acad Radiol (2003)
D.-R. Chen et al. Use of the bootstrap technique with small training sets for computer-aided diagnosis in breast ultrasound. Ultrasound Med Biol (2002)
Y. Wu et al. Improved k-nearest neighbor classification. Pattern Recog (2002)
A. Verikas et al. Soft combination of neural classifiers: a comparative study. Pattern Recog Lett (1999)
Y.M. Kadah et al. Classification algorithms for quantitative tissue characterization of diffuse liver disease from ultrasound images. IEEE Trans Med Imaging (1996)
Y.N. Sun et al. Ultrasonic image analysis for liver diagnosis. IEEE Eng Med Biol (1996)
Ch.-M. Wu et al. Texture features for classification of ultrasonic liver images. IEEE Trans Med Imaging (1992)
A.H. Mir et al. Texture analysis of CT images. IEEE Eng Med Biol (1995)
E.L. Chen et al. An automatic diagnostic system for CT liver image classification. IEEE Trans Biomed Eng (1998)
M. Gletsos et al. A computer-aided diagnostic system to characterize CT focal liver lesions: design and optimization of a neural network classifier. IEEE Trans Inf Technol B (2003)
A.K. Jain et al. Statistical pattern recognition: a review. IEEE Trans Pattern Anal (2000)
A.P. Dhawan et al. Analysis of mammographic microcalcifications using gray-level image structure features. IEEE Trans Med Imaging (1996)
F. Roli et al. Methods for designing multiple classifier systems
C.I. Christodoulou et al. Texture-based classification of atherosclerotic carotid plaques. IEEE Trans Med Imaging (2003)

¹: Tel.: +30 210 772 2968; fax: +30 210 772 3557.

²: Tel.: +30 210 722 7488; fax: +30 210 729 2280.
