New features for automatic classification of human chromosomes: A feasibility study

doi:10.1016/j.patrec.2005.06.011

Pattern Recognition Letters

Volume 27, Issue 1, 1 January 2006, Pages 19-28


Abstract

Karyotyping, a standard method for presenting pictures of the human chromosomes for diagnostic purposes, is a long-standing yet common technique in cytogenetics. Automating the chromosome classification process is the first step in designing an automatic karyotyping system. The main aim of this study was to define a new group of features for better representation and classification of chromosomes. The proposed features are the width, position and average intensity of the two most eye-catching regions of each chromosome, which we call characteristic bands. The concept of a characteristic band is based on the method expert cytogeneticists use to classify chromosomes. The length, centromeric index (CI) and an index of overall darkness or brightness of the image (NAGD) were also included in the final nine-dimensional feature vector describing each chromosome. To automatically find the characteristic bands and calculate the new features, different windows in the chromosome’s density profile were scored based on their intensity and width. As a feasibility study, our work focused on classification of the chromosomes in group E. Three-layer artificial neural networks were employed to assign each chromosome to one of three possible classes (chromosomes 16, 17 and 18). In the best case, up to 98.6% of chromosomes were classified correctly. In particular, a six-dimensional subset of the features showed reproducibly high performance in classification experiments. The results of this feasibility study show that new features inspired by the human expert’s classification method can potentially improve the accuracy of karyotyping systems.

Introduction

Many genetic disorders, and possible abnormalities that may occur in future generations, can be predicted by analyzing the shape and morphological characteristics of the chromosomes. In addition to well-known genetic abnormalities such as aneuploidy (an improper number of chromosomes), translocation, and deletion, some fatal pathological conditions such as leukemia are also correlated with chromosome defects (Hong, 2000). A karyotype, a standard table presenting pictures of the 46 human chromosomes obtained from a single cell either by drawing or by photography using a light microscope (Hong, 2000), is often used by a specialist to analyze the shape and morphological characteristics of the chromosomes for diagnostic purposes.

To develop a karyotype, a cell is photographed under a light microscope during the metaphase stage (one of the four stages of cell division). Laboratory staining techniques applied to the samples create a unique band pattern for each chromosome. A band is a region along the chromosome axis whose intensity differs from that of its adjacent regions. In the next step, each of the chromosomes (22 autosomal pairs and a pair of sex chromosomes) must be identified. This process is usually carried out manually by expert clinicians who view the pictures, identify the chromosomes, and cut and place them in their specified locations in the karyotype.

Despite the development of banding techniques, karyotyping is still a difficult and time-consuming task that must be done by an experienced operator or a cytogenetic expert. The tedious nature of manual karyotyping has encouraged many computer vision and medical image processing researchers to investigate automatic or semi-automatic karyotyping techniques over the last three decades (Carothers and Piper, 1994). However, automatic karyotyping is still considered a difficult task, mainly because the non-rigid nature of the chromosomes causes shape variability and gives them unpredictable appearances within the pictures.

Chromosome classification can be viewed as a pattern recognition problem in which the aim is to assign each chromosome to one of the 24 possible classes. The feature vector commonly used to describe a chromosome includes the length, the centromeric index (the ratio of the short arm of the chromosome to its long arm, which are separated by the narrowest part of the chromosome, known as the centromere), and the density profile, a one-dimensional vector obtained by sampling the intensity of the chromosome along its longitudinal axis (Carothers and Piper, 1994, Lerner et al., 1995, Sweeney and Becker, 1997, Shin and Pu, 1990). In some studies, a reduced version of the density profile (Lerner et al., 1995) or features extracted from its Fourier or wavelet transform (Sweeney and Becker, 1997) have been used. The use of the wavelet packet transform to extract features that represent the chromosome shape has recently been reported (Guimaraes et al., 2003). The resulting feature vector is then passed to a classifier such as a Bayesian classifier (Carothers and Piper, 1994, Qiang and Castleman, 2000), a neural network classifier (Cho, 2000, Lerner et al., 1995, Sweeney and Becker, 1997, Lerner, 1998, Graham et al., 1992), a fuzzy classifier (Vanderheydt et al., 1980), or a nearest neighbor classifier (Groen et al., 1989).
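As a minimal sketch of the conventional geometric descriptors, the following Python locates the centromere as the narrowest interior point of a width profile and computes the centromeric index as defined above (short arm over long arm). The function names, the width-profile input and the sample values are hypothetical and are not taken from the paper; note that other sources normalize the short arm by the total length instead.

```python
def find_centromere(width_profile):
    """Index of the narrowest interior point of a width profile.

    `width_profile` (an assumed input) holds the chromosome width measured
    at each sample along the longitudinal axis. The two end samples are
    skipped because the tips of a chromosome are always narrow.
    """
    interior = range(1, len(width_profile) - 1)
    return min(interior, key=lambda i: width_profile[i])


def centromeric_index(length, centromere_pos):
    """Ratio of the short arm to the long arm, as defined in the text."""
    if not 0 < centromere_pos < length:
        raise ValueError("centromere must lie strictly inside the chromosome")
    arm_a = centromere_pos
    arm_b = length - centromere_pos
    short_arm, long_arm = min(arm_a, arm_b), max(arm_a, arm_b)
    return short_arm / long_arm


# Hypothetical width profile with a constriction at index 3.
widths = [4, 6, 7, 3, 6, 8, 8, 7, 6, 4]
position = find_centromere(widths)
ci = centromeric_index(len(widths), position)
```

With this sample, the constriction splits the ten-sample axis into arms of 3 and 7 samples, so the index is 3/7.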

Although the results reported in these studies are encouraging, the karyotyping process in daily laboratory routine still requires human interaction. A human expert can identify each chromosome in the picture using a hierarchical chromosome identification and classification method. The expert first uses geometric and morphologic features, such as the length of the chromosomes, to classify them into a small number of groups. Then, by applying simple rules based on the location of the centromere and on the location and width of the characteristic bands and their position relative to the centromere and/or to each other, the expert can effectively recognize and identify each chromosome. The concept of a characteristic band is very important in this process. Based on the survey conducted in this research, the importance of a band depends mainly on the following three factors:

(1) Width of the band.
(2) Intensity of the band.
(3) Relative position of the band.

If a wide and dark band is repeated in the same position for the same chromosome in different images, it is considered a characteristic band. In this study we have defined a set of features that describe the important characteristic bands of each chromosome. These features include the width, position and average intensity of the most noticeable characteristic bands of the chromosomes. For quick reference, these features are named (db1W) and (db2W) for the Width of the first and the second dark bands, respectively, (db1P) and (db2P) for the Position of the first and the second dark bands, and (db1I) and (db2I) for the gray level Intensity of the first and the second dark bands (Fig. 1).
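The band features described above can be illustrated with a short Python sketch. It is not the authors' procedure: the fixed darkness threshold, the score (width times darkness), the normalization by profile length, and the ordering of the two bands along the axis are all assumptions made for illustration, as are the function names and the sample profile.

```python
def dark_runs(profile, threshold):
    """Maximal runs of samples darker than `threshold`, as (start, end) pairs.

    `profile` holds gray levels along the chromosome axis; lower values are
    assumed to be darker. `end` is exclusive.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def band_features(profile, threshold, n_bands=2):
    """Width, position and mean intensity of the highest-scoring dark bands."""
    n = len(profile)
    scored = []
    for start, end in dark_runs(profile, threshold):
        width = end - start
        mean_intensity = sum(profile[start:end]) / width
        # Hypothetical score: wider and darker bands rank higher.
        score = width * (threshold - mean_intensity)
        scored.append((score, start, end, mean_intensity))
    scored.sort(reverse=True)
    # Keep the best n_bands, then order them along the axis (an assumption).
    top = sorted(scored[:n_bands], key=lambda item: item[1])
    return [
        {
            "W": (end - start) / n,      # width relative to profile length
            "P": (start + end) / 2 / n,  # band centre relative to profile length
            "I": mean_intensity,         # average gray level inside the band
        }
        for _, start, end, mean_intensity in top
    ]


# Hypothetical 16-sample density profile with three dark regions.
profile = [200, 190, 60, 50, 55, 180, 200, 120, 110, 190, 30, 40, 35, 45, 200, 210]
db1, db2 = band_features(profile, threshold=128)
```

In this sample the narrow, only slightly dark region at samples 7 and 8 scores lowest and is discarded, which leaves the two wide, dark regions as the bands described by `db1` and `db2`.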

As a feasibility study, our work has been limited to the chromosomes in group E. Since the chromosomes in this group have very similar lengths, the intensity-based features are more important in their classification. The number of characteristic bands (two in our case) was chosen in consultation with the experts and by analyzing the ideograms of the chromosomes in group E (the standard ideogram of chromosome 16 is shown in Fig. 2). Table 1 summarizes the typical values of the proposed features for chromosomes 16, 17 and 18, calculated by an expert using the standard ideograms (Fig. 2). The intensity-based features (db1I, db2I) are not included in this table, because ideograms do not suggest typical values for them.

This study aims to simulate the human expert’s knowledge and to design a robust chromosome identification and classification algorithm. We used a medial axis transformation (MAT) based technique to extract the density profile of the chromosomes and a wavelet-based denoising method to identify the characteristic bands. Multilayer perceptron networks are used for classification. The results confirm the efficiency of the new set of features. The rest of the paper is organized as follows. Section 2 describes the dataset used to test the proposed algorithm. Section 3 describes the feature extraction process, including automatic extraction of the density profile of the chromosomes and the tuning process used to automatically extract features that mimic human expert knowledge. Section 4 discusses the classification method and presents the results of applying the proposed algorithm to the dataset. Finally, Section 5 concludes the paper.
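To make the idea of wavelet-based denoising of a density profile concrete, here is a minimal Python sketch using a single-level Haar transform with soft thresholding of the detail coefficients. The choice of the Haar wavelet, the single decomposition level, the threshold value and all function names are assumptions; the text above does not state which wavelet or thresholding rule was used.

```python
import math

_S = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * _S for a, b in pairs]
    detail = [(a - b) * _S for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, t):
    """Suppress small sample-to-sample fluctuations in a density profile."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))


# Hypothetical noisy profile: a bright region followed by a dark band.
noisy = [100, 104, 100, 96, 40, 44, 40, 36]
smooth = denoise(noisy, t=5.0)
```

Here every detail coefficient falls below the threshold, so each pair of samples is replaced by its mean while the large step between the bright and dark regions is kept. A smoothed profile of this kind is easier to segment into bands than the raw one.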

Section snippets

The dataset

The images used in this study were produced in the Cytogenetic Laboratory of Cancer Institute, Imam Hospital, Tehran, Iran. The images were acquired by a conventional photography system using a light microscope (Leitz, ortholux) with a magnification factor of 100×. The chromosomes were segmented from the pictures by an expert in the Cytogenetic Laboratory and then scanned by a scanner (Microtek, ScanPlus 6) with a resolution of 300 dpi. The gray scale resolution of the resulting digitized

Chromosomes in the feature domain

Conventionally, a chromosome is described by its length, its centromeric index (CI) and its density profile. In this work, length and CI are used together with the new features developed based on the human expert classification method. The process of feature extraction will be discussed in this section. Due to the non-homogeneous illuminating conditions in the microscopic images, an intensity normalizing procedure is necessary before the calculation of any feature depending on the intensity of

Classification method and results

The use of artificial neural networks (ANNs) for classification of chromosomes has been studied intensively. Lerner (1998) has suggested that ANNs are the best chromosome classifiers, especially when the number of classes is small. As the number of classes increases, the efficiency of the Bayes piecewise linear classifier approaches that of the ANN-based classifier. In the present study, the number of classes was limited to three. Therefore, an ANN was employed for classification. Three
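The shape of such a classifier can be sketched in a few lines of Python: a network with an input layer, one hidden layer and an output layer that maps a nine-dimensional feature vector to three class scores (chromosomes 16, 17 and 18). Only the input and output sizes come from the text; the hidden-layer size, the tanh and softmax activations, the random untrained weights and the placeholder feature values are assumptions, and training is omitted entirely.

```python
import math
import random

CLASSES = (16, 17, 18)  # group E chromosomes


def dense(x, weights, biases):
    """Affine layer: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input -> tanh hidden layer -> softmax output."""
    hidden = [math.tanh(v) for v in dense(x, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


def random_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
w_h, b_h = random_layer(5, 9, rng)  # 9 features -> 5 hidden units (assumed size)
w_o, b_o = random_layer(3, 5, rng)  # 5 hidden units -> 3 classes

features = [0.5] * 9                # placeholder nine-dimensional feature vector
probs = mlp_forward(features, w_h, b_h, w_o, b_o)
predicted = CLASSES[probs.index(max(probs))]
```

Because the weights are untrained, the predicted class here is meaningless; the sketch only shows how a feature vector flows through the layers to a three-way decision.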

Discussions and conclusions

Automatic human chromosome classification is one of the most widely investigated stages of the karyotyping process (Lerner, 1998). Over the past few years, several classification methods have been developed and tested for this purpose. Most of these classifiers have two main flaws (Groen et al., 1989, Piper and Granum, 1989): poor performance compared to the human expert (70–80% compared to 99.7%) and the requirement for an operator interaction to correct the misclassifications. The main source

Acknowledgements

The authors would like to thank Dr. S.R. Ghaffari and Ms. F. Farzanfar from the Cytogenetic Laboratory of the Cancer Institute of Imam Hospital in Tehran, Iran for their help in providing images and useful comments.

References (20)

Groen, F.C.A., et al., 1989. Human chromosome classification based on local band descriptors. Pattern Recognition Letters.
Lerner, B., et al., 1995. Medial axis transform-based features and a neural network for human chromosome classification. Pattern Recognition.
Vanderheydt, L., et al., 1980. Design of graph-representation and a fuzzy-classifier for human chromosomes. Pattern Recognition.
Aldroubi, A., et al., 1996. Wavelets in Medicine and Biology.
Blum, H., 1967. A transformation for extracting new descriptors of the shape. In: Proceedings of the Symposium on...
Carothers, A., et al., 1994. Computer-aided classification of human chromosomes: A review. Statistics and Computing.
Cho, J.M., 2000. Chromosome classification using backpropagation neural networks. IEEE Engineering in Medicine and Biology.
Graham, J., et al., 1992. A neural network chromosome classifier. Journal of Radiation Research.
Guimaraes, L.V., Schuck, A., Elbern, A., 2003. Chromosome classification for karyotype composing applying shape...
Hong, L.M., 2000. Medical Cytogenetics.

There are more references available in the full text version of this article.

