New features for automatic classification of human chromosomes: A feasibility study

https://doi.org/10.1016/j.patrec.2005.06.011

Abstract

Karyotyping, a standard method for presenting pictures of the human chromosomes for diagnostic purposes, is a long-standing yet still common technique in cytogenetics. Automating the chromosome classification process is the first step in designing an automatic karyotyping system. The main aim of this study was to define a new group of features for better representation and classification of chromosomes. The proposed features are the width, position and average intensity of the two most eye-catching regions of each chromosome, which we call characteristic bands. The concept of a characteristic band is based on the method that expert cytogeneticists use to classify chromosomes. The length, centromeric index (CI) and an index of overall darkness or brightness of the image (NAGD) were also included in the final nine-dimensional feature vectors describing each chromosome. To automatically find the characteristic bands and calculate the new features, different windows in the chromosome’s density profile were scored based on their intensity and width. As a feasibility study, our work focused on the classification of chromosomes in group E. Three-layer artificial neural networks were employed to assign each chromosome to one of the three possible classes (chromosomes 16, 17 and 18). In the best case, up to 98.6% of chromosomes were classified correctly. In particular, a six-dimensional subset of the features showed reproducibly high performance in classification experiments. The results of this feasibility study show that new features inspired by the human expert’s classification method are potentially capable of improving the accuracy of karyotyping systems.

Introduction

Many genetic disorders, and abnormalities that may occur in future generations, can be predicted by analyzing the shape and morphological characteristics of the chromosomes. In addition to well-known genetic abnormalities such as aneuploidy (an improper number of chromosomes), translocation, and deletion, some fatal pathological conditions such as leukemia are also correlated with chromosome defects (Hong, 2000). A karyotype is a standard table presenting pictures of the 46 human chromosomes obtained from a single cell, either by drawing or by photography using a light microscope (Hong, 2000). A specialist often uses it to analyze the shape and morphological characteristics of the chromosomes for diagnostic purposes.

To develop a karyotype, a cell is photographed under a light microscope during the metaphase stage (one of the four stages of cell division). Laboratory staining techniques applied to the samples create a unique band pattern for each chromosome. A band is a region along the chromosome axis whose intensity is distinct from that of its adjacent regions. In the next step, each of the chromosomes (22 autosomal pairs and a pair of sex chromosomes) must be identified. This process is usually carried out manually by expert clinicians who view the pictures, identify the chromosomes, and cut and place them in their specified locations in the karyotype.

Despite the development of banding techniques, karyotyping is still a difficult and time-consuming task that must be done by an experienced operator or a cytogenetic expert. The tedious nature of manual karyotyping has encouraged many computer vision and medical image processing researchers to investigate automatic or semi-automatic karyotyping techniques over the last three decades (Carothers and Piper, 1994). However, automatic karyotyping is still considered a difficult task, mainly because the non-rigid nature of the chromosomes causes shape variability and gives them unpredictable appearances within the pictures.

Chromosome classification can be viewed as a pattern recognition problem, where the aim is to assign each chromosome to one of the 24 possible classes. The feature vector commonly used to describe a chromosome includes the length, the centromeric index (the ratio of the short arm of the chromosome to its long arm, which are separated by the narrowest part of the chromosome known as the centromere), and a one-dimensional vector obtained by intensity sampling of the chromosome along its longitudinal axis, which is known as the density profile (Carothers and Piper, 1994; Lerner et al., 1995; Sweeney and Becker, 1997; Shin and Pu, 1990). In some studies, a reduced version of the density profile (Lerner et al., 1995) or features extracted from its Fourier or wavelet transformation (Sweeney and Becker, 1997) have been used. The use of the wavelet packet transformation to extract features that represent the chromosome shape has recently been reported (Guimaraes et al., 2003). The resulting feature vector is then used with a classification method such as a Bayesian classifier (Carothers and Piper, 1994; Qiang and Castleman, 2000), a neural network classifier (Cho, 2000; Lerner et al., 1995; Sweeney and Becker, 1997; Lerner, 1998; Graham et al., 1992), a fuzzy classifier (Vanderheydt et al., 1980), or a nearest neighbor classifier (Groen et al., 1989).

Although the results reported in these studies are encouraging, the karyotyping process in daily laboratory routine still requires human interaction. A human expert can identify each chromosome in the picture using a hierarchical identification and classification method. The expert first uses geometric and morphological features, such as the length of the chromosomes, to classify them into a small number of groups. Then, by applying simple rules based on the location of the centromere and on the location and width of the characteristic bands, including their position relative to the centromere and/or to each other, the expert can effectively recognize and identify each chromosome. The concept of a characteristic band is very important in this process. Based on the survey conducted in this research, the level of importance of a band depends mainly on the following three factors:

  • (1) Width of the band.
  • (2) Intensity of the band.
  • (3) Relative position of the band.

If a wide and dark band is repeated in the same position for the same chromosome in different images, it is considered a characteristic band. In this study we have defined a set of features that describe the important characteristic bands for each chromosome. These features include the width, position and average intensity of the most noticeable characteristic bands of the chromosomes. For quick reference, these features are named (db1W) and (db2W) for the Width of the first and the second dark bands, respectively, (db1P) and (db2P) for the Position of the first and the second dark bands, and (db1I) and (db2I) for the gray level Intensity of the first and the second dark bands (Fig. 1).
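As a rough illustration of how the six band features could be read from a density profile, the Python sketch below finds contiguous dark runs, scores each run, and reports the width, position and mean intensity of the two highest-scoring runs. It is not the authors' procedure. The names `Band`, `dark_runs` and `characteristic_bands`, the fixed gray-level threshold, the score (relative width times darkness) and the ordering of bands by score are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Band:
    width: float      # band width as a fraction of chromosome length (dbW)
    position: float   # band centre as a fraction of chromosome length (dbP)
    intensity: float  # mean gray level inside the band (dbI)


def dark_runs(profile: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of contiguous samples darker than threshold."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value < threshold and start is None:
            start = i                      # a dark run begins here
        elif value >= threshold and start is not None:
            runs.append((start, i))        # the run ended just before i
            start = None
    if start is not None:                  # run reaches the end of the profile
        runs.append((start, len(profile)))
    return runs


def characteristic_bands(profile: List[float], threshold: float = 100.0,
                         max_gray: float = 255.0, count: int = 2) -> List[Band]:
    """Pick the `count` dark runs with the highest (assumed) width * darkness score."""
    n = len(profile)
    scored = []
    for start, end in dark_runs(profile, threshold):
        segment = profile[start:end]
        mean = sum(segment) / len(segment)
        width = (end - start) / n
        darkness = 1.0 - mean / max_gray   # 1.0 means fully black
        score = width * darkness           # hypothetical scoring rule
        scored.append((score, Band(width, (start + end) / 2 / n, mean)))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [band for _, band in scored[:count]]


if __name__ == "__main__":
    # Synthetic profile: low values are dark, high values are bright.
    demo = [200] * 2 + [50] * 4 + [200] * 6 + [80] * 2 + [200] * 6
    for band in characteristic_bands(demo):
        print(band)
```

Expressing width and position as fractions of the profile length keeps the features comparable between chromosomes of different lengths, which is one plausible reason to normalize them; the excerpt does not state which normalization was actually used.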

As a feasibility study, our work has been limited to the chromosomes in group E. Since the chromosomes in this group have very similar lengths, the intensity-based features are more important in their classification. The number of characteristic bands (two in our case) was chosen after consultation with the experts and analysis of the ideograms of the chromosomes in group E (the standard ideogram of chromosome 16 is shown in Fig. 2). Table 1 summarizes the typical values of the proposed features for chromosomes 16, 17 and 18, calculated by an expert using the standard ideograms (Fig. 2). The intensity-based features (db1I, db2I) are not included in this table, because ideograms do not suggest typical values for them.

This study aims to simulate the human expert’s knowledge and to design a robust chromosome identification and classification algorithm. We have used a medial axis transformation (MAT) based technique to extract the density profile of the chromosomes and a wavelet-based denoising method to identify the characteristic bands. Multilayer perceptron networks are used for classification. The results confirm the efficiency of the new set of features. The rest of the paper is organized as follows. Section 2 describes the dataset used in this study for testing the proposed algorithm. Section 3 describes the feature extraction process, including automatic extraction of the density profile of the chromosomes and the tuning process used to automatically extract the features mimicking human expert knowledge. Section 4 discusses the classification method and presents the results of applying the proposed algorithm to the dataset. Finally, Section 5 concludes the paper.
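To make the idea of wavelet-based denoising of a density profile concrete, the following Python sketch applies a single-level Haar transform, soft-thresholds the detail coefficients, and reconstructs the profile. The excerpt does not state which wavelet family, decomposition depth or threshold the authors used, so the Haar wavelet, the single level, the `threshold` parameter and the function names are assumptions chosen only to keep the example short.

```python
import math
from typing import List, Tuple

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal: List[float]) -> Tuple[List[float], List[float]]:
    """Single-level Haar transform: pairwise averages (approx) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx: List[float], detail: List[float]) -> List[float]:
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink a coefficient toward zero by `threshold`, keeping its sign."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def denoise(profile: List[float], threshold: float) -> List[float]:
    """Suppress small sample-to-sample fluctuations in an even-length profile."""
    if len(profile) % 2:
        raise ValueError("profile length must be even for a single Haar level")
    approx, detail = haar_forward(profile)
    detail = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, detail)


if __name__ == "__main__":
    noisy = [100.0, 104.0, 50.0, 50.0, 200.0, 190.0]
    print(denoise(noisy, threshold=5.0))
```

Small differences between neighbouring samples are flattened while larger transitions, such as band edges, are only reduced by the threshold, which is the property that makes band boundaries easier to locate after denoising.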

Section snippets

The dataset

The images used in this study were produced in the Cytogenetic Laboratory of Cancer Institute, Imam Hospital, Tehran, Iran. The images were acquired by a conventional photography system using a light microscope (Leitz, ortholux) with a magnification factor of 100×. The chromosomes were segmented from the pictures by an expert in the Cytogenetic Laboratory and then scanned by a scanner (Microtek, ScanPlus 6) with a resolution of 300 dpi. The gray scale resolution of the resulting digitized

Chromosomes in the feature domain

Conventionally, a chromosome is described by its length, its centromeric index (CI) and its density profile. In this work, length and CI are used together with the new features developed based on the human expert classification method. The process of feature extraction will be discussed in this section. Due to the non-homogeneous illuminating conditions in the microscopic images, an intensity normalizing procedure is necessary before the calculation of any feature depending on the intensity of

Classification method and results

The use of artificial neural networks (ANNs) for classification of chromosomes has been studied intensively in the past. Lerner (1998) has suggested that ANNs are the best chromosome classifiers, especially when the number of classes is small. As the number of classes increases, the efficiency of the Bayes piecewise linear classifier approaches that of the ANN-based classifier. In the present study, the number of classes was limited to three. Therefore, an ANN was employed for classification. Three

Discussions and conclusions

Automatic human chromosome classification is one of the most widely investigated stages of the karyotyping process (Lerner, 1998). Over the past few years, several classification methods have been developed and tested for this purpose. Most of these classifiers have two main flaws (Groen et al., 1989; Piper and Granum, 1989): poor performance compared to the human expert (70–80% compared to 99.7%) and the need for operator interaction to correct the misclassifications. The main source

Acknowledgements

The authors would like to thank Dr. S.R. Ghaffari and Ms. F. Farzanfar from the Cytogenetic Laboratory of the Cancer Institute of Imam Hospital in Tehran, Iran for their help in providing images and useful comments.

