Elsevier

Expert Systems with Applications

Volume 46, 15 March 2016, Pages 380-393

A cooperative semi-supervised fuzzy clustering framework for dental X-ray image segmentation

https://doi.org/10.1016/j.eswa.2015.11.001

Highlights

  • We concentrated on the dental X-ray image segmentation problem.

  • A new framework combining Otsu, FCM and semi-supervised fuzzy clustering was shown.

  • It was tested on real datasets from Hanoi Medical University in terms of accuracy.

  • The new framework has better performance than the relevant methods.

  • Suggestions on means and variances of the criteria of the new framework were made.

Abstract

Dental X-ray image segmentation (DXIS) is an indispensable step in practical dentistry for diagnosing periodontitis diseases from an X-ray image. It is regarded as one of the most important steps in analyzing dental images, because it yields the information needed by medical diagnosis support systems and other recognition tools. Specialized data mining methods for DXIS have been investigated to achieve high segmentation accuracy. However, traditional image processing and clustering algorithms often have difficulty determining parameters or the common boundaries of teeth samples. It has been shown that the performance of a clustering algorithm improves when additional information provided by users is attached to its inputs. In this paper, we propose a new cooperative scheme that applies semi-supervised fuzzy clustering algorithms to DXIS. Specifically, the Otsu method is used to remove the Background area from a dental X-ray image. Then, the FCM algorithm is used to remove the Dental Structure area from the result of the previous step. Finally, the Semi-supervised Entropy regularized Fuzzy Clustering algorithm (eSFCM) is used to clarify and improve the results, based on the optimal result of the previous clustering method. The proposed framework is evaluated on a real collection of dental X-ray image datasets from Hanoi Medical University, Vietnam. Experimental results show that the clustering quality of the cooperative framework is better than that of the relevant methods.

The findings of this paper are significant for research in medical science and expert systems. Medical diagnosis is often an experience-driven, case-based process that requires long practice on real patients. In many situations, young clinicians do not have the chance for such practice, so it is necessary to use a computerized medical diagnosis system that can simulate medical processes from previous real evidence. By learning from those cases, clinicians can improve their experience and their responses to later ones. From the viewpoint of expert systems, this paper makes use of knowledge-based algorithms in a practical application, which shows the advantages of such algorithms at the intersection of expert systems and medical informatics. The findings also suggest the most appropriate configuration of the algorithm and its parameters for this problem, which other researchers could reuse in similar applications. The usefulness and significance of this research are demonstrated within the scope of real-life applications.

Introduction

Dental X-ray image segmentation (DXIS) is an indispensable process in practical dentistry for diagnosis of periodontitis diseases. It has been said that DXIS is one of the most important and necessary steps to analyze X-ray dental images in order to get valuable information for medical diagnosis support systems and other recognition tools (Said, Fahmy, Nassar, & Ammar, 2004). In the view of medical systems, segmentation aims to determine isolated teeth or different parts such as stump, gums, etc. (Said, Nassar, Fahmy, & Ammar, 2006).

An X-ray dental image consists of three main parts (Scott, 1977) (Fig. 1a): (i) Teeth area: often has high grayscale values and is what we have to extract from the image; (ii) Dental structural area: has medium grayscale values and consists of gums, bone, and other periodontitis structure; (iii) Background area: has the smallest grayscale values of all and shows the background of a teeth structure. The structure of X-ray dental images makes the segmentation more complicated than traditional image segmentation (Zhou & Abdel-Mottaleb, 2005). In other words, the connection between the various parts of an X-ray dental image, together with the low quality of the image due to noise, low contrast, scanning errors, etc., degrades the segmentation performance. For instance, the blank holes left by missing teeth in Fig. 1b cannot be processed by conventional image thresholding techniques (Kondo, Ong, & Foong, 2004). Thus, specialized data mining methods for DXIS have been investigated to achieve high accuracy of segmentation (Nomir & Abdel-Mottaleb, 2005).

A classification of various techniques for DXIS was introduced in (Rad, Rahim, & Norouzi, 2014). In the classification tree in Fig. 2, thresholding is the simplest technique among all. It divides histogram of an image into two separate regions according to a threshold T. The classification is performed by assigning a label to each pixel: either “Main part” (Teeth and dental structure areas) or “Background part”. A typical thresholding method is Otsu (Otsu, 1975). Bhandari, Singh, Kumar, and Singh (2014) employed cuckoo search algorithm and wind driven optimization for multilevel thresholding using Kapur's entropy. This algorithm was used to obtain the best solution or best fitness value from the initial random threshold values, and to evaluate the quality of a solution, correlation function is used. Ayala, dos Santos, Mariani, and dos Santos Coelho (2015) proposed a beta differential evolution (BDE) algorithm for determining the n − 1 optimal n-level threshold on a given image using Otsu criterion. Bhandari, Kumar, and Singh (2015a) introduced a modified artificial bee colony (MABC) algorithm based satellite image segmentation using different objective functions to find the optimal multilevel thresholds. Bhandari, Kumar, and Singh (2015b) developed another technique for color image segmentation using Cuckoo Search algorithm supported by Tsallis entropy for multilevel thresholding toward the effective colored segmentation of satellite images. Oliva et al. (2015) proposed a new algorithm for multilevel segmentation based on the Electromagnetism-Like which is used to find the optimal threshold values by maximizing the Tsallis entropy. Zhou, Tian, Zhao, and Zhao (2015) presented a novel image segmentation algorithm that combines improved Firefly Algorithm with Two-Dimensional Otsu to solve the problems of time consuming, low accuracy and easy to produce false segmentation image. 
However, a major disadvantage of the thresholding group is the difficulty of determining the threshold T for detecting the main part of a dental image, especially on noisy images (Xu, Xu, Jin, & Song, 2011). Even though evolutionary algorithms have been used for this determination, the complexity of their computational models makes them hard to apply in real situations.

In boundary-based techniques such as the Level Set method (Rad, Mohd Rahim, Rehman, Altameem, & Saba, 2013), a surface is covered by a curve so that image characteristics such as edges and angles can be detected efficiently. Nonetheless, choosing an appropriate function to represent the curve remains an open question (Rad et al., 2014). Another type of method for DXIS is the region-based techniques. This group aims to split joined sub-images based on pre-defined conditions, which can be information about either edges or grayscales. A number of seed points are used in the division process to enforce the spatial constraints. The drawback of this group is the difficulty of determining the significant coefficient for dividing. Moreover, it works well only if pre-defined curve functions are given (Said et al., 2006). More detailed discussions of those groups of techniques can be found in (Li et al., 2006; Mahoor and Abdel-Mottaleb, 2005; Narkhede, 2013; Setarehdan et al., 2012; Shah et al., 2006). Even though image processing algorithms such as the thresholding, boundary-based and region-based methods achieve reasonable segmentation accuracy for DXIS, they often have trouble determining parameters or the common boundaries of teeth samples (Sujji, Lakshmi, & Jiji, 2013).

Clustering is an effective tool in dealing with issues related to quality (Nayak, Naik, & Behera, 2015). It divides a set of objects into clusters such that the pixels in one group are more similar to each other than to those in other groups. Several classes of clustering methods with different characteristics have been presented in the literature, but they are usually classified into two main groups. The first is hard clustering, in which each data point belongs to exactly one cluster. A typical algorithm in this group is K-means (Larose, 2014); an example is given in (Lin, Chen, Lee, & Liao, 2014), where a fast K-means algorithm based on a level histogram was proposed. The second is fuzzy clustering, which uses a membership matrix holding the membership degrees of data points to the different clusters, so that a data point may belong to more than one cluster. Fuzzy clustering methods like Fuzzy C-Means (FCM) (Bezdek, Ehrlich, & Full, 1984) are often used in pattern recognition problems, knowledge discovery from databases, risk assessment, and image segmentation (Chen et al., 2011, Son, 2014a, 2014b, Son, 2014c, Son, 2015, Son and Thong, 2015, Son et al., 2013, Son et al., 2012, Son et al., 2012, Son et al., 2014, Srivastava et al., 2013, Sujji et al., 2013, Thong and Son, 2014). In the context of image segmentation, an optimal-selection-based suppressed FCM with self-tuning non-local spatial information was presented (Zhao, Fan, & Liu, 2014). However, techniques of this type are sensitive to noise and to initialization, so they produce unexpected clustering results if a bad initialization is given (Nayak et al., 2015, Yin et al., 2006).

An observation in Maraziotis (2012) showed that performance of a clustering algorithm is enhanced when additional information provided by users is attached to inputs of the algorithm. In this case, a new type of clustering methods called semi-supervised fuzzy clustering is invoked. It is quite useful for determining parameters or common boundaries in image segmentation (Chen, Sun, Zhou, & Li, 2012). For instance, Portela, Cavalcanti, and Ren (2014) proposed a clustering based semi-supervised classifier for MR brain tissue segmentation that labels voxels clusters of an image slice and then uses statistics and class labels information of the resultant clusters to classify the remaining image slices by applying Gaussian Mixture Model (GMM). In a semi-supervised fuzzy clustering, additional information is used to guide, supervise and control the process of clustering. There are three basic types of additional information (Yin, Shu, & Huang, 2012):

  • Must-link and cannot-link constraints: a must-link constraint requires that two elements must belong to the same cluster, whereas a cannot-link constraint indicates two elements which are not in the same cluster (which must be in 2 different clusters);

  • Class labels of a part of data: a part of data is labeled and others are unlabeled;

  • A pre-defined membership matrix.

Studies on image segmentation by semi-supervised fuzzy clustering often use the membership matrix as the additional information. With this kind of additional information, Yasunori, Yukihiro, Makito, and Sadaaki (2009) proposed the Semi-Supervised Algorithm with Standard Fuzzy Clustering (SSSFC), a semi-supervised fuzzy clustering algorithm that mixes the membership function into the entire clustering process. Bouchachia and Pedrycz (2006) used the membership matrix for identifying $u_{kj}$ based on a given value $\tilde{u}_{ik}$. Their algorithm is called the Semi-Supervised Fuzzy C-Mean algorithm of Bouchachia and Pedrycz (SSFCMBP). Yin et al. (2012) proposed the Semi-supervised Entropy regularized Fuzzy Clustering algorithm (eSFCM), which integrates an entropy factor into the semi-supervised clustering algorithm and uses the additional value $\bar{u}_{kj}$ to increase clustering performance. This algorithm performs better than SSSFC and SSFCMBP, but it requires the membership matrix to be specified before the algorithm is run (Yin et al., 2012). In some applications this may be a challenge, since suitable values for the pre-defined membership matrix are not known. An overview of typical semi-supervised fuzzy clustering algorithms can be found in (Thong & Son, 2016).

Based on the above analyses, our motivation in this paper is to enhance the accuracy of segmentation in DXIS by means of a semi-supervised fuzzy clustering algorithm. Starting from the remark in Rad et al. (2013) that clustering methods could be integrated with other types of algorithms to improve performance, we consider the integration of the Otsu method, Fuzzy C-Means (FCM) and the eSFCM algorithm to remedy the limitations of the standalone methods. The contributions are as follows:

  • 1) Using the Otsu method to remove the Background area from an X-ray dental image. This method is fast and can efficiently separate the background from the main parts, so it is used in the pre-processing step of the new method;

  • 2) Using the FCM algorithm to remove the Dental Structure area from the results of the previous step. The resulting membership matrix is then used to initialize the subsequent semi-supervised fuzzy clustering algorithm. This solves the problem of current semi-supervised algorithms;

  • 3) Using the eSFCM algorithm to clarify and improve the results of Step 2, with the pre-defined membership matrix taken from the previous step.

By these contributions, the new algorithm named as eSFCM-Otsu would obtain more reliable and higher accuracy than other clustering methods. The proposed framework will be evaluated on a real collection of dental X-ray image datasets from Hanoi Medical University, Vietnam (Mathworks, 2015) in terms of clustering quality (Vendramin, Campello, & Hruschka, 2010). The findings would suggest the efficiency and advantages of the proposed approach for the DXIS problem. The structure of this paper is organized as follows: Section 2 presents the proposed framework for dental image segmentation based on semi-supervised fuzzy clustering. Experimental results on the performance of algorithms are shown in Section 3. Finally, Section 4 gives the conclusions and delineates further works.

In this section, the cooperative framework eSFCM-Otsu is first described in Section 2.1. Details of the Otsu method, Fuzzy C-Means and the eSFCM method are given in the following sub-sections. The last sub-section analyzes the method from a theoretical perspective.

In Fig. 3, we illustrate the cooperative framework eSFCM-Otsu as a flowchart. A given dental X-ray image is input to the framework together with some user-defined parameters: the number of clusters (C), the fuzzifier (m), the Otsu threshold (T) and the stopping threshold (ɛ). The values suggested for those parameters in the relevant articles are often C = 3 (see Scott, 1977), m = 2.0 and ɛ = 0.001 (see Bezdek et al., 1984), with T being the median value of the grayscale pixels (Otsu, 1975). Nonetheless, we can adjust the parameters for specific analyses.

Since an X-ray image may not contain the Background area, a pre-processing procedure is employed to check whether the Background area exists. It relies on test grayscale samples of the Background part: a focused window of the considered dental X-ray image is compared with the samples in the sample database by means of a similarity metric, and the Background area is deemed present if they are (nearly) identical. If so, the Otsu method is applied to remove the Background area from the image. This method is fast and can efficiently separate the background from the main parts, which is why it is used in this step. On the main areas extracted in the previous step, we then apply FCM to classify the Teeth and the Dental Structure areas.

The outcomes of this process are the final centers and the membership matrix $\bar{U}$. Since these outcomes are assumed to approximate the optimal results, we use $\bar{U}$ as the additional information for the semi-supervised fuzzy clustering in the next step. A small modification that guarantees the sum-row constraint of the semi-supervised problem is made by applying the minimum operator to each row of $\bar{U}$. Then, eSFCM uses $\bar{U}$ to clarify and improve the results in the next step of the framework. The last steps of the eSFCM-Otsu framework measure its performance through validity indices and give the final segmentation results.

The Otsu method converts an original image into a binary image. It was introduced in Otsu (1975) and was also used for X-ray image segmentation by Rad et al. (2014). An input image can be divided into 3 regions by density: the region with the lowest density corresponds to the background or soft area; the medium-density areas correspond to the bone; and the highest-density areas correspond to the teeth. However, in many images the density of the teeth is close to that of the bone, so preferably 2 regions (Background/Main parts) should be used in the Otsu method.

Otsu is a typical thresholding method; thresholding is the easiest and quickest class of pixel-based image processing techniques. There are many ways to obtain the threshold. The simplest is to partition the image into two regions based on a global threshold T. In this case, the Otsu method selects the threshold that minimizes the intra-class variance of the black and white pixels, and labels each pixel as image area ($r_0$) or background area ($r_1$) based on its gray value $f(x)$. In other words,

$$g(x)=\begin{cases} r_0 & \text{if } f(x)\ge T,\\ r_1 & \text{if } f(x)<T. \end{cases} \tag{1}$$

The result of the thresholding step is a binary image, which simplifies the image analysis in the next steps. Descriptions of the Otsu method are shown in Table 1.
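The two-class thresholding rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the classical exhaustive histogram search for the threshold (maximizing between-class variance, which is equivalent to minimizing intra-class variance), and it is not necessarily the procedure listed in Table 1. The function names `otsu_threshold` and `binarize` and the sample gray values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return T such that pixels with value >= T form the main part.

    `pixels` is a flat list of integer gray values in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_between = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this is not a valid split
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor); maximizing it
        # is equivalent to minimizing the intra-class variance.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_between:
            best_between, best_t = between, t
    # Class 0 holds values <= best_t, so the ">= T" rule uses best_t + 1.
    return best_t + 1


def binarize(pixels, T):
    """Label each pixel 1 (main part, f(x) >= T) or 0 (background)."""
    return [1 if p >= T else 0 for p in pixels]


pixels = [10, 12, 11, 200, 210, 205, 11, 208]  # made-up gray values
T = otsu_threshold(pixels)
mask = binarize(pixels, T)
```

Here the label 1 plays the role of $r_0$ (image area) and 0 the role of $r_1$ (background area).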

In the general case where the number of clusters C is greater than 2, multiple thresholds can be used to detect the clusters. Suppose that $T_1, T_2, \ldots, T_n$ are the thresholds, where the $T_i$ ($i = 1, 2, \ldots, n$) are equidistant points in the closed interval [min, max]. The value $f(x)$ of each pixel is computed as the average of the R, G and B values at that pixel. Then

$$\text{a pixel belongs to}\ \begin{cases} \text{cluster } 1 & \text{if } f(x)\le T_1,\\ \text{cluster } 2 & \text{if } T_1<f(x)\le T_2,\\ \quad\vdots & \\ \text{cluster } n & \text{if } T_{n-1}<f(x)\le T_n,\\ \text{cluster } n+1 & \text{if } f(x)>T_n. \end{cases} \tag{2}$$

Then, $u_{ij}$ of the membership matrix U equals 1 if pixel j belongs to cluster i and 0 otherwise.
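As an illustration of the multi-threshold rule and the resulting crisp membership matrix, the following Python sketch assigns each pixel to a cluster and fills U with zeros and ones. The helper names are hypothetical, and the choice of thresholds as interior points that split [min, max] into n + 1 equal parts is an assumption, since the text only states that the thresholds are equidistant.

```python
from bisect import bisect_left


def equidistant_thresholds(lo, hi, n):
    """Assumed layout: n interior points splitting [lo, hi] into n + 1 parts."""
    step = (hi - lo) / (n + 1)
    return [lo + i * step for i in range(1, n + 1)]


def gray_value(rgb):
    """f(x): the average of the R, G and B values of a pixel."""
    r, g, b = rgb
    return (r + g + b) / 3


def crisp_membership(rgb_pixels, thresholds):
    """Return U with U[i][j] = 1 if pixel j falls in cluster i, else 0.

    Cluster indices are 0-based here, so index 0 is "cluster 1" in the text.
    """
    n_clusters = len(thresholds) + 1
    U = [[0] * len(rgb_pixels) for _ in range(n_clusters)]
    for j, rgb in enumerate(rgb_pixels):
        # bisect_left counts the thresholds strictly below f(x), which
        # matches the rule T_{i-1} < f(x) <= T_i.
        i = bisect_left(thresholds, gray_value(rgb))
        U[i][j] = 1
    return U


thresholds = equidistant_thresholds(0, 255, 2)
pixels = [(0, 0, 0), (85, 85, 85), (100, 120, 110), (255, 255, 255)]
U = crisp_membership(pixels, thresholds)
```

Each column of U contains exactly one 1, which is the hard-clustering special case of the fuzzy membership matrix used later.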

Example 1

For the 9 × 9 image in Fig. 4, we apply the Otsu method with T(0) = 3. After 5 iterations, we obtain two partitions (two clusters) of the input image, corresponding to the values 0 and 1, as in Fig. 5.

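Fuzzy C-Means (FCM), proposed by Bezdek et al. (1984), is based on an iterative process that optimizes the membership matrix and the cluster centers. The objective function of FCM is defined as follows:

$$J=\sum_{k=1}^{N}\sum_{j=1}^{C}u_{kj}^{m}\,\lVert X_k-V_j\rVert^{2}\rightarrow\min, \tag{3}$$

$$\begin{cases} u_{kj}\in[0,1],\\ \sum_{j=1}^{C}u_{kj}=1, \end{cases}\qquad k=\overline{1,N};\ j=\overline{1,C}. \tag{4}$$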

In Eqs. (3) and (4), some terms are used as follows:

  • m is the fuzzifier;

  • C is the number of clusters;

  • N is the number of data elements;

  • r is the dimensionality of the data;

  • $u_{kj}$ is the membership degree of data element $X_k$ to cluster j;

  • $X_k\in R^{r}$ is the kth element of $X=\{X_1,X_2,\ldots,X_N\}$, which is the main part obtained from the Otsu method;

  • Vj is the center of cluster j.

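Using the Lagrange multiplier method, the cluster centers and the membership matrix are determined by Eqs. (5) and (6), respectively.

$$V_j=\frac{\sum_{k=1}^{N}u_{kj}^{m}X_k}{\sum_{k=1}^{N}u_{kj}^{m}}, \tag{5}$$

$$u_{kj}=\frac{1}{\sum_{i=1}^{C}\left(\dfrac{\lVert X_k-V_j\rVert^{2}}{\lVert X_k-V_i\rVert^{2}}\right)^{\frac{1}{m-1}}}. \tag{6}$$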

Descriptions of the FCM method are shown in Table 2.
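The alternating updates of Eqs. (5) and (6) can be sketched as follows. This is a minimal illustration for one-dimensional (grayscale) data in plain Python, not the procedure of Table 2: the function name `fcm`, the explicit initial centers, the stopping rule on the largest center shift, and the sample data are all assumptions made for the example.

```python
def fcm(data, centers, m=2.0, eps=1e-3, max_iter=100):
    """Fuzzy C-Means on 1-D data.

    data:    list of gray values X_k
    centers: initial cluster centers V_j (assumed to be given)
    Returns (U, centers) with U[k][j] the membership of X_k to cluster j.
    """
    C = len(centers)
    U = []
    for _ in range(max_iter):
        # Membership update, Eq. (6), using squared distances.
        U = []
        for x in data:
            d2 = [(x - v) ** 2 for v in centers]
            if 0.0 in d2:
                # The point coincides with a center: give it full membership
                # there (shared equally if several centers coincide).
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
                s = sum(row)
                row = [r / s for r in row]
            else:
                row = [
                    1.0 / sum((d2[j] / d2[i]) ** (1.0 / (m - 1.0))
                              for i in range(C))
                    for j in range(C)
                ]
            U.append(row)

        # Center update, Eq. (5): mean of the data weighted by u_kj^m.
        new_centers = []
        for j in range(C):
            w = [U[k][j] ** m for k in range(len(data))]
            new_centers.append(sum(wk * x for wk, x in zip(w, data)) / sum(w))

        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return U, centers


data = [10, 11, 12, 200, 205, 210]   # made-up gray values
U, V = fcm(data, centers=[0.0, 255.0])
```

Each row of the returned U sums to 1, as required by constraint (4); this matrix is the kind of output that the framework passes on to the semi-supervised step.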

A semi-supervised entropy regularized fuzzy clustering algorithm was proposed by Yasunori et al. (2009). Yin et al. (2012) then modified the entropy factor; the resulting algorithm, Semi-supervised Entropy regularized Fuzzy Clustering (eSFCM), uses the additional value $\bar{u}_{kj}$ to increase the clustering performance under the conditions:

$$\sum_{j=1}^{C}\bar{u}_{kj}\le 1;\quad u_{kj}\in[0,1];\quad k=\overline{1,N}. \tag{7}$$

The initial cluster centers are defined by formula (8).

$$\bar{v}_j=\frac{\sum_{k=1}^{N}u_{kj}^{2}X_k}{\sum_{k=1}^{N}u_{kj}^{2}};\quad j=1,\ldots,C. \tag{8}$$

The covariance matrix of the samples used in the Mahalanobis distance is calculated as follows.

$$P=\frac{1}{N}\sum_{j=1}^{C}\sum_{k=1}^{N}u_{kj}^{2}\,(x_k-\bar{v}_j)(x_k-\bar{v}_j)^{T}. \tag{9}$$

Then, the distance is calculated by:

$$d_A^{2}(x_1,x_2)=(x_1-x_2)^{T}A\,(x_1-x_2),\quad A=P^{-1}. \tag{10}$$

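Then, the objective function of eSFCM is defined as

$$J(U,V)=\sum_{k=1}^{N}\sum_{j=1}^{C}u_{kj}\,\lVert X_k-V_j\rVert_A^{2}+\lambda^{-1}\sum_{k=1}^{N}\sum_{j=1}^{C}\left(\lvert u_{kj}-\bar{u}_{kj}\rvert\ln\lvert u_{kj}-\bar{u}_{kj}\rvert\right)\rightarrow\min. \tag{11}$$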

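Solving problem (11) under constraint (4), we obtain the solutions:

$$u_{kj}=\bar{u}_{kj}+\frac{e^{-\lambda\lVert X_k-V_j\rVert_A^{2}}}{\sum_{i=1}^{C}e^{-\lambda\lVert X_k-V_i\rVert_A^{2}}}\left(1-\sum_{i=1}^{C}\bar{u}_{ki}\right),\quad k=\overline{1,N},\ j=\overline{1,C}, \tag{12}$$

$$V_j=\frac{\sum_{k=1}^{N}u_{kj}X_k}{\sum_{k=1}^{N}u_{kj}};\quad j=\overline{1,C}. \tag{13}$$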

Descriptions of the eSFCM method are shown in Table 3.
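The eSFCM iteration built from formulas (8) to (13) can be sketched for one-dimensional data as follows. This is a hedged illustration, not the procedure of Table 3. It assumes that the initial centers and the matrix P are computed from the supervised memberships $\bar{u}_{kj}$ (the only memberships available before the first iteration), that P reduces to a scalar in the 1-D case, and that every cluster has at least one nonzero supervised membership. The function name `esfcm`, the value of λ and the sample supervision matrix are hypothetical.

```python
import math


def esfcm(data, u_bar, lam=1.0, eps=1e-3, max_iter=100):
    """Semi-supervised entropy regularized fuzzy clustering on 1-D data.

    data:  list of gray values X_k
    u_bar: supervised memberships, each row summing to at most 1
    Returns (U, centers).
    """
    n, c = len(data), len(u_bar[0])

    # Formula (8): initial centers weighted by squared supervised memberships.
    centers = []
    for j in range(c):
        w = [u_bar[k][j] ** 2 for k in range(n)]
        centers.append(sum(wk * x for wk, x in zip(w, data)) / sum(w))

    # Formulas (9)-(10) in one dimension: P is a scalar and A = 1 / P.
    p = sum(u_bar[k][j] ** 2 * (data[k] - centers[j]) ** 2
            for j in range(c) for k in range(n)) / n
    a = 1.0 / p if p > 0 else 1.0  # fallback for degenerate data (assumption)

    u = [row[:] for row in u_bar]
    for _ in range(max_iter):
        # Formula (12): supervised part plus a softmax share of what is left.
        u = []
        for k, x in enumerate(data):
            d2 = [a * (x - v) ** 2 for v in centers]
            d_min = min(d2)  # shift exponents for numerical stability
            e = [math.exp(-lam * (d - d_min)) for d in d2]
            free = 1.0 - sum(u_bar[k])
            u.append([u_bar[k][j] + e[j] / sum(e) * free for j in range(c)])

        # Formula (13): centers as membership-weighted means.
        new_centers = [
            sum(u[k][j] * data[k] for k in range(n))
            / sum(u[k][j] for k in range(n))
            for j in range(c)
        ]
        shift = max(abs(v - w) for v, w in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    return u, centers


data = [10, 11, 12, 200, 205, 210]                 # made-up gray values
u_bar = [[0.5, 0.1]] * 3 + [[0.1, 0.5]] * 3        # hypothetical supervision
U, V = esfcm(data, u_bar)
```

Note that if a row of the supervised matrix already sums to 1, the factor $1-\sum_i \bar{u}_{ki}$ vanishes and formula (12) returns the supervised memberships unchanged, which is consistent with the framework adjusting the FCM output before passing it to eSFCM.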

The proposed framework eSFCM-Otsu has the following advantages in comparison with the relevant ones.

  • 1) eSFCM-Otsu relies on the semi-supervised fuzzy clustering algorithm eSFCM, which was shown to have better accuracy than other clustering methods in its group. Moreover, by combining additional techniques such as Otsu and FCM as pre-processing steps, eSFCM-Otsu is designed specifically for the segmentation of dental X-ray images. This both enhances the accuracy of segmentation and remedies the limitations of standalone methods raised in the Introduction section.

  • 2) Regarding additional information, the relevant semi-supervised fuzzy clustering methods such as eSFCM often require users to supply it, so their performance is somewhat unreliable. eSFCM-Otsu makes use of FCM to calculate the additional information automatically, in the form of a pre-defined membership matrix. Using the maximal operator, the most likely group of each data point is identified and used to orient the semi-supervised fuzzy clustering afterward. This reduces uncertain judgments and improves the stability of the results.

  • 3) The overall computational time of the framework is small in comparison with other works. For example, some thresholding methods in Bhandari et al. (2015a), Bhandari et al. (2015b), Oliva et al. (2015) and Zhou et al. (2015) used time-consuming techniques such as evolutionary algorithms, Tsallis entropy, Electromagnetism-Like, etc. to determine the threshold values. Other groups such as the Level Set method (Rad et al., 2013) require a complex algorithm to determine the surface. Unlike those methods, the framework uses the pre-defined membership matrix from FCM to achieve optimal results, so it can classify the image in a reasonable time.

  • 4) eSFCM-Otsu is the first attempt to use the semi-supervised fuzzy clustering approach for the DXIS problem, which makes it relevant to dental image processing.

  • 5) The proposed framework is straightforward and easy to implement.

However, the framework has some limitations:

  • 1) Dental features are not used in the clustering process. A clustering algorithm that employs spatial components in its objective function would be expected to produce more accurate results than one without them. This should be taken into account in the design of the algorithm.

  • 2) Parameter values for each dental X-ray image segmentation task have to be derived by experience. Even though we have stated the range of parameters, experiments should be repeated to verify them and to obtain the best results.

Setting

In this part, the experimental environments are described such as,

  • Experimental tools: the proposed algorithm – eSFCM-Otsu – has been implemented in addition to the standalone methods – FCM (Bezdek et al., 1984) and Otsu (Otsu, 1975) as well as other variants of the framework – that use different semi-supervised fuzzy clustering algorithms instead of eSFCM that are SSSFC (Yasunori et al., 2009) and SSFCMBP (Bouchachia & Pedrycz, 2006) in Matlab 2014 and executed on a PC VAIO laptop with Core i5

Conclusions

In this paper, we concentrated on dental X-ray image segmentation, with fuzzy clustering methodology as the main approach. The contribution of this work is a new cooperative framework that combines the Otsu threshold method, Fuzzy C-Means and semi-supervised fuzzy clustering (eSFCM). FCM classifies the Main part of a dental image into Teeth and Dental Structure areas. The achieved results are then rectified by means of eSFCM with a pre-defined membership matrix being taken from the optimal one of

Acknowledgment

The authors are greatly indebted to Prof. B. Lin and the anonymous reviewers for their comments and valuable suggestions, which improved the quality and clarity of the paper.

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2014.01.

References (49)

  • Nomir, O. et al. (2005). A system for human identification from X-ray dental radiographs. Pattern Recognition.

  • Oliva, D. et al. (2015). Improving segmentation velocity using an evolutionary method. Expert Systems with Applications.

  • Portela, N.M. et al. (2014). Semi-supervised clustering for MR brain image segmentation. Expert Systems with Applications.

  • Son, L.H. (2014). Enhancing clustering quality of geo-demographic analysis using context fuzzy clustering type-2 and particle swarm optimization. Applied Soft Computing.

  • Son, L.H. (2014). HU-FCF: a hybrid user-based fuzzy collaborative filtering method in recommender systems. Expert Systems with Applications.

  • Son, L.H. (2014). Optimizing municipal solid waste collection using chaotic particle swarm optimization in GIS based environments: a case study at Danang City, Vietnam. Expert Systems with Applications.

  • Son, L.H. (2015). DPFCM: a novel distributed picture fuzzy clustering method on picture fuzzy sets. Expert Systems with Applications.

  • Son, L.H. et al. (2012). A novel intuitionistic fuzzy clustering method for geo-demographic analysis. Expert Systems with Applications.

  • Son, L.H. et al. (2013). Spatial interaction – modification model and applications to geo-demographic analysis. Knowledge-Based Systems.

  • Son, L.H. et al. (2014). A lossless DEM compression for fast retrieval method using fuzzy clustering and MANFIS neural network. Engineering Applications of Artificial Intelligence.

  • Son, L.H. et al. (2015). Intuitionistic fuzzy recommender systems: an effective tool for medical diagnosis. Knowledge-Based Systems.

  • Xu, X. et al. (2011). Characteristic analysis of Otsu threshold and its applications. Pattern Recognition Letters.

  • Yin, X. et al. (2012). Semi-supervised fuzzy clustering with metric learning and entropy regularization. Knowledge-Based Systems.

  • Yin, Z. et al. (2006). Fuzzy clustering with novel separable criterion. Tsinghua Science & Technology.