Classification of laryngeal disorders based on shape and vascular defects of vocal folds

https://doi.org/10.1016/j.compbiomed.2015.02.001

Highlights

  • Vocal folds on videolaryngostroboscopy images are detected by HOG for examination.

  • Vocal fold images are classified into five laryngeal disorder types.

  • We exploit shape and vascular features of vocal folds for classification.

  • An average classification success rate of 81% is achieved.

  • We demonstrate that visible vessels of vocal folds can act as a prognostic marker.

Abstract

Vocal fold disorders such as laryngitis, vocal nodules, and vocal polyps may cause hoarseness and breathing and swallowing difficulties due to vocal fold malfunction. Although state-of-the-art medical imaging techniques help physicians obtain more detailed information, the difficulty of differentiating minor anomalies of vocal folds encourages physicians to research new strategies and technologies to aid the diagnostic process. Recent studies on vocal fold disorders note the potential role of the vascular structure of vocal folds in differential diagnosis of anomalies. However, standards for clinical usage of the blood vessels have not yet been well established due to the lack of objective and comprehensive evaluation of the vascular structure.

In this paper, we present a novel approach that categorizes vocal folds into healthy, nodule, polyp, sulcus vocalis, and laryngitis classes exploiting visible blood vessels on the superior surface of vocal folds and shapes of vocal fold edges by using image processing techniques and machine learning methods. We first detected the vocal folds on videolaryngostroboscopy images by using Histogram of Oriented Gradients (HOG) descriptors. Then we examined the shape of vocal fold edges in order to provide features such as size and splay portion of mass lesions. We developed a new vessel centerline extraction procedure that is specialized to the vascular structure of vocal folds. Extracted vessel centerlines were evaluated in order to get vascular features of vocal folds, such as the amount of vessels in the longitudinal and transverse form. During the last step, categorization of vocal folds was performed by a novel binary decision tree architecture, which evaluates features of the vocal fold edge shape and vascular structure.

The performance of the proposed system was evaluated by using laryngeal images of 70 patients. Sensitivities of 86%, 94%, 80%, 73%, and 76% were obtained for the healthy, polyp, nodule, laryngitis, and sulcus vocalis classes, respectively. These results indicate that visible vessels of vocal folds, as well as the vocal fold shape features, can act as a prognostic marker for vocal fold pathologies and may play a critical role in more effective diagnosis.

Introduction

Diagnosis of vocal fold disorders such as laryngitis, vocal nodules, and vocal polyps is based on examining structural defects of the vocal folds and vocal fold vibrations by medical imaging devices such as videolaryngostroboscopy and endoscopic high-speed cameras. However, subjective diagnosis is error-prone and may vary between different physicians examining the same patient because of the variety of vocal fold anomalies [1], [2]. Although the shape and vibration pattern of vocal folds, as well as voice signal and questionnaire data, give significant information about laryngeal disorders, the variety of vocal fold anomalies drives physicians to research new approaches to aid the diagnostic process [3].

There have been several medical research studies published in the last 30 years in which the presence of visible blood vessels on the superior surface of vocal folds has been associated with benign lesions such as nodules, polyps, and minimal structural alterations such as epidermoid cysts and vocal fold sulci [4], [5], [6], [7], [8]. One of the most comprehensive studies on this topic is reported by de Biase and Pontes [1]. They noted that the incidence of blood vessels is higher in sulcus vocalis, epidermoid cysts, and polyps than in the nodule and control groups. Increasing interest in the effects of the vocal fold anomalies on the anatomic structure of blood vessels leads us to consider exploiting vascular structure and shape features of vocal folds for specialized classification of laryngeal diseases.

There have been very few attempts at automated analysis of laryngeal images using visual characteristics of vocal folds. The clinical diagnosis of vocal fold paresis is based on examination of the rapidly moving vocal folds during phonation. Lohscheller et al. introduced a visualization method, namely Phonovibrography, for capturing the whole spatiotemporal pattern of activity [9]. In Voigt et al. [10], subjects were classified into healthy and vocal fold paresis classes exploiting Phonovibrography. On the other hand, examination of the shape defect of the vocal folds is required for diagnosis of organic lesions. In Ilgner et al. [11], manually marked suspect lesions are classified into healthy and diseased tissues exploiting textural features of laryngoscopy images. A larger set of laryngeal images has been used in Verikas et al. [12] for classification of vocal fold images into three decision classes, namely, nodular, diffuse, and healthy. They exploited color, texture, and geometric features extracted from an image of the patient’s vocal folds, voice signal, and questionnaire data. Their nodular class includes nodules, polyps, and cysts, whereas the diffuse class is composed of papillomata, hyperplastic laryngitis with keratosis, and carcinoma. The primary disadvantage of this study is the use of vocal fold images that can only be acquired during direct micro-laryngoscopy, which is a surgical procedure. There are several risks linked to the procedure, such as anesthesia, a sore or numb tongue, bleeding, and infection. A decision support system for diagnostics of laryngeal diseases can be developed by using different types of analyses of vocal folds. However, clinical usage of the system requires the use of images that are acquired during routine physical examination instead of surgery. In addition, only a limited number of vocal fold disorders can be classified by their design: vocal fold disorders that do not cause benign lesions, such as sulcus vocalis and laryngitis, are not examined.

Although the latest medical research indicates that there is a relationship between blood vessels and vocal fold disorders, no reported study in the literature classifies laryngeal diseases computationally by means of the vascular structure and shape defects of vocal folds. In our previous work [13], we extracted blood vessels on vocal folds and used transverse vessels for healthy-altered vocal fold classification. In Turkmen et al. [14], we exploited the orientation pattern of vessels to aid differential diagnosis of nodules and cysts.

In this study we propose a novel method that classifies laryngeal disorders by evaluating both vessel and shape features of vocal folds. The block diagram of the proposed system is shown in Fig. 1. The system is composed of four main steps:

  1. Detection of the vocal folds on videolaryngostroboscopy images exploiting HOG descriptors.

  2. Extraction of size, location, shape, and symmetry features of vocal fold mass lesions.

  3. Vessel centerline extraction and analysis of the orientation pattern of vessels.

  4. Classification of laryngeal disorders into healthy, sulcus vocalis, laryngitis, nodule, and polyp groups by using a new binary decision tree architecture.
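The orientation analysis in step 3 can be illustrated with a minimal sketch. It assumes that vessel centerlines are already available as polylines and that the long axis of the vocal fold is known. The function name `orientation_lengths`, the polyline input format, and the 45 degree threshold are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def orientation_lengths(centerlines, fold_axis, threshold_deg=45.0):
    """Sum centerline length running along vs. across the vocal fold axis.

    centerlines   : list of polylines, each a sequence of (x, y) points (assumed format)
    fold_axis     : (x, y) direction of the vocal fold's long axis
    threshold_deg : hypothetical cut-off between longitudinal and transverse
    """
    axis = np.asarray(fold_axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    longitudinal = 0.0
    transverse = 0.0
    for line in centerlines:
        points = np.asarray(line, dtype=float)
        segments = np.diff(points, axis=0)           # vectors between consecutive points
        lengths = np.linalg.norm(segments, axis=1)
        keep = lengths > 0                           # drop repeated points
        segments, lengths = segments[keep], lengths[keep]
        # Angle between each segment and the fold axis, folded into [0, 90] degrees
        cos_angle = np.abs(segments @ axis) / lengths
        angles = np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0)))
        longitudinal += lengths[angles < threshold_deg].sum()
        transverse += lengths[angles >= threshold_deg].sum()
    return longitudinal, transverse

# One vessel running along the fold and one running across it
vessels = [[(0, 0), (10, 0)], [(5, 0), (5, 4)]]
along, across = orientation_lengths(vessels, fold_axis=(1, 0))
```

The two totals could serve as simple "amount of longitudinal vessels" and "amount of transverse vessels" features; how the paper actually quantifies these amounts is not stated in this excerpt.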

The key contributions of this study are as follows:

  1. We propose a novel approach that segments vocal folds from surrounding tissue by using HOG descriptors. Since edge directions of vocal folds differ significantly from those of the surrounding laryngeal tissue, HOG is well suited to detecting vocal folds on videolaryngostroboscopy images.

  2. We present a novel glottal area segmentation method that enables segmentation of vocal folds even if the glottal area is divided into two parts by vocal fold pathologies.

  3. We present a novel vessel centerline extraction method which is designed by considering the challenges inflicted by vocal fold imaging techniques and the anatomic structure of vessels on the superior surface of vocal folds.

  4. Automatic analysis of vascular features gives physicians an opportunity for fast, convenient, and objective evaluation of laryngeal images.

  5. Classification of vocal fold pathologies based on vascular features reveals how vascular structure varies among different vocal fold pathologies. Therefore, it contributes to establishing the standards of clinical usage of vessels.

  6. Since structural defects of blood vessels can be a symptom of vocal fold disorders, evaluating vocal folds in terms of vascular structure enables early diagnosis.

  7. Evaluation of vascular structure enables classification not only of vocal fold mass lesions but also of minimal structural alterations such as sulcus vocalis.

  8. One output of the system is a set of measurements of the shape features of mass lesions, which physicians can also use to monitor the clinical condition of patients objectively.

Vocal fold image database

We have collected a vocal fold image database that consists of 70 videolaryngostroboscopy and direct microlaryngoscopy recordings. Images of vocal folds on which blood vessels can be clearly seen are selected manually for processing. Vocal fold disorders can be unilateral or bilateral. The distribution of subjects among recordings in the database and the total number of analyzed vocal folds of each disorder type are given in Table 1.

Sample vocal fold images from the dataset are given in Fig. 2.

Segmentation of vocal folds

Videolaryngostroboscopy images contain vocal folds with surrounding larynx tissue. Therefore, vocal folds have to be segmented from surrounding tissue and the glottal area for reliable analysis of vessel features. The segmentation algorithm is composed of three main parts: detection of vocal folds on laryngeal image, segmentation of glottis, and finally normalization of vocal fold images.
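To make the detection step more concrete, the following is a minimal sketch of a HOG-style descriptor written with NumPy only. It is a simplified illustration of the general technique (per-cell orientation histograms of gradient magnitude, without block normalization), not the authors' implementation. The function name `hog_descriptor`, the cell size of 8 pixels, and the 9 orientation bins are assumptions; the excerpt does not state the parameters or the detector that consumes the descriptors.

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Simplified HOG: one unsigned-orientation histogram per cell.

    image : 2-D grayscale array
    cell  : cell side length in pixels (assumed value)
    bins  : number of orientation bins over [0, 180) degrees (assumed value)
    """
    image = np.asarray(image, dtype=float)
    gy, gx = np.gradient(image)                      # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned gradient orientation
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by gradient magnitude
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    # L2-normalize each cell histogram so that contrast changes matter less
    norms = np.linalg.norm(hist, axis=2, keepdims=True)
    return (hist / (norms + 1e-9)).ravel()

# A horizontal intensity ramp has gradients pointing along x (orientation 0 degrees)
ramp = np.tile(np.arange(16, dtype=float), (16, 1))
descriptor = hog_descriptor(ramp)
```

In a detector, descriptors like this would typically be computed over sliding windows and passed to a trained classifier that separates vocal fold regions from surrounding tissue; that stage is omitted here because the excerpt does not describe it.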

Vocal fold edge features

The distortion of the vocal fold edge contour is highly discriminative for categorization of organic mass lesions. Evaluation of vocal fold lesions is done by analyzing the vocal fold edges that stand out from the vocal fold edge baselines (which indicate the edge borders of vocal folds before vocal folds were affected by the pathologies). In this study, we determine vocal fold edge baselines exploiting the glottal area in order to analyze the shape features of lesions.
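As a rough illustration of measuring how far an edge stands out from its baseline, the sketch below assumes the baseline is a straight line between two given points and that the edge contour is an ordered list of pixel coordinates. The paper derives the baseline from the glottal area; the straight-line baseline, the function name `lesion_features`, the `min_height` threshold, and the choice of which side counts as protruding are all assumptions made for this example.

```python
import numpy as np

def lesion_features(edge_points, baseline_start, baseline_end, min_height=1.0):
    """Measure the part of an edge contour that protrudes beyond a straight baseline.

    edge_points    : ordered (x, y) points on the vocal fold edge
    baseline_start : (x, y) start of the assumed straight baseline
    baseline_end   : (x, y) end of the assumed straight baseline
    min_height     : hypothetical minimum distance (pixels) to count as protrusion
    """
    p0 = np.asarray(baseline_start, dtype=float)
    p1 = np.asarray(baseline_end, dtype=float)
    points = np.asarray(edge_points, dtype=float)

    direction = (p1 - p0) / np.linalg.norm(p1 - p0)
    normal = np.array([-direction[1], direction[0]])   # protruding side, by convention
    relative = points - p0
    along = relative @ direction                       # position along the baseline
    height = relative @ normal                         # signed distance from the baseline

    protruding = height > min_height
    if not protruding.any():
        return {"max_height": 0.0, "base_width": 0.0}
    return {
        "max_height": float(height[protruding].max()),
        # Extent of the protruding region along the baseline (assumes a single lesion)
        "base_width": float(along[protruding].max() - along[protruding].min()),
    }

# Toy contour: flat edge with one bump between x = 3 and x = 7
heights = [0, 0, 0, 1, 3, 4, 3, 1, 0, 0, 0]
contour = [(x, y) for x, y in enumerate(heights)]
features = lesion_features(contour, (0, 0), (10, 0), min_height=0.5)
```

Measurements of this kind, tracked across visits, are the sort of output a physician could use to monitor a lesion objectively.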

Let x be the pixels on

Extraction of vessel centerlines

The assessment of blood vessel characteristics such as length, tortuosity, and direction provides new insights for diagnosing many diseases such as diabetic retinopathy, esophageal cancer, and heart diseases [25], [26], [27], [28]. Therefore, over the years, several algorithms for blood vessel segmentation, most of which were designed for retinal images, have been developed [26], [27], [28], [29], [30], [31], [32], [33], [34], [35]. However, vocal fold vessel characteristics differ from retinal vessels in

Classification methodology

In this study, we aimed to design a vocal fold disorder classification system that simulates the human decision-making mechanism based on supervised machine learning methods. The proposed algorithm uses the binary decision tree architecture depicted in Fig. 6. Each node represents one binary classifier that separates two classes. Healthy/non-healthy vocal fold classification is done at the first level by exploiting both edge and vessel features. Vocal fold edge distortion is examined
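A tree of binary classifiers of this kind can be sketched as follows. Only the first-level healthy/non-healthy split is confirmed by the text above. The layout of the lower levels, the feature names, and the thresholds are hypothetical placeholders, and the simple threshold rules stand in for the trained supervised classifiers that each node would hold in practice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Features = Dict[str, float]

@dataclass
class Node:
    """One binary classifier plus the branch taken for each of its two outcomes."""
    classifier: Callable[[Features], bool]
    if_true: Union["Node", str]    # a further node, or a class label at a leaf
    if_false: Union["Node", str]

def classify(node: Union[Node, str], features: Features) -> str:
    """Walk down the tree until a class label (a string) is reached."""
    while isinstance(node, Node):
        node = node.if_true if node.classifier(features) else node.if_false
    return node

# Hypothetical tree: every feature name and threshold below is a placeholder.
tree = Node(
    # Level 1 (from the text): healthy vs. non-healthy using edge and vessel features
    classifier=lambda f: f["edge_distortion"] < 0.1 and f["transverse_vessels"] < 0.2,
    if_true="healthy",
    if_false=Node(
        # Assumed split: mass lesion present vs. no mass lesion
        classifier=lambda f: f["edge_distortion"] >= 0.1,
        if_true=Node(lambda f: f["lesion_size"] > 0.5, "polyp", "nodule"),
        if_false=Node(lambda f: f["longitudinal_vessels"] > 0.5,
                      "sulcus vocalis", "laryngitis"),
    ),
)

sample = {"edge_distortion": 0.3, "transverse_vessels": 0.4,
          "lesion_size": 0.8, "longitudinal_vessels": 0.1}
label = classify(tree, sample)
```

Structuring the decision as a chain of two-class problems lets each node use only the features relevant to its own split, which is one way to combine clinical knowledge about the disorders with learned classifiers.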

Experimental results

As long as vocal fold images are acquired during routine videolaryngostroboscopy, adaptive calculation of the parameters used in the segmentation of vocal folds and blood vessel centerline extraction steps is not required. Therefore, a validation set consisting of 20 frames extracted from 10 laryngostroboscopy videos is used for adjusting the parameters used in these two steps. The performance evaluation of the proposed

Discussion

Studies that focus on automated analysis of laryngeal images can be divided into two groups: studies that detect vocal fold vibration irregularities by examining digital recordings of vocal fold movements [9], [10] and studies that analyze laryngeal still images [11], [12].

Studies that evaluate vocal fold vibration patterns are able to detect vocal fold disorders such as vocal fold paresis that cannot be diagnosed by evaluating still images. However, analyzing vibration patterns is not

Conclusion

In this paper we propose a novel approach that classifies vocal fold disorders into five categories, namely healthy, nodule, polyp, sulcus vocalis, and laryngitis by exploiting shape features of mass lesions and orientation pattern of blood vessels on the superior surface of vocal folds. A novel binary decision tree which combines human expertise and machine learning approaches is designed for classification of vocal folds. It is observed that performances of machine learning methods that are

Conflict of interest statement

None declared.

Acknowledgment

This research has been supported by Yildiz Technical University’s Scientific Research Projects Coordination Department under the grant number 2011-04-01-DOP02. We wish to express our appreciation to SESVAK for their clinical support of this work.

H. Irem Turkmen received her B.Sc., M.Sc. and Ph.D. degrees in computer engineering from Yildiz Technical University, Istanbul, Turkey, in 2005 and 2008 and 2013, respectively. Her research interests include medical image processing, pattern recognition, speech processing, and machine learning.

References (39)

  • J.M. Ulis et al. What’s new in differential diagnosis and treatment of hoarseness? Curr. Opin. Otolaryngol. Head Neck Surg. (2009)

  • M. Bouchayer et al. Epidermoid cysts, sulci, and mucosal bridges of the true vocal cord: a report of 157 cases. Laryngoscope (1985)

  • P. Pontes et al. Vocal fold cover minor structural alterations: diagnostic errors. Phonoscope (1999)

  • I. Hochman et al. Ectasias and varices of the vocal fold: clearing the striking zone. Ann. Otol. Rhinol. Laryngol. (1999)

  • V. Kambic et al. Vocal cord polyps: incidence, histology and pathogenesis. J. Laryngol. Otol. (1981)

  • G.N. Postma et al. Microvascular lesions of the true vocal fold. Ann. Otol. Rhinol. Laryngol. (1998)

  • J. Lohscheller et al. Phonovibrography: mapping high-speed movies of vocal-fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics. IEEE Trans. Med. Imaging (2008)

  • J.F.R. Ilgner et al. Colour texture analysis for quantitative laryngoscopy. Acta Otolaryngol. (2003)

  • Turkmen, H.I., M.E. Karsligil, İ. Koçak. Assessment of videolaryngostroboscopy images based on visible vessels of...

    M. Elif Karsligil received her B.Sc., M.Sc., and Ph.D. degrees in computer engineering from Yildiz Technical University, Istanbul, Turkey, in 1988, 1990 and 1998, respectively. She is currently an assistant professor at the Computer Engineering Department of Yildiz Technical University. From October 2001 to November 2002, she worked as senior researcher at NTT Communication Science Laboratories, Kyoto, Japan. Her research interests include machine learning, pattern recognition, speech processing, and digital video processing.

    Ismail Kocak received his MD degree in Hacettepe University English Medical Faculty and residency in Ankara University Faculty of Medicine, Ear Nose Throat Department, Ankara, Turkey in 1992 and 1996, respectively. He received his M.Sc. degree in Bogazici University Institute of Biomedical Engineering, Istanbul, Turkey in 2002. He is currently an Associate Professor in Otorhinolaryngology Head and Neck Surgery at SESVAK and a faculty member at Anadolu University DILKOM, Education, Research & Training Centre for Speech & Language Pathology.