Published in: Medical & Biological Engineering & Computing 7/2013

01-07-2013 | Original Article

Pathological speech signal analysis and classification using empirical mode decomposition

Authors: Muhammad Kaleem, Behnaz Ghoraani, Aziz Guergachi, Sridhar Krishnan



Abstract

Automated classification of normal and pathological speech signals can provide an objective and accurate mechanism for pathological speech diagnosis, and is an active area of research. A large part of this research is based on analysis of acoustic measures extracted from sustained vowels. However, sustained vowels do not reflect real-world attributes of voice as effectively as continuous speech, which can take into account important attributes of speech such as rapid voice onset and termination, changes in voice frequency and amplitude, and sudden discontinuities in speech. This paper presents a methodology based on empirical mode decomposition (EMD) for classification of continuous normal and pathological speech signals obtained from a well-known database. EMD is used to decompose randomly chosen portions of speech signals into intrinsic mode functions, which are then analyzed to extract meaningful temporal and spectral features, including true instantaneous features which can capture discriminative information in signals hidden at local time-scales. A total of six features are extracted, and a linear classifier is used with the feature vector to classify continuous speech portions obtained from a database consisting of 51 normal and 161 pathological speakers. A classification accuracy of 95.7 % is obtained, thus demonstrating the effectiveness of the methodology.
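The decomposition step described in the abstract can be illustrated with a minimal sketch of EMD sifting. This is not the authors' implementation: the abstract does not state the sifting parameters, so the fixed sift count, the crude end-point handling, the synthetic two-tone test signal, and all function names below are assumptions chosen for illustration. It relies on NumPy and SciPy's `CubicSpline` for the envelopes.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(x):
    """Return indices of strict local maxima and minima of a 1-D array."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def sift_once(x):
    """One sifting step: subtract the mean of the upper and lower envelopes."""
    t = np.arange(len(x))
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema to build envelopes
    # Assumption: pin both envelopes to the end samples (crude boundary handling).
    up_idx = np.concatenate(([0], maxima, [len(x) - 1]))
    lo_idx = np.concatenate(([0], minima, [len(x) - 1]))
    upper = CubicSpline(up_idx, x[up_idx])(t)
    lower = CubicSpline(lo_idx, x[lo_idx])(t)
    return x - 0.5 * (upper + lower)


def emd(signal, max_imfs=6, n_sift=10):
    """Split `signal` into IMFs plus a residue, using a fixed number of sifts."""
    residue = np.asarray(signal, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        candidate = sift_once(residue)
        if candidate is None:
            break  # residue has too few oscillations left: stop
        for _ in range(n_sift - 1):
            nxt = sift_once(candidate)
            if nxt is None:
                break
            candidate = nxt
        imfs.append(candidate)
        residue = residue - candidate
    return np.array(imfs), residue


# Synthetic stand-in for a short speech portion (hypothetical test signal).
fs = 1000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)

imfs, residue = emd(signal)
print("number of IMFs:", len(imfs))
print("max reconstruction error:", np.max(np.abs(imfs.sum(axis=0) + residue - signal)))
```

By construction the IMFs and the residue sum back to the input signal, which is the property that lets features computed on individual IMFs be related to the original speech portion. A production implementation would normally use a convergence-based stopping criterion and more careful boundary treatment than this sketch does.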


Metadata
Title: Pathological speech signal analysis and classification using empirical mode decomposition
Authors: Muhammad Kaleem, Behnaz Ghoraani, Aziz Guergachi, Sridhar Krishnan
Publication date: 01-07-2013
Publisher: Springer-Verlag
Published in: Medical & Biological Engineering & Computing, Issue 7/2013
Print ISSN: 0140-0118
Electronic ISSN: 1741-0444
DOI: https://doi.org/10.1007/s11517-013-1051-8
