2017 | Original Paper | Book chapter

Robust Automatic Evaluation of Intelligibility in Voice Rehabilitation Using Prosodic Analysis

Authors: Tino Haderlein, Anne Schützenberger, Michael Döllinger, Elmar Nöth

Published in: Text, Speech, and Dialogue

Publisher: Springer International Publishing

Abstract

Speech intelligibility in voice rehabilitation has been evaluated successfully by automatic prosodic analysis. This paper examines the influence of reading errors and of restricting the computation of prosodic features to selected words (nouns only, nouns and verbs, the beginning of each sentence, or the beginnings of sentences and subclauses). A total of 73 hoarse patients (48.3 ± 16.8 years) read the German version of the text “The North Wind and the Sun”. Five trained experts rated their intelligibility perceptually on a 5-point scale. Eight prosodic features showed human-machine correlations of r ≥ 0.4. The most robust features were the normalized energy in a word-pause-word interval, computed from all words (r = 0.69 for the full speaker set), the mean of jitter in nouns and verbs (r = 0.67), and the pause duration before a word (r = 0.66). However, reading errors can significantly influence these results.
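The human-machine correlations quoted in the abstract can be illustrated with a minimal sketch. It assumes that r denotes Pearson's correlation coefficient and that the perceptual reference for each speaker is the mean of the five expert ratings; the abstract states neither explicitly. The function names `pearson_r` and `human_machine_correlation` are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    # Sum of co-deviations and the two root sums of squared deviations.
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def human_machine_correlation(feature_values, ratings_per_speaker):
    """Correlate one prosodic feature with the averaged expert ratings.

    feature_values: one feature value per speaker.
    ratings_per_speaker: one list of expert ratings per speaker
    (assumption: the reference score is the mean across raters).
    """
    reference = [mean(ratings) for ratings in ratings_per_speaker]
    return pearson_r(feature_values, reference)
```

Called with one feature value per speaker and one list of rater scores per speaker, `human_machine_correlation` returns a single coefficient in the range from -1 to 1, which can then be compared against a threshold such as the 0.4 mentioned in the abstract.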

Metadata
Title
Robust Automatic Evaluation of Intelligibility in Voice Rehabilitation Using Prosodic Analysis
Authors
Tino Haderlein
Anne Schützenberger
Michael Döllinger
Elmar Nöth
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-64206-2_2