2015 | OriginalPaper | Book chapter

Speech Driven by Artificial Larynx: Potential Advancement Using Synthetic Pitch Contours

Written by: Hua-Li Jian

Published in: Universal Access in Human-Computer Interaction. Access to Learning, Health and Well-Being

Publisher: Springer International Publishing

Abstract

Despite a long history of development, the speech quality achieved with artificial larynx devices remains limited. This paper explores recent advances in prosodic speech processing and technology and assesses their potential to improve the quality of speech produced with an artificial larynx, in particular tone and intonation through pitch variation. Three approaches are discussed: manual pitch control, automatic pitch control, and re-synthesized speech.

Title: Speech Driven by Artificial Larynx: Potential Advancement Using Synthetic Pitch Contours
Author: Hua-Li Jian
Publisher: Springer International Publishing
Book: Universal Access in Human-Computer Interaction. Access to Learning, Health and Well-Being
Print ISBN: 978-3-319-20683-7
Electronic ISBN: 978-3-319-20684-4
Copyright year: 2015
DOI: https://doi.org/10.1007/978-3-319-20684-4_30
