2018 | Original Paper | Book Chapter

Commonalities of Glottal Sources and Vocal Tract Shapes Among Speakers in Emotional Speech

Written by: Yongwei Li, Ken-Ichi Sakakibara, Daisuke Morikawa, Masato Akagi

Published in: Studies on Speech Production

Publisher: Springer International Publishing

Abstract

This paper explores commonalities in glottal source waves and vocal tract shapes among four speakers producing emotional speech (the vowel /a/ spoken with neutral, joy, anger, and sadness), using a source-filter model with the proposed precise estimation scheme. The results are as follows. Compared with the spectral tilts of the glottal source waves for neutral speech: (1) in the 200- to 700-Hz range, those for anger and joy increased and those for sadness decreased; (2) in the 700- to 2000-Hz range, those for anger increased, those for joy decreased, and those for sadness were the same as for neutral; and (3) above 2000 Hz, all spectral tilts showed the same tendency. For front vocal tract shapes, the area function was largest for anger and smallest for sadness, with joy and neutral in between.
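The band-wise comparison of spectral tilts reported in the abstract can be illustrated with a minimal sketch. It is not the authors' estimation scheme: it assumes a glottal source waveform has already been estimated by some other means, and it assumes that "spectral tilt" is the least-squares slope of the magnitude spectrum in dB per octave. The function name `band_spectral_tilts`, the Hann window, and the 8000 Hz upper limit of the third band are all hypothetical choices made for illustration.

```python
import numpy as np

# Band edges in Hz follow the abstract; the 8000 Hz upper limit is an assumption.
BANDS = [(200.0, 700.0), (700.0, 2000.0), (2000.0, 8000.0)]


def band_spectral_tilts(signal, sample_rate, bands=BANDS):
    """Return the spectral tilt (dB per octave) of `signal` in each band.

    Tilt is taken here to be the least-squares slope of the magnitude
    spectrum in dB against log2(frequency). This definition is an
    assumption, not necessarily the one used in the chapter.
    """
    windowed = np.asarray(signal, dtype=float) * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    level_db = 20.0 * np.log10(magnitude + 1e-12)  # small offset avoids log(0)

    tilts = []
    for low, high in bands:
        in_band = (freqs >= low) & (freqs < high)
        # A degree-1 fit returns (slope, intercept); only the slope is kept.
        slope, _intercept = np.polyfit(np.log2(freqs[in_band]), level_db[in_band], 1)
        tilts.append(float(slope))
    return tilts


if __name__ == "__main__":
    # Toy input: a first-difference pulse, whose spectrum rises with frequency.
    n, fs = 4096, 16000
    pulse = np.zeros(n)
    pulse[n // 2 - 1], pulse[n // 2] = 1.0, -1.0
    for (low, high), tilt in zip(BANDS, band_spectral_tilts(pulse, fs)):
        print(f"{low:.0f}-{high:.0f} Hz: {tilt:+.2f} dB/octave")
```

Under these assumptions, comparing the per-band tilts of an emotional utterance against those of a neutral one from the same speaker would show whether the tilt increased or decreased in each frequency range.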

Metadata
Title
Commonalities of Glottal Sources and Vocal Tract Shapes Among Speakers in Emotional Speech
Written by
Yongwei Li
Ken-Ichi Sakakibara
Daisuke Morikawa
Masato Akagi
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-030-00126-1_3