
2016 | OriginalPaper | Chapter

An Agonist-Antagonist Pitch Production Model

Authors: Branislav Gerazov, Philip N. Garner

Published in: Speech and Computer

Publisher: Springer International Publishing


Abstract

Prosody is crucial to numerous fields of speech research, which underscores the importance of having a robust prosody model. A class of intonation models based on the physiology of pitch production is especially attractive for its inherent multilingual support. These models rely on an accurate model of muscle activation. Traditionally, they have used the 2nd-order spring-damper-mass (SDM) muscle model. However, recent research has shown that the SDM model is not sufficient to model muscle dynamics adequately. The 3rd-order Hill-type model offers a more accurate representation of muscle dynamics, but it has been shown to be underdamped when physiologically plausible muscle parameters are used. In this paper, we propose an agonist-antagonist pitch production (A2P2) model that both validates and gives insight into the improved results obtained with higher-order critically damped system models in intonation modelling.
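As a rough illustration of what "higher-order critically damped" means in the abstract, the sketch below evaluates the impulse response of a cascade of identical first-order lags, whose poles all coincide on the negative real axis. It is not the A2P2 model or the authors' implementation; the function names and the parameters `order` and `tau` are assumptions introduced here for illustration only.

```python
import math


def critically_damped_impulse_response(t, order, tau):
    """Impulse response of `order` identical first-order lags in cascade.

    All poles sit at s = -1/tau, so the response never oscillates:
        h(t) = t^(order-1) * exp(-t/tau) / ((order-1)! * tau^order),  t >= 0
    `order` and `tau` are illustrative parameters, not values from the paper.
    """
    if t < 0:
        return 0.0
    gain = math.factorial(order - 1) * tau ** order
    return t ** (order - 1) * math.exp(-t / tau) / gain


def peak_time(order, tau):
    """Time at which the response above reaches its maximum."""
    return (order - 1) * tau


def sample(order, tau, duration, step):
    """Sample the response on a uniform grid from 0 to `duration`."""
    count = int(round(duration / step))
    return [
        critically_damped_impulse_response(i * step, order, tau)
        for i in range(count + 1)
    ]


if __name__ == "__main__":
    # Compare a 2nd-order and a 3rd-order kernel with the same time constant.
    for order in (2, 3):
        print(order, peak_time(order, tau=0.05))
```

For a fixed time constant, raising the order delays the peak and makes the onset smoother, which is one intuitive reading of why such kernels can fit pitch movements better than a 2nd-order response.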


Footnotes
1
The WCAD implementation code is available on GitHub at https://github.com/dipteam/wcad.
 
Metadata
Title
An Agonist-Antagonist Pitch Production Model
Authors
Branislav Gerazov
Philip N. Garner
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-43958-7_9
