
2018 | OriginalPaper | Chapter

5. Intonation Rules for Text Reading

Author: Asoke Kumar Datta

Published in: Epoch Synchronous Overlap Add (ESOLA)

Publisher: Springer Singapore


Abstract

Intonation is the cognitive aspect of the ensemble of pitch variations in the course of an utterance. This perceptual impression of speech melody correlates, to a first approximation, with changes in the fundamental frequency (F0) of the signal. This chapter presents a study of intonation patterns for text reading in Standard Colloquial Bengali, aimed at developing rules and appropriate methods for using them in a text-to-speech (TTS) synthesis system. In the model presented here, pitch movements at the syllabic level are considered basic. Syllabic stylization takes the closest linear fit obtained by linear regression, and the pitch movements are expressed in semitones per second. The sentence-level intonation pattern is the sequence of the word-level patterns constituting the sentence. This chapter also presents the statistical method for implementing these rules in TTS. The model is tested by synthesizing several sentences, and the perceptual results are satisfactory.
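As a rough illustration of the syllabic stylization described in the abstract, the sketch below fits a least-squares line to one syllable's F0 contour after converting it to semitones, so that the slope comes out in semitones per second. It is not the chapter's implementation: the function names, the 100 Hz reference frequency, and the sample contour are all assumptions made for this example.

```python
import math

def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 values in Hz to semitones relative to ref_hz (an assumed reference)."""
    return [12.0 * math.log2(f / ref_hz) for f in f0_hz]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def stylize_syllable(times_s, f0_hz, ref_hz=100.0):
    """Return (slope in semitones per second, intercept in semitones) for one syllable."""
    return fit_line(times_s, hz_to_semitones(f0_hz, ref_hz))

# Hypothetical syllable: F0 rises by one octave (100 Hz to 200 Hz) over 0.2 s.
# The rise is exponential in Hz, which is a straight line on the semitone scale.
times = [i * 0.01 for i in range(21)]
f0 = [100.0 * 2.0 ** (t / 0.2) for t in times]

slope, intercept = stylize_syllable(times, f0)
print(f"slope = {slope:.2f} st/s, intercept = {intercept:.2f} st")
```

Working on the semitone scale rather than in Hz keeps the fitted slope comparable across speakers with different pitch ranges, which is one plausible reason to express the movements in semitones per second.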


Metadata
Title
Intonation Rules for Text Reading
Author
Asoke Kumar Datta
Copyright Year
2018
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-7016-7_5