2017 | OriginalPaper | Chapter

Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus

Author: Tatiana Sherstinova

Published in: Speech and Computer

Publisher: Springer International Publishing


Abstract

Studying prosody is important for understanding many linguistic, pragmatic, and discourse phenomena, as well as for solving many applied tasks, in particular in speech technologies. The prosody of everyday speech is extremely diverse, showing high interpersonal and intrapersonal variation. Furthermore, natural everyday speech produces a multitude of effects that can hardly be obtained in speech laboratories. For this reason, it is very important to create resources containing representative collections of everyday speech data. The ORD corpus is a large resource aimed at studying everyday Russian speech. The paper describes the main stages of speech processing in the ORD corpus, from segmentation of the original files into macroepisodes up to compiling prosody information into the database. This prosody database will be further used for building empirical prosody models.


Metadata
Title
Preparing Audio Recordings of Everyday Speech for Prosody Research: The Case of the ORD Corpus
Author
Tatiana Sherstinova
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-66429-3_62
