2013 | OriginalPaper | Book chapter

1. Introduction

Written by: K. Sreenivasa Rao, Shashidhar G. Koolagudi

Published in: Robust Emotion Recognition using Spectral and Prosodic Features

Publisher: Springer New York


Abstract

This chapter briefly discusses the importance of analyzing emotions from the speech signal. The significance of emotions is discussed from both psychological and signal processing perspectives, and the influence of emotions on the characteristics of the speech production system is briefly mentioned. The roles of the excitation source, the vocal tract system, and prosodic features are discussed in the context of various speech tasks, with emphasis on the task of recognizing emotions. The different types of emotional speech databases used for emotion-specific tasks are briefly described, and various applications of speech emotion recognition are mentioned. The chapter closes with the important current issues in emotional speech processing and a note on the organization of the book.


Metadata
Title
Introduction
Written by
K. Sreenivasa Rao
Shashidhar G. Koolagudi
Copyright year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6360-3_1
