Published in: Cognitive Processing, Special Issue 2/2012

1 October 2012 | Review

On the recognition of emotional vocal expressions: motivations for a holistic approach

Authors: Anna Esposito, Antonietta M. Esposito

Abstract

Human beings seem to be able to recognize emotions from speech very well, and information and communication technology aims to implement machines and agents that can do the same. However, automatically recognizing affective states from speech signals requires solving two main technological problems. The first concerns the identification of effective and efficient processing algorithms capable of capturing emotional acoustic features from speech sentences. The second concerns finding computational models able to classify a given set of emotional states with an accuracy comparable to that of human listeners. This paper surveys these topics and provides some insights for a holistic approach to the automatic analysis, recognition and synthesis of affective states.

Footnotes
1 Sony AIBO Europe, Sony entertainment. www.sonydigital-link.com/AIBO/.
 
Metadata
Title
On the recognition of emotional vocal expressions: motivations for a holistic approach
Authors
Anna Esposito
Antonietta M. Esposito
Publication date
1 October 2012
Publisher
Springer-Verlag
Published in
Cognitive Processing / Special Issue 2/2012
Print ISSN: 1612-4782
Electronic ISSN: 1612-4790
DOI
https://doi.org/10.1007/s10339-012-0516-2
