2014 | OriginalPaper | Chapter

Extracting Emotions and Communication Styles from Prosody

Authors: Licia Sbattella, Luca Colombo, Carlo Rinaldi, Roberto Tedesco, Matteo Matteucci, Alessandro Trivilini

Published in: Physiological Computing Systems

Publisher: Springer Berlin Heidelberg


Abstract

According to many psychological and social studies, vocal messages contain two distinct channels: an explicit, linguistic channel and an implicit, paralinguistic channel. The latter carries information about the emotional state of the speaker, providing clues about the implicit meaning of the message. Such information can improve applications requiring human-machine interaction (for example, Automatic Speech Recognition systems or Conversational Agents) and can support the analysis of human-human interaction (for example, in clinical or forensic applications). PrEmA, the tool we present in this work, recognizes and classifies both the emotions and the communication style of the speaker, relying on prosodic features. In particular, the recognition of communication styles is, to our knowledge, new, and could be used to infer interesting clues about the state of the interaction. PrEmA uses two LDA-based classifiers, which rely on two sets of prosodic features. In experiments with Italian speakers, PrEmA obtained an accuracy of \(Ac=71\,\%\) for emotions and \(Ac=86\,\%\) for communication styles.
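As a rough illustration of classifying speech by prosodic features with an LDA-based classifier and reporting accuracy, the following is a minimal sketch and not PrEmA's implementation. It assumes that LDA stands for Linear Discriminant Analysis, uses two hypothetical features (mean pitch in Hz and mean energy in dB), and generates synthetic, well-separated data. The class labels, class centers, and noise levels are placeholders chosen for the example; they are not taken from the chapter.

```python
import math
import random

def fit_lda(X, y):
    """Fit a two-feature LDA: class means, priors, inverse pooled covariance."""
    classes = sorted(set(y))
    n = len(X)
    means, priors = {}, {}
    for c in classes:
        rows = [x for x, label in zip(X, y) if label == c]
        means[c] = [sum(col) / len(rows) for col in zip(*rows)]
        priors[c] = len(rows) / n
    # Pooled within-class covariance (2x2), shared by all classes.
    s00 = s01 = s11 = 0.0
    for x, label in zip(X, y):
        d0 = x[0] - means[label][0]
        d1 = x[1] - means[label][1]
        s00 += d0 * d0
        s01 += d0 * d1
        s11 += d1 * d1
    dof = n - len(classes)
    s00, s01, s11 = s00 / dof, s01 / dof, s11 / dof
    det = s00 * s11 - s01 * s01
    inv = [[s11 / det, -s01 / det], [-s01 / det, s00 / det]]
    return classes, means, priors, inv

def predict(model, x):
    """Return the class with the highest linear discriminant score."""
    classes, means, priors, inv = model

    def score(c):
        m = means[c]
        # S^-1 m, then x^T S^-1 m - 0.5 m^T S^-1 m + log(prior)
        sm = [inv[0][0] * m[0] + inv[0][1] * m[1],
              inv[1][0] * m[0] + inv[1][1] * m[1]]
        return (x[0] * sm[0] + x[1] * sm[1]
                - 0.5 * (m[0] * sm[0] + m[1] * sm[1])
                + math.log(priors[c]))

    return max(classes, key=score)

def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(model, x) == label for x, label in zip(X, y))
    return hits / len(y)

# Hypothetical class centers: (mean pitch in Hz, mean energy in dB).
CENTERS = {
    "anger": (220.0, 75.0),
    "sadness": (120.0, 55.0),
    "neutral": (170.0, 40.0),
}

def sample(rng, n_per_class):
    """Draw synthetic feature vectors around each placeholder class center."""
    X, y = [], []
    for label, (pitch, energy) in CENTERS.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(pitch, 8.0), rng.gauss(energy, 3.0)])
            y.append(label)
    return X, y

rng = random.Random(0)
X_train, y_train = sample(rng, 40)
X_test, y_test = sample(rng, 20)

model = fit_lda(X_train, y_train)
acc = accuracy(model, X_test, y_test)
print(f"Ac = {acc:.0%}")
```

Because the synthetic classes are far apart, the accuracy printed here is much higher than would be expected on real recordings; the sketch only shows the mechanics of fitting a shared-covariance discriminant and computing the fraction of correctly classified samples.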


Footnotes
1
Notice that the performance indexes reported in this section are indicative and cannot be compared with each other, since each system used its own vocal dataset.
 
2
Such segments were considered too loud to be clear silences, but too quiet to provide a clear voiced signal.
 
4
From the 10 LDA-based classifiers generated for the emotion classification task, the one with the best performance indexes was chosen as the final model; the same approach was followed for the communication-style classifier.
 
Metadata
Title: Extracting Emotions and Communication Styles from Prosody
Authors: Licia Sbattella, Luca Colombo, Carlo Rinaldi, Roberto Tedesco, Matteo Matteucci, Alessandro Trivilini
Copyright Year: 2014
Publisher: Springer Berlin Heidelberg
DOI: https://doi.org/10.1007/978-3-662-45686-6_2
