Published in: International Journal of Speech Technology | Issue 1/2018

22.01.2018

Speech emotion recognition research: an analysis of research focus

Authors: Mumtaz Begum Mustafa, Mansoor A. M. Yusoof, Zuraidah M. Don, Mehdi Malekzadeh


Abstract

This article analyses research in speech emotion recognition (SER) from 2006 to 2017 in order to identify the current focus of research and areas in which research is lacking. The objective is to examine what is being done in this field of research. Searching on selected keywords, we extracted and analysed 260 articles from well-known online databases. The analysis indicates that SER is an active field of research, with dozens of articles published each year in journals and conference proceedings. The majority of articles concentrate on three critical aspects of SER, namely (1) databases, (2) suitable speech features, and (3) classification techniques to maximize the recognition accuracy of SER systems. Having carried out an association analysis of these critical aspects and how they influence the performance of SER systems in terms of recognition accuracy, we found that certain combinations of databases, speech features and classifiers influence the recognition accuracy of the SER system. Based on our review, we also suggest aspects of SER that could be taken into consideration in future work.
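To make the association analysis concrete, the sketch below shows one way such an analysis could be set up: each surveyed study is reduced to a (database, speech feature, classifier, accuracy) record, and the support and confidence of rules of the form "combination of aspects → high recognition accuracy" are computed over all attribute subsets. This is a minimal illustration, not the article's actual procedure; the study rows, the 75% accuracy threshold, and the filtering thresholds are invented placeholders.

```python
# Hypothetical association analysis over surveyed SER studies.
# Each record: (database, speech feature, classifier, reported accuracy in %).
# All rows below are invented placeholders, not results from the article.
from collections import Counter
from itertools import combinations

studies = [
    ("EMO-DB", "MFCC", "SVM", 82.0),
    ("EMO-DB", "prosodic", "HMM", 71.5),
    ("IEMOCAP", "MFCC", "DNN", 64.2),
    ("EMO-DB", "MFCC", "SVM", 85.3),
    ("IEMOCAP", "spectral", "SVM", 60.1),
]

HIGH = 75.0  # assumed threshold for a "high accuracy" outcome

itemset_counts = Counter()  # how often each attribute subset occurs
high_counts = Counter()     # how often it co-occurs with high accuracy
for db, feat, clf, acc in studies:
    items = (("database", db), ("feature", feat), ("classifier", clf))
    for r in (1, 2, 3):
        for subset in combinations(items, r):
            itemset_counts[subset] += 1
            if acc >= HIGH:
                high_counts[subset] += 1

# Report rules "subset -> high accuracy" with enough support and confidence.
for itemset, n in sorted(itemset_counts.items()):
    confidence = high_counts[itemset] / n
    support = n / len(studies)
    if confidence >= 0.8 and n >= 2:
        print(itemset, f"support={support:.2f}", f"confidence={confidence:.2f}")
```

On this toy data, no single database, feature, or classifier passes the confidence threshold on its own, but the EMO-DB + MFCC + SVM combination does, mirroring the abstract's point that it is particular combinations of the three critical aspects that drive recognition accuracy.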


Zurück zum Zitat Le, D., Aldeneh, Z., & Provost, E. M. (2017). Discretized continuous speech emotion recognition with multi-task deep recurrent neural network. Interspeech, 2017. Le, D., Aldeneh, Z., & Provost, E. M. (2017). Discretized continuous speech emotion recognition with multi-task deep recurrent neural network. Interspeech, 2017.
Zurück zum Zitat Le, D., & Provost, E. M. (2013). Emotion recognition from spontaneous speech using hidden markov models with deep belief networks. In 2013 IEEE workshop on automatic speech recognition and understanding (ASRU) (pp. 216–221). Piscataway: IEEE.CrossRef Le, D., & Provost, E. M. (2013). Emotion recognition from spontaneous speech using hidden markov models with deep belief networks. In 2013 IEEE workshop on automatic speech recognition and understanding (ASRU) (pp. 216–221). Piscataway: IEEE.CrossRef
Zurück zum Zitat Lee, J., & Tashev, I. (2015). High-level feature representation using recurrent neural network for speech emotion recognition. In INTERSPEECH (pp. 1537–1540). Lee, J., & Tashev, I. (2015). High-level feature representation using recurrent neural network for speech emotion recognition. In INTERSPEECH (pp. 1537–1540).
Zurück zum Zitat Lefter, I., Rothkrantz, L. J., Wiggers, P., & Van Leeuwen, D. A. (2010). Emotion recognition from speech by combining databases and fusion of classifiers. In Text, speech and dialogue (pp. 353–360). Berlin Heidelberg: Springer.CrossRef Lefter, I., Rothkrantz, L. J., Wiggers, P., & Van Leeuwen, D. A. (2010). Emotion recognition from speech by combining databases and fusion of classifiers. In Text, speech and dialogue (pp. 353–360). Berlin Heidelberg: Springer.CrossRef
Zurück zum Zitat Li, L., Zhao, Y., Jiang, D., Zhang, Y., Wang, F., Gonzalez, I., … Sahli, H. (2013). Hybrid deep neural network–hidden markov model (DNN-HMM) based speech emotion recognition. In 2013 humaine association conference on affective computing and intelligent interaction (ACII) (pp. 312–317). Piscataway: IEEE.CrossRef Li, L., Zhao, Y., Jiang, D., Zhang, Y., Wang, F., Gonzalez, I., … Sahli, H. (2013). Hybrid deep neural network–hidden markov model (DNN-HMM) based speech emotion recognition. In 2013 humaine association conference on affective computing and intelligent interaction (ACII) (pp. 312–317). Piscataway: IEEE.CrossRef
Zurück zum Zitat Li, Y., Chao, L., Liu, Y., Bao, W., & Tao, J. (2015) From simulated speech to natural speech, what are the robust features for emotion recognition? In International conference on affective computing and intelligent interaction (ACII) (pp. 368–373). Piscataway: IEEE Li, Y., Chao, L., Liu, Y., Bao, W., & Tao, J. (2015) From simulated speech to natural speech, what are the robust features for emotion recognition? In International conference on affective computing and intelligent interaction (ACII) (pp. 368–373). Piscataway: IEEE
Zurück zum Zitat Lim, W., Jang, D., & Lee, T. (2016). Speech emotion recognition using convolutional and recurrent neural networks. In Signal and information processing association annual summit and conference (APSIPA), 2016 Asia-Pacific (pp. 1–4). Piscataway: IEEE. Lim, W., Jang, D., & Lee, T. (2016). Speech emotion recognition using convolutional and recurrent neural networks. In Signal and information processing association annual summit and conference (APSIPA), 2016 Asia-Pacific (pp. 1–4). Piscataway: IEEE.
Zurück zum Zitat Litman, D. J., & Forbes-Riley, K. (2006). Recognizing student emotions and attitudes on the basis of utterances in spoken tutoring dialogues with both human and computer tutors. Speech Communication, 48(5), 559–590.CrossRef Litman, D. J., & Forbes-Riley, K. (2006). Recognizing student emotions and attitudes on the basis of utterances in spoken tutoring dialogues with both human and computer tutors. Speech Communication, 48(5), 559–590.CrossRef
Zurück zum Zitat Liu, J., Chen, C., Bu, J., You, M., & Tao, J. (2007). Speech emotion recognition based on a fusion of all-class and pairwise-class feature selection. Computational Science–ICCS 2007, 168–175. Liu, J., Chen, C., Bu, J., You, M., & Tao, J. (2007). Speech emotion recognition based on a fusion of all-class and pairwise-class feature selection. Computational Science–ICCS 2007, 168–175.
Zurück zum Zitat Luengo, I., Navas, E., & Hernáez, I. (2010). Feature analysis and evaluation for automatic emotion identification in speech. IEEE Transactions on Multimedia, 12(6), 490–501.CrossRef Luengo, I., Navas, E., & Hernáez, I. (2010). Feature analysis and evaluation for automatic emotion identification in speech. IEEE Transactions on Multimedia, 12(6), 490–501.CrossRef
Zurück zum Zitat Lugger, M., Janoir, M. E., & Yang, B. (2009). Combining classifiers with diverse feature sets for robust speaker independent emotion recognition. In Signal processing conference, 2009 17th European (1225–1229). Piscataway: IEEE. Lugger, M., Janoir, M. E., & Yang, B. (2009). Combining classifiers with diverse feature sets for robust speaker independent emotion recognition. In Signal processing conference, 2009 17th European (1225–1229). Piscataway: IEEE.
Zurück zum Zitat Lugger, M., & Yang, B. (2007). The relevance of voice quality features in speaker independent emotion recognition. In IEEE international conference on acoustics, speech and signal processing, 2007. ICASSP 2007. (Vol. 4, pp. IV–17). Piscataway: IEEE. Lugger, M., & Yang, B. (2007). The relevance of voice quality features in speaker independent emotion recognition. In IEEE international conference on acoustics, speech and signal processing, 2007. ICASSP 2007. (Vol. 4, pp. IV–17). Piscataway: IEEE.
Zurück zum Zitat Lugger, M., & Yang, B. (2007). An incremental analysis of different feature groups in speaker independent emotion recognition. In 16th Int. congress of phonetic sciences. Lugger, M., & Yang, B. (2007). An incremental analysis of different feature groups in speaker independent emotion recognition. In 16th Int. congress of phonetic sciences.
Zurück zum Zitat Mannepalli, K., Sastry, P. N., & Suman, M. (2016). A novel adaptive fractional deep belief networks for speaker emotion recognition. Alexandria Engineering Journal. Mannepalli, K., Sastry, P. N., & Suman, M. (2016). A novel adaptive fractional deep belief networks for speaker emotion recognition. Alexandria Engineering Journal.
Zurück zum Zitat Mao, Q., Xue, W., Rao, Q., Zhang, F., & Zhan, Y. (2016). Domain adaptation for speech emotion recognition by sharing priors between related source and target classes. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 2608–2612). Piscataway: IEEE.CrossRef Mao, Q., Xue, W., Rao, Q., Zhang, F., & Zhan, Y. (2016). Domain adaptation for speech emotion recognition by sharing priors between related source and target classes. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 2608–2612). Piscataway: IEEE.CrossRef
Zurück zum Zitat Mao, X., Chen, L., & Fu, L. (2009). Multi-level speech emotion recognition based on HMM and ANN. In 2009 WRI World congress on computer science and information engineering (Vol. 7, pp. 225–229). Piscataway: IEEE.CrossRef Mao, X., Chen, L., & Fu, L. (2009). Multi-level speech emotion recognition based on HMM and ANN. In 2009 WRI World congress on computer science and information engineering (Vol. 7, pp. 225–229). Piscataway: IEEE.CrossRef
Zurück zum Zitat Mao, X., Zhang, B., & Luo, Y. (2007). Speech emotion recognition based on a hybrid of HMM/ANN. In Proceedings of the 7th conference on 7th WSEAS international conference on applied informatics and communications (Vol. 7, pp. 367–370). Mao, X., Zhang, B., & Luo, Y. (2007). Speech emotion recognition based on a hybrid of HMM/ANN. In Proceedings of the 7th conference on 7th WSEAS international conference on applied informatics and communications (Vol. 7, pp. 367–370).
Mencattini, A., Martinelli, E., Ringeval, F., Schuller, B., & Di Natale, C. (2017). Continuous estimation of emotions in speech by dynamic cooperative speaker models. IEEE Transactions on Affective Computing.
Milton, A., Roy, S. S., & Selvi, S. T. (2013). SVM scheme for speech emotion recognition using MFCC feature. International Journal of Computer Applications, 69(9).
Milton, A., & Selvi, S. T. (2014). Class-specific multiple classifiers scheme to recognize emotions from speech signals. Computer Speech & Language, 28(3), 727–742.
Mishra, H. K., & Sekhar, C. C. (2009). Variational Gaussian mixture models for speech emotion recognition. In Seventh international conference on advances in pattern recognition, 2009. ICAPR’09 (pp. 183–186). Piscataway: IEEE.
Morales-Perez, M., Echeverry-Correa, J., Orozco-Gutierrez, A., & Castellanos-Dominguez, G. (2008). Feature extraction of speech signals in emotion identification. In Engineering in medicine and biology society, 2008. EMBS 2008. 30th annual international conference of the IEEE (pp. 2590–2593). Piscataway: IEEE.
Morrison, D., Wang, R., & De Silva, L. C. (2007). Ensemble methods for spoken emotion recognition in call-centres. Speech Communication, 49(2), 98–112.
Navas, E., Hernáez, I., & Luengo, I. (2006). An objective and subjective study of the role of semantics and prosodic features in building corpora for emotional TTS. IEEE Transactions on Audio, Speech and Language Processing, 14, 1117–1127.
Neiberg, D., & Elenius, K. (2008). Automatic recognition of anger in spontaneous speech. In 9th annual conference of the international speech communication association.
Nicolaou, M. A., Gunes, H., & Pantic, M. (2011). Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space. IEEE Transactions on Affective Computing, 2(2), 92–105.
Ntalampiras, S., & Fakotakis, N. (2012). Modeling the temporal evolution of acoustic parameters for speech emotion recognition. IEEE Transactions on Affective Computing, 3(1), 116–125.
Pan, Y., Shen, P., & Shen, L. (2012). Speech emotion recognition using support vector machine. International Journal of Smart Home, 6(2), 101–108.
Pao, T. L., Chien, C. S., Chen, Y. T., Yeh, J. H., Cheng, Y. M., & Liao, W. Y. (2007). Combination of multiple classifiers for improving emotion recognition in Mandarin speech. In 3rd international conference on intelligent information hiding and multimedia signal processing, 2007. IIHMSP 2007 (Vol. 1, pp. 35–38). Piscataway: IEEE.
Pao, T. L., Wang, C. H., & Li, Y. J. (2012). A study on the search of the most discriminative speech features in the speaker dependent speech emotion recognition. In 2012 fifth international symposium on parallel architectures, algorithms and programming (PAAP) (pp. 157–162). Piscataway: IEEE.
Pathak, S., & Kulkarni, A. (2011). Recognizing emotions from speech. In 2011 3rd international conference on electronics computer technology (ICECT) (Vol. 4, pp. 107–109). Piscataway: IEEE.
Philippou-Hübner, D., Vlasenko, B., Böck, R., & Wendemuth, A. (2012). The performance of the speaking rate parameter in emotion recognition from speech. In 2012 IEEE international conference on multimedia and expo (ICME) (pp. 248–253). Piscataway: IEEE.
Picard, R. W. (1997). Affective computing. Cambridge: MIT Press.
Pierre-Yves, O. (2003). The production and recognition of emotions in speech: features and algorithms. International Journal of Human-Computer Studies, 59(1), 157–183.
Planet, S., & Iriondo, I. (2012). Comparison between decision-level and feature-level fusion of acoustic and linguistic features for spontaneous emotion recognition. In 2012 7th Iberian conference on information systems and technologies (CISTI) (pp. 1–6). Piscataway: IEEE.
Plutchik, R. (1991). The emotions. Lanham, MD: University Press of America.
Pohjalainen, J., Ringeval, F., Zhang, Z., & Schuller, B. (2016). Spectral and cepstral audio noise reduction techniques in speech emotion recognition. In Proceedings of the 2016 ACM on multimedia conference (pp. 670–674). ACM.
Polzehl, T., Schmitt, A., Metze, F., & Wagner, M. (2011). Anger recognition in speech using acoustic and linguistic cues. Speech Communication, 53(9), 1198–1209.
Přibil, J., & Přibilová, A. (2013). Evaluation of influence of spectral and prosodic features on GMM classification of Czech and Slovak emotional speech. EURASIP Journal on Audio, Speech, and Music Processing, 2013(1), 8.
Rao, K. S., Koolagudi, S. G., & Vempada, R. R. (2013). Emotion recognition from speech using global and local prosodic features. International Journal of Speech Technology, 16(2), 143–160.
Rao, K. S., Kumar, T. P., Anusha, K., Leela, B., Bhavana, I., & Gowtham, S. V. S. K. (2012). Emotion recognition from speech. International Journal of Computer Science and Information Technologies, 3(2), 3603–3607.
Rehmam, B., Halim, Z., Abbas, G., & Muhammad, T. (2015). Artificial neural network-based speech recognition using DWT analysis applied on isolated words from oriental languages. Malaysian Journal of Computer Science, 28(3), 242–262.
Ringeval, F., & Chetouani, M. (2008). Exploiting a vowel based approach for acted emotion recognition. In Verbal and nonverbal features of human-human and human-machine interaction (pp. 243–254).
Rodríguez, P. H., Hernández, J. B. A., Ballester, M. A. F., González, C. M. T., & Orozco-Arroyave, J. R. (2013). Global selection of features for nonlinear dynamics characterization of emotional speech. Cognitive Computation, 5(4), 517–525.
Rong, J., Li, G., & Chen, Y. P. P. (2009). Acoustic feature selection for automatic emotion recognition from speech. Information Processing & Management, 45(3), 315–328.
Sagha, H., Deng, J., Gavryukova, M., Han, J., & Schuller, B. (2016). Cross lingual speech emotion recognition using canonical correlation analysis on principal component subspace. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 5800–5804). Piscataway: IEEE.
Sánchez-Gutiérrez, M. E., Albornoz, E. M., Martinez-Licona, F., Rufiner, H. L., & Goddard, J. (2014). Deep learning for emotional speech recognition. In Mexican conference on pattern recognition (pp. 311–320). Cham: Springer International Publishing.
Scherer, S., Schwenker, F., & Palm, G. (2008). Emotion recognition from speech using multi-classifier systems and rbf-ensembles. In Speech, audio, image and biomedical signal processing using neural networks (pp. 49–70). Berlin, Heidelberg: Springer.
Scherer, S., Schwenker, F., & Palm, G. (2009). Classifier fusion for emotion recognition from speech. In Advanced intelligent environments (pp. 95–117). Springer US.
Schmitt, M., Ringeval, F., & Schuller, B. W. (2016). At the border of acoustics and linguistics: bag-of-audio-words for the recognition of emotions in speech. In INTERSPEECH (pp. 495–499).
Schuller, B., Seppi, D., Batliner, A., Maier, A., & Steidl, S. (2007). Towards more reality in the recognition of emotional speech. In 2007 IEEE international conference on acoustics, speech and signal processing-ICASSP’07 (Vol. 4, pp. IV–941). Piscataway: IEEE.
Schuller, B., Vlasenko, B., Eyben, F., Wollmer, M., Stuhlsatz, A., Wendemuth, A., & Rigoll, G. (2010). Cross-corpus acoustic emotion recognition: Variances and strategies. IEEE Transactions on Affective Computing, 1(2), 119–131.
Schuller, B., Vlasenko, B., Minguez, R., Rigoll, G., & Wendemuth, A. (2007). Comparing one and two-stage acoustic modeling in the recognition of emotion in speech. In IEEE workshop on automatic speech recognition & understanding, 2007. ASRU (pp. 596–600). Piscataway: IEEE.
Schuller, B. W. (2008). Speaker, noise, and acoustic space adaptation for emotion recognition in the automotive environment. In 2008 ITG conference on voice communication (SprachKommunikation) (pp. 1–4). VDE.
Schwenker, F., Scherer, S., Magdi, Y. M., & Palm, G. (2009). The GMM-SVM supervector approach for the recognition of the emotional status from speech. In International conference on artificial neural networks (pp. 894–903). Berlin, Heidelberg: Springer.
Sedaaghi, M. H., Kotropoulos, C., & Ververidis, D. (2007). Using adaptive genetic algorithms to improve speech emotion recognition. In IEEE 9th workshop on multimedia signal processing, 2007. MMSP 2007 (pp. 461–464). Piscataway: IEEE.
Seehapoch, T., & Wongthanavasu, S. (2013). Speech emotion recognition using support vector machines. In 2013 5th international conference on knowledge and smart technology (KST) (pp. 86–91). Piscataway: IEEE.
Ser, W., Cen, L., & Yu, Z. L. (2008). A hybrid PNN-GMM classification scheme for speech emotion recognition. In 19th international conference on pattern recognition, 2008. ICPR 2008 (pp. 1–4). Piscataway: IEEE.
Sethu, V., Ambikairajah, E., & Epps, J. (2007). Speaker normalisation for speech-based emotion detection. In 2007 15th international conference on digital signal processing (pp. 611–614). Piscataway: IEEE.
Sethu, V., Ambikairajah, E., & Epps, J. (2008a). Phonetic and speaker variations in automatic emotion classification. In 9th annual conference of the international speech communication association.
Sethu, V., Ambikairajah, E., & Epps, J. (2008b). Empirical mode decomposition based weighted frequency feature for speech-based emotion classification. In IEEE international conference on acoustics, speech and signal processing, 2008. ICASSP 2008 (pp. 5017–5020). Piscataway: IEEE.
Sethu, V., Ambikairajah, E., & Epps, J. (2009). Speaker dependency of spectral features and speech production cues for automatic emotion classification. In IEEE international conference on acoustics, speech and signal processing, 2009. ICASSP 2009 (pp. 4693–4696). Piscataway: IEEE.
Sethu, V., Ambikairajah, E., & Epps, J. (2013). On the use of speech parameter contours for emotion recognition. EURASIP Journal on Audio, Speech, and Music Processing, 2013(1), 19.
Shah, F. (2009). Automatic emotion recognition from speech using artificial neural networks with gender-dependent databases. In International conference on advances in computing, control, & telecommunication technologies, 2009. ACT’09 (pp. 162–164). Piscataway: IEEE.
Shah, M., Miao, L., Chakrabarti, C., & Spanias, A. (2013). A speech emotion recognition framework based on latent Dirichlet allocation: Algorithm and FPGA implementation. In 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 2553–2557). Piscataway: IEEE.
Shami, M., & Verhelst, W. (2007). An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Communication, 49(3), 201–212.
Shaukat, A., & Chen, K. (2011). Emotional state recognition from speech via soft-competition on different acoustic representations. In The 2011 international joint conference on neural networks (IJCNN) (pp. 1910–1917). Piscataway: IEEE.
Shaw, A., Vardhan, R. K., & Saxena, S. (2016). Emotion recognition and classification in speech using artificial neural networks. International Journal of Computer Applications, 145(8).
Sheikhan, M., Bejani, M., & Gharavian, D. (2013). Modular neural-SVM scheme for speech emotion recognition using ANOVA feature selection method. Neural Computing and Applications, 23(1), 215–227.
Sheikhan, M., Gharavian, D., & Ashoftedel, F. (2012). Using DTW neural-based MFCC warping to improve emotional speech recognition. Neural Computing and Applications, 21(7), 1765–1773.
Shen, P., Changjun, Z., & Chen, X. (2011). Automatic speech emotion recognition using support vector machine. In 2011 international conference on electronic and mechanical engineering and information technology (EMEIT) (Vol. 2, pp. 621–625). Piscataway: IEEE.
Sidorov, M., Ultes, S., & Schmitt, A. (2014). Emotions are a personal thing: Towards speaker-adaptive emotion recognition. In 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 4803–4807). Piscataway: IEEE.
Soltani, K., & Ainon, R. N. (2007). Speech emotion detection based on neural networks. In 9th international symposium on signal processing and its applications, 2007. ISSPA 2007 (pp. 1–3). Piscataway: IEEE.
Song, P., Ou, S., Zheng, W., Jin, Y., & Zhao, L. (2016). Speech emotion recognition using transfer non-negative matrix factorization. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 5180–5184). Piscataway: IEEE.
Song, P., Zheng, W., Ou, S., Zhang, X., Jin, Y., Liu, J., & Yu, Y. (2016). Cross-corpus speech emotion recognition based on transfer non-negative matrix factorization. Speech Communication, 83, 34–41.
Steidl, S., Batliner, A., Nöth, E., & Hornegger, J. (2008). Quantification of segmentation and F0 errors and their effect on emotion recognition. In Text, speech and dialogue (pp. 525–534). Berlin, Heidelberg: Springer.
Sun, Y., & Wen, G. (2015). Emotion recognition using semi-supervised feature selection with speaker normalization. International Journal of Speech Technology, 18(3), 317–331.
Sun, Y., Wen, G., & Wang, J. (2015). Weighted spectral features based on local Hu moments for speech emotion recognition. Biomedical Signal Processing and Control, 18, 80–90.
Sun, Y., Zhou, Y., Zhao, Q., & Yan, Y. (2009). Acoustic feature optimization for emotion affected speech recognition. In International conference on information engineering and computer science, 2009. ICIECS 2009 (pp. 1–4). Piscataway: IEEE.
Swain, M., Sahoo, S., Routray, A., Kabisatpathy, P., & Kundu, J. N. (2015). Study of feature combination using HMM and SVM for multilingual Odiya speech emotion recognition. International Journal of Speech Technology, 18(3), 387–393.
Sztahó, D., Imre, V., & Vicsi, K. (2011). Automatic classification of emotions in spontaneous speech. In Analysis of verbal and nonverbal communication and enactment: The processing issues (pp. 229–239).
Tabatabaei, T. S., Krishnan, S., & Guergachi, A. (2007). Emotion recognition using novel speech signal features. In IEEE international symposium on circuits and systems, 2007. ISCAS 2007 (pp. 345–348). Piscataway: IEEE.
Tahon, M., & Devillers, L. (2015). Towards a small set of robust acoustic features for emotion recognition: Challenges. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(1), 16–28.
Tamulevicius, G., & Liogiene, T. (2015). Low-order multi-level features for speech emotions recognition. Baltic Journal of Modern Computing, 3(4), 234–247.
Tarasov, A., & Delany, S. J. (2011). Benchmarking classification models for emotion recognition in natural speech: A multi-corporal study. In 2011 IEEE international conference on automatic face & gesture recognition and workshops (FG 2011) (pp. 841–846). Piscataway: IEEE.
Ten Bosch, L. (2003). Emotions, speech and the ASR framework. Speech Communication, 40(1), 213–225.
Thapliyal, N., & Amoli, G. (2012). Speech based emotion recognition with Gaussian mixture model. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(5), 65.
Trigeorgis, G., Ringeval, F., Brueckner, R., Marchi, E., Nicolaou, M. A., Schuller, B., & Zafeiriou, S. (2016). Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP) (pp. 5200–5204). Piscataway: IEEE.
Truong, K., & Van Leeuwen, D. (2007). An ‘open-set’ detection evaluation methodology for automatic emotion recognition in speech. In Workshop on paralinguistic speech – between models and data (pp. 5–10).
Tseng, M., Hu, Y., Han, W. W., & Bergen, B. (2005). “Searching for happiness” or “Full of joy”? Source domain activation matters. In Annual meeting of the Berkeley Linguistics Society (Vol. 31, No. 1, pp. 359–370).
Utane, A. S., & Nalbalwar, S. L. (2013). Emotion recognition through speech using Gaussian mixture model and support vector machine. Emotion, 2, 8.
Ververidis, D., & Kotropoulos, C. (2006). Emotional speech recognition: Resources, features, and methods. Speech Communication, 48(9), 1162–1181.
Vlasenko, B., Philippou-Hübner, D., Prylipko, D., Böck, R., Siegert, I., & Wendemuth, A. (2011a). Vowels formants analysis allows straightforward detection of high arousal emotions. In 2011 IEEE international conference on multimedia and expo (ICME) (pp. 1–6). Piscataway: IEEE.
Vlasenko, B., Prylipko, D., Philippou-Hübner, D., & Wendemuth, A. (2011b). Vowels formants analysis allows straightforward detection of high arousal acted and spontaneous emotions. In 12th annual conference of the international speech communication association.
Vlasenko, B., Schuller, B., Wendemuth, A., & Rigoll, G. (2007). Frame vs. turn-level: Emotion recognition from speech considering static and dynamic processing. In Proceedings 2nd international conference on affective computing and intelligent interaction (pp. 139–147).
Vogt, T., & André, E. (2006). Improving automatic emotion recognition from speech via gender differentiation. In Proceedings of the language resources and evaluation conference (LREC 2006), Genoa.
Vogt, T., & André, E. (2009). Exploring the benefits of discretization of acoustic features for speech emotion recognition. In 10th annual conference of the international speech communication association.
Vogt, T., & André, E. (2011). An evaluation of emotion units and feature types for real-time speech emotion recognition. KI-Künstliche Intelligenz, 25(3), 213–223.
Vondra, M., & Vích, R. (2009). Evaluation of speech emotion classification based on GMM and data fusion. In Cross-modal analysis of speech, gestures, gaze and facial expressions (pp. 98–105).
Wagner, J., Vogt, T., & André, E. (2007). A systematic comparison of different HMM designs for emotion recognition from acted and spontaneous speech. In International conference on affective computing and intelligent interaction (pp. 114–125). Berlin, Heidelberg: Springer.
Wang, F., Verhelst, W., & Sahli, H. (2011). Relevance vector machine based speech emotion recognition. In Affective computing and intelligent interaction (pp. 111–120).
Weninger, F., Ringeval, F., Marchi, E., & Schuller, B. W. (2016). Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio. In IJCAI (pp. 2196–2202).
Wenjing, H., Haifeng, L., & Chunyu, G. (2009). A hybrid speech emotion perception method of VQ-based feature processing and ANN recognition. In WRI global congress on intelligent systems, 2009. GCIS’09 (Vol. 2, pp. 145–149). Piscataway: IEEE.
Womack, B. D., & Hansen, J. H. (1999). N-channel hidden Markov models for combined stressed speech classification and recognition. IEEE Transactions on Speech and Audio Processing, 7(6), 668–677.
Wu, C. H., & Liang, W. B. (2011). Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Transactions on Affective Computing, 2, 10–21.
Wu, S., Falk, T. H., & Chan, W. Y. (2009). Automatic recognition of speech emotion using long-term spectro-temporal features. In 2009 16th international conference on digital signal processing (pp. 1–6). Piscataway: IEEE.
Wu, S., Falk, T. H., & Chan, W. Y. (2011). Automatic speech emotion recognition using modulation spectral features. Speech Communication, 53(5), 768–785.
Wu, T., Yang, Y., Wu, Z., & Li, D. (2006). MASC: A speech corpus in Mandarin for emotion analysis and affective speaker recognition. In Speaker and language recognition workshop, 2006. IEEE Odyssey 2006 (pp. 1–5). Piscataway: IEEE.
Xiao, Z., Dellandréa, E., Chen, L., & Dou, W. (2009). Recognition of emotions in speech by a hierarchical approach. In 3rd international conference on affective computing and intelligent interaction and workshops, 2009. ACII 2009 (pp. 1–8). Piscataway: IEEE.
Xiao, Z., Dellandrea, E., Dou, W., & Chen, L. (2006). Two-stage classification of emotional speech. In International conference on digital telecommunications, 2006. ICDT’06 (pp. 32–32). Piscataway: IEEE.
Xiao, Z., Dellandrea, E., Dou, W., & Chen, L. (2007, December). Automatic hierarchical classification of emotional speech. In 9th IEEE international symposium on multimedia workshops, 2007. ISMW’07 (pp. 291–296). Piscataway: IEEE.
Xiao, Z., Dellandrea, E., Dou, W., & Chen, L. (2007). Hierarchical classification of emotional speech. IEEE Transactions on Multimedia, 37.
Yang, B., & Lugger, M. (2010). Emotion recognition from speech signals using new harmony features. Signal Processing, 90(5), 1415–1423.
Yang, N., Muraleedharan, R., Kohl, J., Demirkol, I., Heinzelman, W., & Sturge-Apple, M. (2012). Speech-based emotion classification using multiclass SVM with hybrid kernel and thresholding fusion. In Spoken language technology workshop (SLT), 2012 IEEE (pp. 455–460). Piscataway: IEEE.
Ye, C., Liu, J., Chen, C., Song, M., & Bu, J. (2008). Speech emotion classification on a Riemannian manifold. In Advances in multimedia information processing–PCM 2008 (pp. 61–69).
Yeh, J. H., Pao, T. L., Lin, C. Y., Tsai, Y. W., & Chen, Y. T. (2011). Segment-based emotion recognition from continuous Mandarin Chinese speech. Computers in Human Behavior, 27(5), 1545–1552.
You, M., Chen, C., Bu, J., Liu, J., & Tao, J. (2006). A hierarchical framework for speech emotion recognition. In 2006 IEEE international symposium on industrial electronics (Vol. 1, pp. 515–519). Piscataway: IEEE.
You, M., Chen, C., Bu, J., Liu, J., & Tao, J. (2006). Emotional speech analysis on nonlinear manifold. In 18th international conference on pattern recognition, 2006. ICPR 2006 (Vol. 3, pp. 91–94). Piscataway: IEEE.
Yun, S., & Yoo, C. D. (2012). Loss-scaled large-margin Gaussian mixture models for speech emotion classification. IEEE Transactions on Audio, Speech, and Language Processing, 20(2), 585–598.
Yüncü, E., Hacihabiboglu, H., & Bozsahin, C. (2014). Automatic speech emotion recognition using auditory models with binary decision tree and SVM. In 2014 22nd international conference on pattern recognition (ICPR) (pp. 773–778). Piscataway: IEEE.
Zbancioc, M., & Feraru, S. M. (2012). Emotion recognition of the SROL Romanian database using fuzzy KNN algorithm. In 10th international symposium on electronics and telecommunications (ISETC), 2012 (pp. 347–350). Piscataway: IEEE.
Zeng, Z., Pantic, M., Roisman, G. I., & Huang, T. S. (2009). A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 39–58.
Zha, C., Yang, P., Zhang, X., & Zhao, L. (2016). Spontaneous speech emotion recognition via multiple kernel learning. In 2016 eighth international conference on measuring technology and mechatronics automation (ICMTMA) (pp. 621–623). Piscataway: IEEE.
Zurück zum Zitat Zhang, S., Lei, B., Chen, A., Chen, C., & Chen, Y. (2010). Spoken emotion recognition using local fisher discriminant analysis. In 10th international conference on signal processing (ICSP), 2010 IEEE (pp. 538–540). Piscataway: IEEE. Zhang, S., Lei, B., Chen, A., Chen, C., & Chen, Y. (2010). Spoken emotion recognition using local fisher discriminant analysis. In 10th international conference on signal processing (ICSP), 2010 IEEE (pp. 538–540). Piscataway: IEEE.
Zurück zum Zitat Zhang, S., & Zhao, Z. (2008). Feature selection filtering methods for emotion recognition in Chinese speech signal. In 9th international conference on signal processing, 2008. ICSP 2008. (pp. 1699–1702). Piscataway: IEEE. Zhang, S., & Zhao, Z. (2008). Feature selection filtering methods for emotion recognition in Chinese speech signal. In 9th international conference on signal processing, 2008. ICSP 2008. (pp. 1699–1702). Piscataway: IEEE.
Zurück zum Zitat Zheng, W. Q., Yu, J. S., & Zou, Y. X. (2015). An experimental study of speech emotion recognition based on deep convolutional neural networks. In 2015 international conference on affective computing and intelligent interaction (ACII) (pp. 827–831). Piscataway: IEEE. Zheng, W. Q., Yu, J. S., & Zou, Y. X. (2015). An experimental study of speech emotion recognition based on deep convolutional neural networks. In 2015 international conference on affective computing and intelligent interaction (ACII) (pp. 827–831). Piscataway: IEEE.
Zurück zum Zitat Zhou, J., Wang, G., Yang, Y., & Chen, P. (2006). Speech emotion recognition based on rough set and SVM. In 5th IEEE international conference on cognitive informatics, 2006. ICCI 2006. (Vol. 1, pp. 53–61). Piscataway: IEEE. Zhou, J., Wang, G., Yang, Y., & Chen, P. (2006). Speech emotion recognition based on rough set and SVM. In 5th IEEE international conference on cognitive informatics, 2006. ICCI 2006. (Vol. 1, pp. 53–61). Piscataway: IEEE.
Zurück zum Zitat Zhou, Y., Sun, Y., Yang, L., & Yan, Y. (2009). Applying articulatory features to speech emotion recognition. In international conference on research challenges in computer science, 2009. ICRCCS’09. (pp. 73–76). Piscataway: IEEE. Zhou, Y., Sun, Y., Yang, L., & Yan, Y. (2009). Applying articulatory features to speech emotion recognition. In international conference on research challenges in computer science, 2009. ICRCCS09. (pp. 73–76). Piscataway: IEEE.
Zurück zum Zitat Zhu, L., Chen, L., Zhao, D., Zhou, J., & Zhang, W. (2017). Emotion recognition from chinese speech for smart affective services using a combination of SVM and DBN. Sensors, 17(7), 1694.CrossRef Zhu, L., Chen, L., Zhao, D., Zhou, J., & Zhang, W. (2017). Emotion recognition from chinese speech for smart affective services using a combination of SVM and DBN. Sensors, 17(7), 1694.CrossRef
Zurück zum Zitat Zong, Y., Zheng, W., Zhang, T., & Huang, X. (2016). Cross-corpus speech emotion recognition based on domain-adaptive least-squares regression. IEEE Signal Processing Letters, 23(5), 585–589.CrossRef Zong, Y., Zheng, W., Zhang, T., & Huang, X. (2016). Cross-corpus speech emotion recognition based on domain-adaptive least-squares regression. IEEE Signal Processing Letters, 23(5), 585–589.CrossRef
Metadata
Title
Speech emotion recognition research: an analysis of research focus
Authors
Mumtaz Begum Mustafa
Mansoor A. M. Yusoof
Zuraidah M. Don
Mehdi Malekzadeh
Publication date
22.01.2018
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 1/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-018-9493-x
