
2016 | OriginalPaper | Chapter

4. Real-time Incremental Processing


Abstract

The features and the modelling methods used in this thesis have been selected with the goal of on-line processing in mind; however, most of them are general methods that are suitable for both on-line and off-line processing. This chapter deals specifically with the issues encountered in on-line (i.e., incremental) processing, such as segmentation, constraints on feature extraction, and complexity and run-time constraints.
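A minimal sketch of what incremental processing implies for feature extraction, assuming nothing about the openSMILE API: audio arrives in small blocks, and overlapping frames are emitted as soon as enough samples have been buffered, rather than only after the whole recording has been read. All names here (`IncrementalFrameExtractor`, `push`) are hypothetical illustrations.

```python
class IncrementalFrameExtractor:
    """Emit overlapping frames incrementally as audio blocks arrive.

    Hypothetical illustration, not the openSMILE API: frames of
    `frame_size` samples are produced every `hop_size` samples as soon
    as enough data has been buffered, so downstream features can be
    computed on-line with bounded latency.
    """

    def __init__(self, frame_size=400, hop_size=160):  # 25 ms / 10 ms at 16 kHz
        self.frame_size = frame_size
        self.hop_size = hop_size
        self._buf = []  # samples received but not yet fully consumed

    def push(self, samples):
        """Feed one block of samples; return all frames now complete."""
        self._buf.extend(samples)
        frames = []
        while len(self._buf) >= self.frame_size:
            frames.append(self._buf[:self.frame_size])
            del self._buf[:self.hop_size]  # advance by the frame shift
        return frames


# A 30 ms block at 16 kHz already completes the first 25 ms frame:
ex = IncrementalFrameExtractor(frame_size=400, hop_size=160)
print(len(ex.push([0.0] * 480)))  # 1 frame ready, 320 samples still buffered
```

The key property for on-line operation is that `push` returns immediately with whatever frames are complete, instead of waiting for the end of the input.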


Footnotes
1
In openSMILE this behaviour is implemented in the cTurnDetector component.
 
2
Incremental segmentation of music into beats and bars is not part of this thesis. An on-line segmentation approach for music has not been investigated; however, an off-line segmentation method based on the beat tracker presented by Schuller et al. (2007b) and Eyben et al. (2007) has been used.
 
4
According to Google scholar citations.
 
7
Also referred to as tick-loop in the code.
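The tick-loop named here can be sketched roughly as follows; this is an illustrative Python sketch of the general pattern, not the actual openSMILE C++ implementation, and `CountingSource` is a hypothetical stand-in component: each component exposes a `tick()` method that returns whether it did any work, and the loop keeps making passes over all components until a full pass yields no progress.

```python
def run_tick_loop(components):
    """Rough sketch of a tick loop (illustrative, not the openSMILE code).

    Each component's tick() consumes or produces whatever data is
    currently available and returns True if it made progress.
    Processing ends when one full pass over all components makes no
    progress at all, i.e. the input is exhausted and buffers drained.
    """
    passes = 0
    while True:
        passes += 1
        progress = False
        for comp in components:
            if comp.tick():
                progress = True
        if not progress:
            return passes


class CountingSource:
    """Hypothetical component that delivers `n` blocks, then runs dry."""

    def __init__(self, n):
        self.remaining = n

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1
            return True
        return False


print(run_tick_loop([CountingSource(3)]))  # 4 passes: 3 productive + 1 idle
```

A scheme like this lets components with different frame rates coexist in one loop: each simply does as much work as its input buffers allow on every pass.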
 
8
In practice limited by the range of the ‘long’ data-type (32-bit or 64-bit).
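To put this limit in concrete terms: a signed 32-bit sample counter overflows after roughly half a day of audio at a common rate, whereas a 64-bit counter is unlimited for any practical purpose. The 48 kHz rate below is an assumed example, not a value from the text.

```python
# How long a monotonically increasing sample index lasts before it
# overflows, assuming (for illustration) a 48 kHz sample rate.
SAMPLE_RATE = 48000

max_32 = 2**31 - 1                      # largest signed 32-bit value
hours_32 = max_32 / SAMPLE_RATE / 3600  # ~12.4 hours of audio

max_64 = 2**63 - 1                      # largest signed 64-bit value
years_64 = max_64 / SAMPLE_RATE / 3600 / 24 / 365.25  # millions of years

print(round(hours_32, 1))  # 12.4
```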
 
Literature
A. Batliner, D. Seppi, S. Steidl, B. Schuller, Segmenting into adequate units for automatic recognition of emotion-related episodes: a speech-based approach. Adv. Human Comput. Interact., Special Issue on Emotion-Aware Natural Interaction 2010, 1–15 (2010). Article ID 782802 (on-line)
M. Ben-Ari, Principles of Concurrent and Distributed Programming (Prentice Hall, Englewood Cliffs, 1990). ISBN 0-13-711821-X
C. Busso, S. Lee, S. Narayanan, Using neutral speech models for emotional speech analysis, in Proceedings of the INTERSPEECH 2007, Antwerp, Belgium, August 2007. ISCA, pp. 2225–2228
G. Caridakis, L. Malatesta, L. Kessous, N. Amir, A. Raouzaiou, K. Karpouzis, Modeling naturalistic affective states via facial and vocal expressions recognition, in Proceedings of the 8th International Conference on Multimodal Interfaces (ICMI) 2006, Banff, Canada, 2006. ACM, pp. 146–154
J. Cohen, P. Cohen, S.G. West, L.S. Aiken, Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences, 2nd edn. (Lawrence Erlbaum Associates, Hillsdale, 2003)
J. Deng, B. Schuller, Confidence measures in speech emotion recognition based on semi-supervised learning, in Proceedings of INTERSPEECH 2012, Portland, September 2012. ISCA
J. Deng, W. Han, B. Schuller, Confidence measures for speech emotion recognition: a start, in Proceedings of the 10th ITG Symposium on Speech Communication, ed. by T. Fingscheidt, W. Kellermann (Braunschweig, Germany, September 2012). IEEE, pp. 1–4
E. Douglas-Cowie, R. Cowie, I. Sneddon, C. Cox, O. Lowry, M. McRorie, J.C. Martin, L. Devillers, S. Abrilian, A. Batliner, N. Amir, K. Karpouzis, The HUMAINE Database, vol. 4738, Lecture Notes in Computer Science (Springer, Berlin, 2007), pp. 488–500
P. Ekman, W.V. Friesen, Unmasking the Face: A Guide to Recognizing Emotions from Facial Expressions (Prentice Hall, Englewood Cliffs, 1975)
F. Eyben, B. Schuller, S. Reiter, G. Rigoll, Wearable assistance for the ballroom-dance hobbyist – holistic rhythm analysis and dance-style classification, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME) 2007, Beijing, China, July 2007. IEEE, pp. 92–95
F. Eyben, M. Wöllmer, B. Schuller, openSMILE – the Munich versatile and fast open-source audio feature extractor, in Proceedings of ACM Multimedia 2010, Florence, Italy, 2010a. ACM, pp. 1459–1462
F. Eyben, M. Wöllmer, A. Graves, B. Schuller, E. Douglas-Cowie, R. Cowie, On-line emotion recognition in a 3-D activation-valence-time continuum using acoustic and linguistic cues. J. Multimodal User Interfaces (JMUI) 3(1–2), 7–19 (2010d). doi:10.1007/s12193-009-0032-6
F. Eyben, M. Wöllmer, M. Valstar, H. Gunes, B. Schuller, M. Pantic, String-based audiovisual fusion of behavioural events for the assessment of dimensional affect, in Proceedings of the International Workshop on Emotion Synthesis, Representation, and Analysis in Continuous Space (EmoSPACE) 2011, held in conjunction with FG 2011, Santa Barbara, March 2011. IEEE, pp. 322–329
F. Eyben, M. Wöllmer, B. Schuller, A multi-task approach to continuous five-dimensional affect sensing in natural speech. ACM Trans. Interact. Intell. Syst., Special Issue on Affective Interaction in Natural Environments 2(1), 29 (2012). Article No. 6
F. Eyben, F. Weninger, F. Gross, B. Schuller, Recent developments in openSMILE, the Munich open-source multimedia feature extractor, in Proceedings of ACM Multimedia 2013, Barcelona, Spain, 2013a. ACM, pp. 835–838
S. Fernandez, A. Graves, J. Schmidhuber, Phoneme recognition in TIMIT with BLSTM-CTC, Technical report, IDSIA, Switzerland, 2008
J.R.J. Fontaine, K.R. Scherer, E.B. Roesch, P.C. Ellsworth, The world of emotions is not two-dimensional. Psychol. Sci. 18(2), 1050–1057 (2007)
N. Fragopanagos, J.G. Taylor, Emotion recognition in human-computer interaction. Neural Netw., 2005 Special Issue on Emotion and Brain 18(4), 389–405 (2005)
D. Glowinski, A. Camurri, G. Volpe, N. Dael, K. Scherer, Technique for automatic emotion recognition by body gesture analysis, in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops 2008 (CVPRW’08), Anchorage, June 2008. IEEE, pp. 1–6
A. Graves, J. Schmidhuber, Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 18(5–6), 602–610 (2005)
M. Grimm, K. Kroschel, S. Narayanan, Support vector regression for automatic recognition of spontaneous emotions in speech, in Proceedings of the ICASSP 2007, vol. 4, Honolulu, April 2007a. IEEE, pp. 1085–1088
M. Grimm, E. Mower, K. Kroschel, S. Narayanan, Primitives based estimation and evaluation of emotions in speech. Speech Commun. 49, 787–800 (2007b)
H. Gunes, M. Pantic, Dimensional emotion prediction from spontaneous head gestures for interaction with sensitive artificial listeners, in Proceedings of the International Conference on Intelligent Virtual Agents (IVA) (Springer, Berlin, 2010a), pp. 371–377. ISBN 978-3-642-15891-9
H. Gunes, M. Pantic, Automatic, dimensional and continuous emotion recognition. Int. J. Synth. Emot. (IJSE) 1(1), 68–99 (2010b)
M.A. Hall, Correlation-based Feature Subset Selection for Machine Learning, Doctoral thesis, University of Waikato, Hamilton, New Zealand, 1998
W. Han, H. Li, H. Ruan, L. Ma, J. Sun, B. Schuller, Active learning for dimensional speech emotion recognition, in Proceedings of INTERSPEECH 2013, Lyon, France, August 2013. ISCA, pp. 2856–2859
S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
S. Ioannou, A. Raouzaiou, V. Tzouvaras, T. Mailis, K. Karpouzis, S. Kollias, Emotion recognition through facial expression analysis based on a neurofuzzy method. Neural Netw., 2005 Special Issue on Emotion and Brain 18(4), 423–435 (2005)
C.-C. Lee, C. Busso, S. Lee, S.S. Narayanan, Modeling mutual influence of interlocutor emotion states in dyadic spoken interactions, in Proceedings of INTERSPEECH 2009, Brighton, UK, September 2009. ISCA, pp. 1983–1986
G. McKeown, M. Valstar, R. Cowie, M. Pantic, M. Schroder, The SEMAINE database: annotated multimodal records of emotionally colored conversations between a person and a limited agent. IEEE Trans. Affect. Comput. 3(1), 5–17 (2012). doi:10.1109/T-AFFC.2011.20. ISSN 1949-3045
E. Mower, S.S. Narayanan, A hierarchical static-dynamic framework for emotion classification, in Proceedings of the ICASSP 2011, Prague, Czech Republic, May 2011. IEEE, pp. 2372–2375
M. Nicolaou, H. Gunes, M. Pantic, Audio-visual classification and fusion of spontaneous affective data in likelihood space, in Proceedings of the 20th International Conference on Pattern Recognition (ICPR) 2010, Istanbul, Turkey, August 2010. IEEE, pp. 3695–3699
V. Parsa, D. Jamieson, Acoustic discrimination of pathological voice: sustained vowels versus continuous speech. J. Speech Lang. Hear. Res. 44, 327–339 (2001)
C. Peters, C. O’Sullivan, Synthetic vision and memory for autonomous virtual humans. Comput. Graph. Forum 21(4), 743–753 (2002)
M. Riedmiller, H. Braun, A direct adaptive method for faster backpropagation learning: the RPROP algorithm, in Proceedings of the IEEE International Conference on Neural Networks, vol. 1 (San Francisco, 1993). IEEE, pp. 586–591. doi:10.1109/icnn.1993.298623
E.M. Schmidt, Y.E. Kim, Prediction of time-varying musical mood distributions from audio, in Proceedings of ISMIR 2010, Utrecht, The Netherlands, 2010. ISMIR
M. Schröder, E. Bevacqua, R. Cowie, F. Eyben, H. Gunes, D. Heylen, M. ter Maat, G. McKeown, S. Pammi, M. Pantic, C. Pelachaud, B. Schuller, E. de Sevin, M. Valstar, M. Wöllmer, Building autonomous sensitive artificial listeners. IEEE Trans. Affect. Comput. 3(2), 165–183 (2012)
B. Schuller, G. Rigoll, Timing levels in segment-based speech emotion recognition, in Proceedings of the INTERSPEECH-ICSLP 2006, Pittsburgh, September 2006. ISCA, pp. 1818–1821
B. Schuller, B. Vlasenko, R. Minguez, G. Rigoll, A. Wendemuth, Comparing one and two-stage acoustic modeling in the recognition of emotion in speech, in Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2007, Kyoto, Japan, 2007a. IEEE, pp. 596–600
B. Schuller, F. Eyben, G. Rigoll, Fast and robust meter and tempo recognition for the automatic discrimination of ballroom dance styles, in Proceedings of the ICASSP 2007, vol. I, Honolulu, April 2007b. IEEE, pp. 217–220
B. Schuller, D. Seppi, A. Batliner, A. Maier, S. Steidl, Towards more reality in the recognition of emotional speech, in Proceedings of the ICASSP 2007, vol. IV, Honolulu, 2007c. IEEE, pp. 941–944
B. Schuller, F. Eyben, G. Rigoll, Beat-synchronous data-driven automatic chord labeling, in Proceedings of the 34. Jahrestagung für Akustik (DAGA) 2008, Dresden, Germany, March 2008. DEGA, pp. 555–556
B. Schuller, R. Müller, F. Eyben, J. Gast, B. Hörnler, M. Wöllmer, G. Rigoll, A. Höthker, H. Konosu, Being bored? Recognising natural interest by extensive audiovisual integration for real-life application. Image Vis. Comput., Special Issue on Visual and Multimodal Analysis of Human Spontaneous Behavior 27(12), 1760–1774 (2009a)
B. Schuller, S. Steidl, A. Batliner, F. Jurcicek, The INTERSPEECH 2009 emotion challenge, in Proceedings of INTERSPEECH 2009, Brighton, UK, September 2009b. ISCA, pp. 312–315
B. Schuller, B. Vlasenko, F. Eyben, G. Rigoll, A. Wendemuth, Acoustic emotion recognition: a benchmark comparison of performances, in Proceedings of the IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2009, Merano, Italy, December 2009c. IEEE, pp. 552–557
B. Schuller, S. Steidl, A. Batliner, F. Burkhardt, L. Devillers, C. Müller, S. Narayanan, The INTERSPEECH 2010 paralinguistic challenge, in Proceedings of INTERSPEECH 2010, Makuhari, Japan, September 2010. ISCA, pp. 2794–2797
B. Schuller, M. Valstar, F. Eyben, G. McKeown, R. Cowie, M. Pantic, AVEC 2011 – the first international audio/visual emotion challenge, in Proceedings of the First International Audio/Visual Emotion Challenge and Workshop, AVEC 2011, held in conjunction with the International HUMAINE Association Conference on Affective Computing and Intelligent Interaction (ACII) 2011, vol. II, ed. by B. Schuller, M. Valstar, R. Cowie, M. Pantic (Springer, Memphis, 2011a), pp. 415–424
B. Schuller, A. Batliner, S. Steidl, F. Schiel, J. Krajewski, The INTERSPEECH 2011 speaker state challenge, in Proceedings of INTERSPEECH 2011, Florence, Italy, August 2011b. ISCA, pp. 3201–3204
B. Schuller, A. Batliner, S. Steidl, D. Seppi, Recognising realistic emotions and affect in speech: state of the art and lessons learnt from the first challenge. Speech Commun., Special Issue on Sensing Emotion and Affect – Facing Realism in Speech Processing 53(9/10), 1062–1087 (2011c)
B. Schuller, M. Valstar, R. Cowie, M. Pantic, AVEC 2012: the continuous audio/visual emotion challenge – an introduction, in Proceedings of the 14th ACM International Conference on Multimodal Interaction (ICMI) 2012, ed. by L.-P. Morency, D. Bohus, H.K. Aghajan, J. Cassell, A. Nijholt, J. Epps (ACM, Santa Monica, 2012a), pp. 361–362
B. Schuller, S. Steidl, A. Batliner, E. Nöth, A. Vinciarelli, F. Burkhardt, R. van Son, F. Weninger, F. Eyben, T. Bocklet, G. Mohammadi, B. Weiss, The INTERSPEECH 2012 speaker trait challenge, in Proceedings of INTERSPEECH 2012, Portland, OR, USA, September 2012b. ISCA
B. Schuller, S. Steidl, A. Batliner, A. Vinciarelli, K. Scherer, F. Ringeval, M. Chetouani, et al., The INTERSPEECH 2013 computational paralinguistics challenge: Social Signals, Conflict, Emotion, Autism, in Proceedings of INTERSPEECH 2013, Lyon, France, 2013. ISCA, pp. 148–152
S. Steidl, Automatic Classification of Emotion-Related User States in Spontaneous Children’s Speech (Logos Verlag, Berlin, 2009)
S. Steidl, B. Schuller, A. Batliner, D. Seppi, The hinterland of emotions: facing the open-microphone challenge, in Proceedings of the 4th International HUMAINE Association Conference on Affective Computing and Intelligent Interaction (ACII), vol. I, Amsterdam, The Netherlands, 2009. IEEE, pp. 690–697
P. Werbos, Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 1550–1560 (1990)
M. Wöllmer, F. Eyben, S. Reiter, B. Schuller, C. Cox, E. Douglas-Cowie, R. Cowie, Abandoning emotion classes – towards continuous emotion recognition with modelling of long-range dependencies, in Proceedings of the INTERSPEECH 2008, Brisbane, Australia, September 2008. ISCA, pp. 597–600
M. Wöllmer, F. Eyben, B. Schuller, E. Douglas-Cowie, R. Cowie, Data-driven clustering in emotional space for affect recognition using discriminatively trained LSTM networks, in Proceedings of INTERSPEECH 2009, Brighton, UK, September 2009. ISCA, pp. 1595–1598
M. Wöllmer, B. Schuller, F. Eyben, G. Rigoll, Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening. IEEE J. Sel. Top. Signal Process., Special Issue on “Speech Processing for Natural Interaction with Intelligent Environments” 4(5), 867–881 (2010)
D. Wu, T. Parsons, E. Mower, S.S. Narayanan, Speech emotion estimation in 3D space, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME) 2010, Singapore, July 2010a. IEEE, pp. 737–742
D. Wu, T. Parsons, S.S. Narayanan, Acoustic feature analysis in speech emotion primitives estimation, in Proceedings of the INTERSPEECH 2010, Makuhari, Japan, September 2010b. ISCA, pp. 785–788
P.V. Yee, S. Haykin, Regularized Radial Basis Function Networks: Theory and Applications (Wiley, New York, 2001), 208 p. ISBN 0-471-35349-3
Z. Zeng, M. Pantic, G.I. Roisman, T.S. Huang, A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans. Pattern Anal. Mach. Intell. 31(1), 39–58 (2009)
Z. Zhang, B. Schuller, Semi-supervised learning helps in sound event classification, in Proceedings of ICASSP 2012, Kyoto, March 2012. IEEE, pp. 333–336
Metadata
Title
Real-time Incremental Processing
Author
Florian Eyben
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-27299-3_4