2015 | OriginalPaper | Chapter

Design of Neural Network Model for Emotional Speech Recognition

Authors: H. K. Palo, Mihir Narayana Mohanty, Mahesh Chandra

Published in: Artificial Intelligence and Evolutionary Algorithms in Engineering Systems

Publisher: Springer India

Abstract

Human–computer interaction (HCI) still needs improvement in the areas of recognition and detection. Emotion recognition in particular has a major impact on social, engineering, and medical science applications. This paper presents a neural-network-based approach to recognizing emotion in speech. Linear predictive coefficients serve as the features, and a radial basis function network serves as the classifier. Speech utterances are extracted directly from the audio channel, including background noise. In total, 75 utterances from five speakers were collected across five emotion categories; fifteen utterances were used for training and the rest for testing. The results indicate that the approach is effective at recognizing emotions in human speech. The proposed approach has been tested and verified on this newly developed dataset.
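
The abstract names two building blocks, linear predictive coefficients as features and a radial basis function network as the classifier, without giving implementation details. The following is a minimal Python sketch of how such a pipeline could look; it is not the authors' implementation. Everything specific in it is an assumption made for illustration: the function and class names (`lpc`, `utterance_features`, `RBFNet`), the autocorrelation method with Levinson-Durbin recursion, the Hamming window, the frame length, hop, and model order, averaging frame-level coefficients into one vector per utterance, Gaussian basis functions with a single shared width, and least-squares output weights.

```python
import numpy as np


def lpc(frame, order):
    """Linear prediction via the autocorrelation method (Levinson-Durbin).

    Returns a[1..order] of the error filter A(z) = 1 + sum_j a[j] z^-j,
    so the predictor coefficients are the negatives of the returned values.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:]


def utterance_features(signal, order=10, frame_len=400, hop=160):
    """Average frame-level LPC vectors into one vector per utterance.

    Assumes the signal holds at least one full frame; the framing
    parameters are illustrative, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.mean([lpc(signal[s:s + frame_len] * window, order) for s in starts], axis=0)


class RBFNet:
    """Gaussian RBF hidden layer followed by a linear output layer."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)
        self.width = float(width)
        self.weights = None

    def _hidden(self, X):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, labels, n_classes):
        H = self._hidden(np.asarray(X, dtype=float))
        targets = np.eye(n_classes)[labels]  # one-hot class targets
        # Output weights by linear least squares.
        self.weights = np.linalg.lstsq(H, targets, rcond=None)[0]
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.weights
        return np.argmax(scores, axis=1)
```

In such a setup, each training utterance would be passed through `utterance_features`, the resulting vectors (or a subset of them) would serve as the RBF centers, and `RBFNet.fit` would be called with integer emotion labels in the range 0 to 4 for the five categories.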


Metadata
DOI
https://doi.org/10.1007/978-81-322-2135-7_32