
2015 | Original Paper | Book Chapter

Design of Neural Network Model for Emotional Speech Recognition

Written by: H. K. Palo, Mihir Narayana Mohanty, Mahesh Chandra

Published in: Artificial Intelligence and Evolutionary Algorithms in Engineering Systems

Publisher: Springer India

Abstract

Human–computer interaction (HCI) still needs improvement in recognition and detection tasks. Emotion recognition in particular has a major impact on social, engineering, and medical science applications. This paper presents a neural-network-based approach to recognizing emotion in speech. Linear predictive coefficients serve as the features and a radial basis function network as the classifier. Speech utterances are extracted directly from the audio channel, including background noise. In total, 75 utterances from five speakers were collected across five emotion categories; fifteen utterances were used for training and the rest for testing. The results indicate that the approach is effective at recognizing emotions in human speech, and it has been tested and verified on the newly developed dataset.
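The pipeline named in the abstract (linear predictive coefficients as features, a radial basis function network as classifier) can be illustrated with a minimal sketch. The abstract does not give the LPC order, the framing, the choice of RBF centers, or the training rule, so everything below is an assumption for illustration: `lpc` uses the standard autocorrelation method with the Levinson-Durbin recursion, and the hypothetical `RBFNetwork` class uses Gaussian hidden units with output weights fitted by least squares. It is not the authors' implementation.

```python
import numpy as np

def lpc(frame, order):
    """LPC via the autocorrelation method and Levinson-Durbin recursion.

    Returns a with a[0] == 1, the coefficients of the prediction-error
    filter A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

class RBFNetwork:
    """Gaussian RBF layer plus a linear output layer (assumed design)."""

    def __init__(self, centers, width):
        self.centers = np.asarray(centers, dtype=float)
        self.width = float(width)
        self.weights = None

    def _hidden(self, X):
        X = np.asarray(X, dtype=float)
        # Squared distance from every sample to every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        H = np.exp(-d2 / (2.0 * self.width ** 2))
        return np.hstack([H, np.ones((len(X), 1))])  # append bias column

    def fit(self, X, labels, n_classes):
        targets = np.eye(n_classes)[np.asarray(labels)]  # one-hot targets
        self.weights, *_ = np.linalg.lstsq(self._hidden(X), targets, rcond=None)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.weights, axis=1)

# Usage on synthetic frames (placeholders, not data from the paper):
# each frame is reduced to its LPC vector, then classified.
t = np.arange(240)
frames = [np.sin(0.1 * t), np.sin(0.12 * t), np.sin(0.8 * t), np.sin(0.82 * t)]
features = np.array([lpc(f, 4)[1:] for f in frames])
labels = np.array([0, 0, 1, 1])
model = RBFNetwork(centers=features, width=1.0).fit(features, labels, n_classes=2)
predicted = model.predict(features)
```

Using every training vector as a center keeps the sketch short. With more data, a smaller set of centers (for example from clustering) would be the usual alternative.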

Metadata
Title
Design of Neural Network Model for Emotional Speech Recognition
Written by
H. K. Palo
Mihir Narayana Mohanty
Mahesh Chandra
Copyright year
2015
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2135-7_32
