Published in: Cognitive Neurodynamics 6/2012

1 December 2012 | Research Article

A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech

Authors: Soo-Young Lee, Hyun-Ah Song, Shun-ichi Amari


Abstract

In this study we propose a new feature extraction algorithm, dNMF (discriminant non-negative matrix factorization), which learns subtle class-related differences while maintaining an accurate generative capability. In addition to minimizing the representation error, as the standard NMF (non-negative matrix factorization) algorithm does, the dNMF algorithm also yields higher between-class variance for discriminant power. The multiplicative NMF learning algorithm has been modified to cope with this additional constraint. The cost function was carefully designed so that extracting feature coefficients from a single testing pattern with pre-trained feature vectors becomes a quadratic convex optimization problem in non-negative space, which ensures uniqueness. It also resolves issues related to earlier discriminant NMF algorithms. The developed dNMF algorithm has been applied to the emotion recognition task for speech, where it needs to emphasize the emotional differences while de-emphasizing the dominant phonetic components. On an emotional speech database, the dNMF algorithm successfully extracted subtle emotional differences, demonstrated much better recognition performance, and showed a smaller representation error.
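
The two building blocks named in the abstract can be illustrated with a minimal sketch. It shows only the standard multiplicative NMF updates of Lee and Seung (1999) for the squared reconstruction error, and then the coefficient extraction for a single test pattern with the feature vectors held fixed, which is a convex quadratic problem over non-negative coefficients. The discriminant term of dNMF is not included, because the abstract does not state its form. All names here (`train_nmf`, `extract_coefficients`, the toy matrix `X`, the constant `EPS`) are hypothetical and chosen for illustration only.

```python
# Sketch only: plain NMF, X ~ W H, with multiplicative updates.
# The dNMF discriminant term is NOT implemented (not given in the abstract).
import random

EPS = 1e-12  # assumed small constant that guards against division by zero


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(X, W, H):
    """Squared Frobenius norm of X - W H (the representation error)."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def update_H(X, W, H):
    """Element-wise update H <- H * (W^T X) / (W^T W H)."""
    Wt = transpose(W)
    num = matmul(Wt, X)
    den = matmul(matmul(Wt, W), H)
    return [[h * n / (d + EPS) for h, n, d in zip(rh, rn, rd)]
            for rh, rn, rd in zip(H, num, den)]


def update_W(X, W, H):
    """Element-wise update W <- W * (X H^T) / (W H H^T)."""
    Ht = transpose(H)
    num = matmul(X, Ht)
    den = matmul(W, matmul(H, Ht))
    return [[w * n / (d + EPS) for w, n, d in zip(rw, rn, rd)]
            for rw, rn, rd in zip(W, num, den)]


def train_nmf(X, rank, iters=200, seed=0):
    """Learn non-negative feature vectors W (columns) and coefficients H."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        H = update_H(X, W, H)
        W = update_W(X, W, H)
    return W, H


def extract_coefficients(x, W, iters=200):
    """Coefficients h >= 0 for one test pattern x, with W held fixed."""
    X = [[v] for v in x]                      # pattern as a column vector
    H = [[1.0] for _ in range(len(W[0]))]     # positive starting point
    for _ in range(iters):
        H = update_H(X, W, H)                 # only the coefficients change
    return [row[0] for row in H]


# Toy data: 4 feature dimensions (rows), 6 patterns (columns).
X = [
    [1.0, 0.5, 0.2, 0.9, 0.4, 0.1],
    [0.8, 0.4, 0.1, 0.7, 0.3, 0.2],
    [0.1, 0.6, 0.9, 0.2, 0.7, 1.0],
    [0.2, 0.5, 0.8, 0.1, 0.6, 0.9],
]
W, H = train_nmf(X, rank=2)
h = extract_coefficients([row[0] for row in X], W)
print("representation error:", frob_error(X, W, H))
print("coefficients for first pattern:", h)
```

Because the updates only multiply by non-negative ratios, non-negativity of `W`, `H`, and `h` is preserved without an explicit projection step. A discriminant variant would add a class-dependent term to the cost and adjust the update ratios accordingly; that part is left out here.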


Metadata
Title
A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech
Authors
Soo-Young Lee
Hyun-Ah Song
Shun-ichi Amari
Publication date
1 December 2012
Publisher
Springer Netherlands
Published in
Cognitive Neurodynamics / Issue 6/2012
Print ISSN: 1871-4080
Electronic ISSN: 1871-4099
DOI
https://doi.org/10.1007/s11571-012-9213-1
