Published in: International Journal of Speech Technology 1/2013

01.03.2013

Emotion modeling from speech signal based on wavelet packet transform

Authors: Varsha N. Degaonkar, Shaila D. Apte


Abstract

The recognition of emotion in human speech has gained increasing attention in recent years because of the wide variety of applications that benefit from such technology. Detecting emotion from speech can be viewed as a classification task: assigning an emotion category from a fixed set (e.g. happiness, anger) to a speech utterance. In this paper, we address two emotions, namely happiness and anger. The parameters extracted from a speech signal depend on the speaker, the spoken word, and the emotion. To isolate the emotion, we keep the spoken utterance and the speaker constant and vary only the emotion. Different features are extracted to identify the parameters responsible for emotion, and the wavelet packet transform (WPT) is found to be emotion specific. We performed the experiments using three methods. The first method applies the WPT and compares the number of coefficients greater than a threshold in different bands. The second method computes energy ratios of different bands using the WPT and compares these ratios across bands. The third method is a conventional one using MFCC. The recognition rates obtained using WPT for the angry, happy, and neutral modes are 85 %, 65 %, and 80 % respectively, compared with 75 %, 45 %, and 60 % respectively using MFCC. Based on the WPT features, a model is proposed for emotion conversion, namely neutral to angry and neutral to happy.
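The two WPT-based feature sets described in the abstract (per-band counts of coefficients above a threshold, and per-band energy ratios) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the third-party PyWavelets (`pywt`) library, and the wavelet, decomposition level, and threshold are placeholder choices the paper does not specify here.

```python
import numpy as np
import pywt  # third-party: PyWavelets

def wpt_band_energy_ratios(signal, wavelet="db4", level=3):
    """Decompose the signal with a wavelet packet transform and
    return the fraction of total energy in each frequency band."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    # Leaf nodes at the chosen level, ordered from low to high frequency;
    # level 3 yields 2**3 = 8 bands.
    bands = [node.data for node in wp.get_level(level, order="freq")]
    energies = np.array([np.sum(b ** 2) for b in bands])
    return energies / energies.sum()

def count_coeffs_above_threshold(signal, threshold, wavelet="db4", level=3):
    """Per-band count of WPT coefficients whose magnitude exceeds the threshold."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet,
                            mode="symmetric", maxlevel=level)
    return [int(np.sum(np.abs(node.data) > threshold))
            for node in wp.get_level(level, order="freq")]
```

Comparing either feature vector across recordings of the same word by the same speaker, as the paper's fixed-speaker, fixed-utterance protocol prescribes, would then attribute differences between the vectors to the change in emotion.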


Metadata
Title
Emotion modeling from speech signal based on wavelet packet transform
Authors
Varsha N. Degaonkar
Shaila D. Apte
Publication date
01.03.2013
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 1/2013
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-012-9142-8
