Published in: International Journal of Speech Technology 2/2022

08.02.2022

English speech emotion recognition method based on speech recognition

Author: Man Liu

Abstract

Speech emotion carries important information beyond the textual content of a speech signal, yet traditional speech recognition largely ignores it, which makes it difficult to extract the richer emotional content of English speech. To change this situation and obtain more emotional information from English utterances, English speech emotion recognition must be studied. At present, however, research on speech emotion recognition in China focuses mainly on Chinese, and work on English speech emotion recognition is comparatively scarce. This paper therefore studies English speech emotion recognition. Digital processing of the speech signal is based on speech recognition, and digitization of the signal is the prerequisite for any computer processing and analysis. Preprocessing of the speech signal, also called front-end processing, consists of the following steps: sampling and quantization, pre-emphasis, and windowing. Voice endpoint detection is based on volume and on high-order differences of the waveform. For feature extraction, openSMILE is selected as the tool to extract features directly, and LIBSVM is selected to build the speech emotion recognition model; finally, an experimental environment is constructed to verify the designed method. The experimental results show that this method recognizes the emotion of English speech well and enables a high degree of human–computer interaction.
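As a rough sketch of the pipeline the abstract describes (not the paper's actual code), the Python fragment below applies pre-emphasis and Hamming windowing, performs a simplified volume-based endpoint check (the paper additionally uses high-order differences of the waveform), and fits an SVM classifier; scikit-learn's SVC is built on LIBSVM, and the openSMILE feature matrix X and emotion labels y are assumed to be precomputed. Frame sizes, the energy threshold ratio, and the SVM parameters are illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC  # scikit-learn's SVC wraps LIBSVM

    def preprocess(signal, frame_len=400, hop=160, alpha=0.97):
        """Pre-emphasis followed by Hamming-windowed framing.
        Assumes len(signal) >= frame_len."""
        emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
        window = np.hamming(frame_len)
        n_frames = 1 + (len(emphasized) - frame_len) // hop
        return np.stack([emphasized[i * hop : i * hop + frame_len] * window
                         for i in range(n_frames)])

    def detect_endpoints(frames, ratio=0.1):
        """Simplified volume-based endpoint detection: keep frames whose
        short-time energy exceeds a fraction of the utterance maximum."""
        energy = np.sum(frames ** 2, axis=1)
        return frames[energy > ratio * energy.max()]

    def train_emotion_model(X, y):
        """Fit an SVM emotion classifier on utterance-level feature
        vectors (e.g. as produced by openSMILE). Kernel and C are
        illustrative, not the paper's settings."""
        model = SVC(kernel="rbf", C=1.0)
        model.fit(X, y)
        return model

In use, one would call preprocess and detect_endpoints per utterance before feature extraction, then train once on the full feature matrix, e.g. model = train_emotion_model(X_train, y_train) followed by model.predict(X_test).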


Metadata
Title
English speech emotion recognition method based on speech recognition
Author
Man Liu
Publication date
08.02.2022
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 2/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-021-09955-4
