International Journal of Speech Technology

Published: 23 January 2023

A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech

Authors: Haihua Jiang, Bin Hu, Zhenyu Liu, Gang Wang, Lan Zhang

Published in: International Journal of Speech Technology | Issue 2/2023

Abstract

Early intervention for depression could provide a means of reducing the disease burden, but objective diagnostic methods are lacking. This study investigated automatic depression classification on a speech dataset of 85 healthy controls (51 females and 34 males) and 85 depressed patients (53 females and 32 males). Because different types of speech features differ markedly in classification performance, we propose a radius-incorporated localized multiple kernel learning (trLMKL) algorithm that makes the best use of these features for detecting depression in speech. To improve classification accuracy, the algorithm learns the gating model parameters from both the margin and the radius of the minimum enclosing ball (MEB). Rather than incorporating the MEB radius directly, we incorporate the trace of the total scatter matrix of the training data, which avoids the cost of recomputing the radius at each iteration and reduces the computational complexity. Comprehensive experiments were carried out on our depressed speech dataset and 10 UCI datasets. Overall, our algorithm achieved better classification performance than SimpleMKL and LMKL, and it was efficient at detecting depression, indicating its potential for use as a diagnostic method for depression.
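The abstract gives no formulas, so the following is only a minimal sketch of the two quantities it names, not the authors' implementation. It assumes the softmax gating model and locally combined kernel of the LMKL formulation by Gönen and Alpaydın, and it uses the standard identity tr(S_T) = tr(K) - (1/n) * sum_ij K_ij for the (unnormalized) total scatter matrix in kernel feature space. All function names, gating parameters (`V`, `v0`), kernel choices, and data values are hypothetical.

```python
import math


def linear_kernel(X):
    """Gram matrix of the plain dot-product kernel."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]


def rbf_kernel(X, gamma):
    """Gram matrix of the Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in X] for x in X]


def softmax_gating(X, V, v0):
    """Per-sample kernel weights eta[i][m] from a softmax gating model.

    V[m] and v0[m] are the (assumed) gating parameters of kernel m.
    """
    eta = []
    for x in X:
        scores = [sum(v * a for v, a in zip(V[m], x)) + v0[m]
                  for m in range(len(V))]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        eta.append([e / total for e in exps])
    return eta


def localized_kernel(kernels, eta):
    """Locally combined kernel K[i][j] = sum_m eta[i][m] * K_m[i][j] * eta[j][m]."""
    n = len(eta)
    return [[sum(eta[i][m] * K[i][j] * eta[j][m]
                 for m, K in enumerate(kernels))
             for j in range(n)] for i in range(n)]


def total_scatter_trace(K):
    """Trace of the total scatter matrix in feature space.

    Equals tr(K) - (1/n) * (sum of all entries of K); no radius
    (minimum enclosing ball) problem has to be solved.
    """
    n = len(K)
    return sum(K[i][i] for i in range(n)) - sum(map(sum, K)) / n


# Toy data: four two-dimensional samples (illustrative values only).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5)]
eta = softmax_gating(X, V=[[0.1, -0.2], [0.0, 0.3]], v0=[0.0, 0.0])
K = localized_kernel(kernels, eta)
print(total_scatter_trace(K))
```

Under these assumptions, the trace is a closed-form function of the Gram matrix, so it can be re-evaluated cheaply whenever the gating parameters change, whereas the MEB radius would require solving a separate optimization problem at each iteration. How the trace enters the trLMKL objective is not stated in the abstract and is not shown here.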

Title: A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech
Authors: Haihua Jiang, Bin Hu, Zhenyu Liu, Gang Wang, Lan Zhang
Publication date: 23 January 2023
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 2/2023
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-023-10017-0
