Published in: International Journal of Speech Technology 2/2023

23 January 2023

A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech

Authors: Haihua Jiang, Bin Hu, Zhenyu Liu, Gang Wang, Lan Zhang

Abstract

Early intervention for depression could help reduce the disease burden, but objective diagnostic methods are lacking. This study investigated automatic depression classification on a speech dataset of 85 healthy controls (51 females and 34 males) and 85 depressed patients (53 females and 32 males). Because different types of speech features differ markedly in classification performance, we propose a radius-incorporated localized multiple kernel learning (trLMKL) algorithm that combines them for detecting depression in speech. To improve classification accuracy, the algorithm learns the gating model parameters from both the margin and the radius of the minimum enclosing ball (MEB). Rather than incorporating the MEB radius directly, it uses the trace of the total scatter matrix of the training data, which avoids recomputing the radius at each iteration and reduces the computational complexity. Comprehensive experiments were carried out on our depressed speech dataset and 10 UCI datasets. Overall, our algorithm achieved better classification performance than SimpleMKL and LMKL, and it was efficient at detecting depression, indicating its potential for use as a diagnostic method for depression.
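The abstract does not give the formulation, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions. It assumes the softmax gating model and locally combined kernel commonly used in localized multiple kernel learning, and the standard kernel-space identity tr(S_T) = tr(K) - (1/n) 1^T K 1 for the trace of the total scatter matrix. All function names and parameters (`softmax_gating`, `V`, `v0`, `gamma`) are hypothetical, and the way the paper combines margin and trace in its objective is not shown because the abstract does not state it.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix; gamma is an illustrative assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def softmax_gating(X, V, v0):
    """eta[i, m]: weight of kernel m at sample i (assumed softmax gating)."""
    scores = X @ V.T + v0
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def locally_combined_kernel(kernels, eta):
    """K[i, j] = sum_m eta[i, m] * K_m[i, j] * eta[j, m]."""
    return sum(np.outer(eta[:, m], eta[:, m]) * K
               for m, K in enumerate(kernels))

def total_scatter_trace(K):
    """tr(S_T) = tr(K) - (1/n) * 1^T K 1, computed from the kernel matrix only."""
    n = K.shape[0]
    return np.trace(K) - K.sum() / n

# Toy data standing in for speech features (random, not the paper's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
kernels = [X @ X.T, rbf_kernel(X, gamma=0.5)]  # one linear, one RBF kernel
V = rng.normal(size=(2, 3))                    # hypothetical gating weights
v0 = np.zeros(2)                               # hypothetical gating biases

eta = softmax_gating(X, V, v0)
K_loc = locally_combined_kernel(kernels, eta)
trace_val = total_scatter_trace(K_loc)
```

The trace needs only the diagonal and the sum of the entries of the kernel matrix, which is why it is cheaper than solving for the MEB radius each time the gating parameters change.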

Metadata
Title
A radius-incorporated localized multiple kernel learning algorithm for detecting depression in speech
Authors
Haihua Jiang
Bin Hu
Zhenyu Liu
Gang Wang
Lan Zhang
Publication date
23 January 2023
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 2/2023
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-023-10017-0
