Cluster Computing

Publication date: 09-12-2017

Research on depression detection algorithm combine acoustic rhythm with sparse face recognition

Authors: Jian Zhao, Weiwen Su, Jian Jia, Chao Zhang, Tingting Lu

Published in: Cluster Computing | Special Issue 4/2019

Abstract

Because traditional methods of diagnosing depression suffer from false positives, this paper proposes a multi-modal fusion algorithm for depression diagnosis based on speech signals and facial image sequences. Spectral subtraction is introduced to enhance the depressed speech signal, and the cepstrum method is used to extract pitch frequency features with a large variation rate and formant features with significant differences. The short-time energy and Mel-frequency cepstral coefficient (MFCC) parameters of speech with different emotions are analyzed in both the time domain and the frequency domain, and a model is established for training and identification. Meanwhile, the orthogonal matching pursuit (OMP) algorithm is implemented to obtain a sparse linear combination of face test samples, and the result is cascaded with the speech result according to the proportion of voice and facial emotions. The experimental results show that the recognition rate of the depression detection algorithm that fuses speech and facial emotions reaches 81.14%. Compared with the existing doctors' accuracy rate of 47.3%, combining diagnosis with the proposed method yields a relative improvement of 71.54%, since (81.14 − 47.3) / 47.3 ≈ 0.7154. In addition, the method can easily be applied to the hardware and software of existing hospital instruments at low cost. It is therefore an accurate and effective method for diagnosing depression.
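The orthogonal matching pursuit step mentioned in the abstract can be illustrated with a minimal sketch. This is the generic textbook form of OMP, not the authors' implementation: the function name `omp`, the dictionary `D` (one training face sample per column), the test vector `y`, and the parameters `n_nonzero` and `tol` are all assumptions introduced here for illustration.

```python
import numpy as np

def omp(D, y, n_nonzero, tol=1e-10):
    """Generic orthogonal matching pursuit (illustrative sketch).

    Approximates y as a linear combination of at most n_nonzero
    columns (atoms) of the dictionary D and returns the coefficients.
    """
    D = np.asarray(D, dtype=float)
    y = np.asarray(y, dtype=float)
    coef = np.zeros(D.shape[1])
    support = []            # indices of the atoms selected so far
    sol = np.zeros(0)       # least-squares coefficients on the support
    residual = y.copy()
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0   # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Refit all selected atoms jointly (the "orthogonal" step).
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
        if np.linalg.norm(residual) < tol:
            break
    coef[support] = sol
    return coef

# Toy usage with an orthonormal dictionary (hypothetical data).
D = np.eye(6)
y = 2.0 * D[:, 1] - 1.5 * D[:, 4]
coef = omp(D, y, n_nonzero=2)
```

In a sparse-representation face classifier, the nonzero entries of `coef` would indicate which training samples best explain the test face; how the paper turns those coefficients into an emotion decision is not stated in the abstract.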

Title: Research on depression detection algorithm combine acoustic rhythm with sparse face recognition
Authors: Jian Zhao, Weiwen Su, Jian Jia, Chao Zhang, Tingting Lu
Publication date: 09-12-2017
Publisher: Springer US
Published in: Cluster Computing, Special Issue 4/2019
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-017-1469-0
