Pathophysiological Voice Analysis for Diagnosis and Monitoring of Depression

Tokuno, Shinichi

doi:10.1007/978-981-10-6577-4_6

Abstract

Self-assessment questionnaires are commonly used to screen for stress and depression. However, they suffer from reporting bias: respondents underestimate or overestimate their symptoms, consciously or unconsciously. Various biomarkers of depression and stress are also being studied. These often require expensive equipment and chemicals; they are useful for definitive diagnosis and for elucidating mechanisms, but they are not suitable for screening large populations.

It is well known that various diseases change the voice, and the relationship between disease and voice has long been studied in the field of acoustic phonetics. These studies have mainly examined the formants (F1, F2, etc.), frequency bands obtained by cepstrum analysis of the voice. Formants are influenced by the shape of the vocal tract (the cavity from the vocal cords to the mouth). Studies using the fundamental frequency (F0), which is obtained as the lowest frequency by FFT, have also been reported. F0 is affected by vocal cord vibration, and various methods of F0 analysis are currently available. Compared with the formants, F0 contains many involuntary components. Analysis of F0 therefore has the potential to help diagnose various diseases. The range of application of voice analysis has now expanded from otolaryngology to psychiatric conditions such as depression and neurological diseases such as Parkinson’s disease. In addition, research on differential diagnosis by voice and on measuring therapeutic effect has begun.
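As a rough illustration of reading F0 off an FFT spectrum as its lowest prominent frequency, the following minimal sketch picks the lowest spectral peak within a plausible pitch range. It is not the analysis method used in this chapter: the function name `estimate_f0_fft`, the 60 to 400 Hz search band, the relative threshold of 0.3, and the synthetic test frame are all assumptions made for the example, and practical pitch trackers for real speech are considerably more robust.

```python
import numpy as np


def estimate_f0_fft(signal, sample_rate, fmin=60.0, fmax=400.0, rel_threshold=0.3):
    """Return the lowest prominent spectral peak (Hz) within [fmin, fmax], or None.

    fmin, fmax and rel_threshold are illustrative assumptions, not values
    taken from the chapter.
    """
    # A Hann window reduces spectral leakage from the frame edges.
    windowed = signal * np.hanning(len(signal))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # Candidate bins inside the search band, excluding the array ends so that
    # both neighbours exist for the local-maximum check.
    band = np.flatnonzero((freqs >= fmin) & (freqs <= fmax))
    band = band[(band > 0) & (band < len(spectrum) - 1)]
    if band.size == 0:
        return None

    threshold = rel_threshold * spectrum[band].max()
    if threshold <= 0:
        return None  # silent frame: no peak to report

    # Scan upward in frequency and stop at the first sufficiently strong peak.
    for i in band:
        is_peak = spectrum[i] >= spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]
        if is_peak and spectrum[i] >= threshold:
            return float(freqs[i])
    return None


# Synthetic "voiced" frame (an assumption for the demo): a 120 Hz fundamental
# plus two weaker harmonics at 240 Hz and 360 Hz.
sample_rate = 16000
t = np.arange(int(0.5 * sample_rate)) / sample_rate
frame = (np.sin(2 * np.pi * 120 * t)
         + 0.5 * np.sin(2 * np.pi * 240 * t)
         + 0.25 * np.sin(2 * np.pi * 360 * t))

f0 = estimate_f0_fft(frame, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

On this synthetic frame the estimate should fall close to the 120 Hz fundamental rather than on one of the harmonics, because the scan stops at the lowest peak that clears the threshold. The formants, by contrast, describe the spectral envelope shaped by the vocal tract and would need a separate envelope estimate such as cepstral smoothing.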

These developments are largely due to advances in computing, especially the spread of smartphones, which have made it possible to collect and analyze voice in everyday life. For example, several smartphone applications that measure stress and depression by analyzing everyday speech have been released. In Japan, efforts to use such applications in healthcare and occupational medicine are gaining momentum. Our group has already developed the Mind Monitoring System, which runs on smartphones, and operates it in Japan. This system is based on emotion recognition technology rather than on direct voice analysis. Pathophysiological analysis by voice is noninvasive, can be performed remotely and continuously, and requires no special equipment. This technique is therefore effective for screening large numbers of subjects and for long-term continuous monitoring at home, which means it can serve as a bridge between healthcare and medical treatment. In clinical practice, it can also provide objective indicators in medical fields that have so far had only subjective ones.

Acknowledgments

I thank Shunji Mitsuyoshi, Shuji Shinohara, Mitsuteru Nakamura, Masakazu Higuchi, Yasuhiro Omiya, and Naoki Hagiwara. They are my team, and each brings original ideas and outstanding skills to this research.

Author information

Authors and Affiliations

Department of Verbal Analysis of Pathophysiology, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
Shinichi Tokuno

Corresponding author

Correspondence to Shinichi Tokuno.

Editor information

Editors and Affiliations

College of Medicine, Korea University, Kyonggi-do, Korea (Republic of)
Yong-Ku Kim

About this chapter

Cite this chapter

Tokuno, S. (2018). Pathophysiological Voice Analysis for Diagnosis and Monitoring of Depression. In: Kim, YK. (eds) Understanding Depression. Springer, Singapore. https://doi.org/10.1007/978-981-10-6577-4_6

DOI: https://doi.org/10.1007/978-981-10-6577-4_6
Published: 24 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6576-7
Online ISBN: 978-981-10-6577-4
eBook Packages: Medicine, Medicine (R0)
