2018 | OriginalPaper | Chapter

29. Automatic Detection of Depressive States from Speech

Authors: Aditi Mendiratta, Filomena Scibelli, Antonietta M. Esposito, Vincenzo Capuano, Laurence Likforman-Sulem, Mauro N. Maldonato, Alessandro Vinciarelli, Anna Esposito

Published in: Multidisciplinary Approaches to Neural Computing

Publisher: Springer International Publishing

Abstract

This paper investigates the acoustical and perceptual speech features that differentiate depressed individuals from healthy ones. The speech data were collected from both healthy and depressed Italian-speaking subjects, each of whom provided a read and a spontaneous narrative. The dataset was pre-processed by extracting Mel Frequency Cepstral Coefficients (MFCCs). The speech samples were then processed with Principal Component Analysis (PCA) to address correlation among the features and to reduce dimensionality. The two groups were found to differ with respect to the extracted speech features. To distinguish the depressed group from the healthy one on the basis of the proposed speech processing algorithm, the Self-Organizing Map (SOM) algorithm was used. The clustering accuracy obtained with the SOMs was 80.67%.
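The final clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: MFCC extraction and PCA are omitted, the input vectors are synthetic stand-ins for PCA-reduced features, and the map size, learning-rate schedule, and the use of purity as the "clustering accuracy" are all assumptions made for illustration. The function names (`train_som`, `purity`, `make_toy_data`) are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    dists = [sum((w - v) ** 2 for w, v in zip(node, x)) for node in weights]
    return dists.index(min(dists))


def train_som(data, n_nodes=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM (a line of n_nodes units) and return its weight vectors."""
    rng = random.Random(seed)
    # Start each node at a distinct, randomly chosen training vector.
    weights = [list(x) for x in rng.sample(data, n_nodes)]
    order = list(range(len(data)))
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3  # neighbourhood width shrinks too
        rng.shuffle(order)
        for i in order:
            x = data[i]
            bmu = best_matching_unit(weights, x)
            for j, node in enumerate(weights):
                # Gaussian neighbourhood on the map grid around the BMU.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(node)):
                    node[k] += lr * h * (x[k] - node[k])
    return weights


def purity(weights, data, labels):
    """Fraction of samples that carry the majority label of their assigned node."""
    counts = {}
    for x, y in zip(data, labels):
        node_counts = counts.setdefault(best_matching_unit(weights, x), {})
        node_counts[y] = node_counts.get(y, 0) + 1
    return sum(max(c.values()) for c in counts.values()) / len(data)


def make_toy_data(n_per_group=30, dim=3, seed=1):
    """Two synthetic Gaussian blobs; NOT real speech features."""
    rng = random.Random(seed)
    data, labels = [], []
    for label, centre in (("healthy", 0.0), ("depressed", 5.0)):
        for _ in range(n_per_group):
            data.append([rng.gauss(centre, 0.5) for _ in range(dim)])
            labels.append(label)
    return data, labels


if __name__ == "__main__":
    data, labels = make_toy_data()
    som = train_som(data)
    print(f"purity on toy data: {purity(som, data, labels):.2f}")
```

The group labels are used only to score the clusters afterwards; training itself is unsupervised. Real feature vectors would be far less separable than these toy blobs, so the score here says nothing about the 80.67% reported above.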

Metadata
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-56904-8_29
