Brain and Cognition

Volume 55, Issue 2, July 2004, Pages 383-386
Feigned depression and feigned sleepiness: A voice acoustical analysis

https://doi.org/10.1016/j.bandc.2004.02.052

Abstract

We sought to profile the voice acoustical correlates of simulated, or feigned, depression in neurologically and psychiatrically healthy control subjects. We also sought to identify the voice acoustical correlates of feigned sleepiness in these same subjects. Twenty-two participants were asked to speak freely about a cartoon, to count from 1 to 10, and to sustain an “a” sound for approximately 5 s. These exercises were completed three times (within the same testing session), each time under a different set of instructions. The three conditions were presented in pseudo-random order to control for order effects, and all subjects were naïve to the intended purpose of the study. For all three conditions, mean speaking rates and pitch ranges were calculated. A series of paired t tests showed significant differences in speaking rate (counting and free-speech exercises) between the ‘normal’ and feigned sleepy conditions, and between the normal and feigned depression conditions, but not between the ‘sleepy’ and ‘depressed’ conditions. Pitch range, for all speech exercises, did not differ between the normal condition and either the feigned depression or the feigned sleepiness condition. These results indicate that persons feigning depression or sleepiness exert some conscious control over their speech rate, but do not convincingly alter their pitch range.
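The within-subject comparison described above can be illustrated with a paired t statistic. The following is a minimal sketch, not the authors' analysis code: the function name `paired_t` and the input values are hypothetical placeholders chosen only to show the arithmetic, and they are not data from the study.

```python
import math
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Return (t statistic, degrees of freedom) for paired samples.

    Each index must refer to the same participant in both conditions.
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Placeholder speaking rates (arbitrary units), NOT values from the study.
normal_rate = [4.0, 5.0, 6.0, 5.0]
feigned_rate = [3.0, 3.0, 5.0, 4.0]

t_stat, df = paired_t(normal_rate, feigned_rate)
print(f"t({df}) = {t_stat:.2f}")
```

In practice a statistics package would also supply the p value; `scipy.stats.ttest_rel` is one commonly used option for paired samples.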

Introduction

The literature on the application of quantifiable and objective voice acoustical measures as biomarkers of disease severity and/or treatment response has been growing rapidly in recent years. For example, voice acoustical measures may be sensitive to changes in symptom severity for major depressive disorder (Stassen, Kuny, & Hell, 1998). Since various forms of psychomotor retardation are commonly seen in patients with depression, it is reasonable to suppose that the neuromuscular control of the larynx might be similarly affected (Nilsonne, 1987). Specifically, since the pitch of the voice is determined by the frequency of the vibrations of the vocal folds, which result from activation of the muscles of the diaphragm and larynx (Nilsonne, 1988), narrowed pitch alterations most probably lead to the monotonous speech that is characteristic of individuals with depression (Stassen, Bomben, & Güenther, 1991). Fundamental frequency (F0) is a measure of this vocal fold vibration frequency, and it corresponds directly to the perceived pitch of the voice (Nilsonne, 1987). As a result, it has become the basis for pitch range analyses.
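To make the link between F0 and pitch range concrete, the sketch below computes a range from an F0 contour. It rests on assumptions not stated in the article: that unvoiced frames are coded as 0, that the range is taken as maximum minus minimum over voiced frames, and that a semitone scale is of interest. The function name `pitch_range` and the contour values are hypothetical.

```python
import math


def pitch_range(f0_contour_hz):
    """Return (range in Hz, range in semitones) over voiced frames.

    Assumption: frames with F0 <= 0 are unvoiced and are ignored.
    """
    voiced = [f for f in f0_contour_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in contour")
    f_min, f_max = min(voiced), max(voiced)
    range_hz = f_max - f_min
    # 12 semitones per octave; one octave is a doubling of frequency.
    range_semitones = 12.0 * math.log2(f_max / f_min)
    return range_hz, range_semitones


# Placeholder contour in Hz, NOT data from the study; 0.0 marks an unvoiced frame.
contour = [110.0, 0.0, 220.0, 165.0]
range_hz, range_st = pitch_range(contour)
print(range_hz, range_st)
```

A narrower range on either scale would correspond to the more monotonous speech described above.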

To measure F0 of voice, a variety of speech exercises are routinely relied upon, including tasks of automatic speech (e.g., counting from 1 to 10), of free speech (e.g., asking a subject to describe an event), and of reading standard passages. This repertoire of speech exercises is thought to best represent a continuum of speech performance, from contrived and acontextual automatic speech to the natural discourse of free speech. Short segments of automatic speech are quick and easy to obtain, requiring minimal effort on the part of the subject while providing pertinent data (Nilsonne, 1987). Conversely, free speech elicits more acoustic information via a naturalistic approach for obtaining ecologically valid speech samples. This method is recommended as standard practice for a well-rounded examination of voice (cf. Titze, 1995). Other measures elicited from these tasks include: (1) total reading time for a standard passage; (2) the percent pause time per speech sample; and (3) the overall rate of speech. Each of these measures has previously been shown to be altered in the speech of depressed individuals, since common symptoms of depression include diminished prosody of voice, an increase in speech pause time, and slower rates of speech (cf. Nilsonne, 1987, 1988). Ellgring and Scherer (1996) followed 16 patients (11 females, 5 males) with major depressive disorder (MDD) longitudinally, with voice recordings obtained via a structured interview conducted twice per week over the duration of their clinical treatment. Their results showed that the rate of speech increased markedly over the transition from depressed to non-depressed in patients who were successfully treated for MDD. The mean duration of pauses also decreased by between 56 and 59 percent with treatment, and the average number of pauses per interview declined by between 25 and 32 percent over the course of treatment.
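Two of the timing measures listed above, percent pause time and overall rate of speech, reduce to simple arithmetic once a recording has been segmented. The sketch below assumes that speech intervals are already available as (start, end) pairs in seconds and that rate is expressed in words per second; the segmentation method, the function name `pause_and_rate`, and the example values are hypothetical and are not taken from the study.

```python
def pause_and_rate(speech_segments, total_duration_s, n_words):
    """Return (percent pause time, words per second) for one speech sample.

    speech_segments: list of (start_s, end_s) intervals containing speech.
    total_duration_s: length of the whole sample in seconds.
    n_words: number of words spoken in the sample.
    """
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    spoken_s = sum(end - start for start, end in speech_segments)
    percent_pause = 100.0 * (total_duration_s - spoken_s) / total_duration_s
    words_per_second = n_words / total_duration_s
    return percent_pause, words_per_second


# Placeholder segmentation of a 10 s sample, NOT data from the study.
segments = [(0.0, 2.0), (3.0, 6.0), (7.0, 10.0)]
pause_pct, rate = pause_and_rate(segments, total_duration_s=10.0, n_words=20)
print(pause_pct, rate)
```

Under this definition, longer or more frequent pauses raise the pause percentage and lower the overall rate, which is the direction of change reported for depressed speech.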

The purpose of the present study was to determine how neurologically and psychiatrically healthy individuals alter their speech, as measured by several of the aforementioned voice acoustical metrics, in order to simulate, or feign, depressed mood. We also sought to answer the same set of questions for simulated sleepiness. One future goal of this work is to determine whether such individuals can modify their voice in a manner that is identical to that found for patients with documented MDD. However, the absence of raw data for MDD patients prevented direct comparisons with our feigning data at the present time. Future studies will implement tasks of a comparable nature in order to make direct comparisons between the acoustic characteristics of feigners and genuinely depressed individuals.

Participants

Twenty-two subjects (11 males and 11 females) between 19 and 54 years of age were recruited for the study. All participants completed the Stanford Sleepiness Scale (SSS) and the Beck Depression Inventory (BDI), and provided relevant demographic data, such as age, education level, caffeine and prescription medication use, presence of any current cold or upper respiratory infection, and history of neurological or psychiatric illness. Participants were excluded if they scored higher than

Results

The participants’ mean score on the SSS was 2.27 (SD = 0.88) and the overall mean BDI score was 5.0 (SD = 4.0). The mean values for all voice acoustical measures obtained from the automatic speech (counting) and free speech (response to the Cookie Theft Card) tasks, across all three behavioral conditions, are provided below in Table 1B. As indicated by the low SSS scores and verbal self-report, all subjects reported the subjective feeling of being fully alert and not at all sleepy at the times that the voice

Discussion

Neurologically and psychiatrically healthy adults are successfully able to alter (slow) their speaking rate while feigning a depressed mood or sleepiness, but they are unable to physiologically modulate their voices to significantly alter the pitch ranges from their ‘normal’ state. For the automatic counting task, our participants were capable of significantly altering their normal speaking rate when pretending to be sleepy and depressed. By comparison, there were no differences between any of

References (8)

  • Cannizzaro, M., Harel, B., Reilly, N., Chappell, P., & Snyder, P. J. (in press). Voice acoustical measurement of the...
  • H. Ellgring et al. Vocal indicators of mood change in depression. Journal of Nonverbal Behavior (1996).
  • H. Goodglass et al. Boston diagnostic aphasia exam (2000).
  • A. Nilsonne. Acoustic analysis of speech variables during depression and after improvement. Acta Psychiatrica Scandinavica (1987).
