Abstract
This article describes a general framework for detecting sleepiness states on the basis of prosody, articulation, and speech-quality-related speech characteristics. The advantages of this automatic real-time approach are that obtaining speech data is nonobtrusive and free from sensor application and calibration efforts. Different types of acoustic features derived from speech, speaker, and emotion recognition were employed (frame-level-based speech features). Combining these features with high-level contour descriptors, which capture the temporal information of frame-level descriptor contours, results in 45,088 features per speech sample. In general, the measurement process follows the speech-adapted steps of pattern recognition: (1) recording speech, (2) preprocessing, (3) feature computation (using perceptual and signal-processing-related features such as fundamental frequency, intensity, pause patterns, formants, and cepstral coefficients), (4) dimensionality reduction, (5) classification, and (6) evaluation. After a correlation-filter-based feature subset selection was applied to the feature space to find the most relevant features, different classification models were trained. The best model, the support-vector machine, achieved 86.1% classification accuracy in predicting sleepiness in a sleep deprivation study (two-class problem, N = 12; 01.00–08.00 a.m.).
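Steps (3) and (4) can be illustrated with a minimal sketch. It assumes a frame-level contour is given as a plain list of numbers, summarizes it with a handful of contour descriptors, and then ranks features by their absolute Pearson correlation with a binary sleepiness label. The function names (`contour_functionals`, `pearson`, `select_by_correlation`), the chosen descriptors, and the top-k ranking rule are illustrative assumptions, not the authors' implementation; the abstract does not specify which correlation filter variant was used, and the classifier of step (5) is not shown.

```python
import math


def contour_functionals(contour):
    """Summarize a frame-level contour (e.g., F0 per frame) with a few
    simple descriptors. The set of descriptors is an illustrative choice."""
    n = len(contour)
    mean = sum(contour) / n
    var = sum((x - mean) ** 2 for x in contour) / n
    # Least-squares slope of the contour over the frame index 0..n-1.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    num = sum((t - t_mean) * (x - mean) for t, x in enumerate(contour))
    slope = num / denom if denom else 0.0
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "min": min(contour),
        "max": max(contour),
        "range": max(contour) - min(contour),
        "slope": slope,
    }


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_by_correlation(samples, labels, k):
    """Keep the k feature names with the highest |r| against the labels.

    samples: list of dicts mapping feature name -> value (one per recording)
    labels:  list of 0/1 class labels (hypothetical: 0 = alert, 1 = sleepy)
    """
    names = sorted(samples[0])
    scores = {
        name: abs(pearson([s[name] for s in samples], labels))
        for name in names
    }
    return sorted(names, key=lambda name: scores[name], reverse=True)[:k]
```

In use, each recording would contribute one feature dictionary built from its contours, and `select_by_correlation` would return the reduced feature set passed on to a classifier. This filter scores relevance only; it does not penalize redundancy between the selected features.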
Cite this article
Krajewski, J., Batliner, A. & Golz, M. Acoustic sleepiness detection: Framework and validation of a speech-adapted pattern recognition approach. Behavior Research Methods 41, 795–804 (2009). https://doi.org/10.3758/BRM.41.3.795
DOI: https://doi.org/10.3758/BRM.41.3.795