Published in:

2019 | OriginalPaper | Chapter

# 2. Production and Perception of Voice

Author: Rita Singh

Publisher: Springer Singapore

## Abstract

The goal of this chapter is to present the human speech production process in enough detail for the reader to understand why profiling should be possible, and to provide enough information to reason about the effects of different parameters on voice, so that profiling efforts may be better guided. The treatment is sufficient for these purposes but not exhaustive, since the area is too vast to be covered within a single chapter of this book.