2019 | OriginalPaper | Chapter

2. Production and Perception of Voice

Author: Rita Singh

Published in: Profiling Humans from their Voice

Publisher: Springer Singapore

Abstract

This chapter presents the human speech production process in enough detail for the reader to understand why profiling should be possible, and to reason about how different parameters affect the voice, so that profiling efforts may be better guided. The treatment is sufficient for these purposes but not exhaustive, since the area is too vast to cover in a single chapter.

Title: Production and Perception of Voice
Author: Rita Singh
Publisher: Springer Singapore
Book: Profiling Humans from their Voice
Print ISBN: 978-981-13-8402-8
Electronic ISBN: 978-981-13-8403-5

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-981-13-8403-5_2
