2019 | OriginalPaper | Chapter

7. Feature Engineering for Profiling

Author: Rita Singh

Published in: Profiling Humans from their Voice

Publisher: Springer Singapore

Abstract

Humans are able to perceive many profile parameters from voice. Such perceptual relationships between voice and parameters show that the voice signal carries information relevant to profiling, but they may not tell us what that information is. If a study finds perceptual relations between some parameter and specific features derived from voice, this indicates a statistical relationship between those features and the parameter, which may be directly causal or may stem from a common underlying cause. Such features will usually be directly useful in profiling.

DOI: https://doi.org/10.1007/978-981-13-8403-5_7