2019 | OriginalPaper | Book chapter

7. Feature Engineering for Profiling

Written by: Rita Singh

Published in: Profiling Humans from their Voice

Publisher: Springer Singapore

Abstract

Humans are able to perceive many profile parameters from voice. Such perceptual relationships between voice and parameters show that the voice signal contains information relevant to profiling, but they may not tell us what that information is. If a study finds perceptual relations between some parameter and specific features derived from voice, this indicates a statistical relationship between those features and the parameter, which may be directly causal or may stem from a common underlying cause. Such features will usually be directly useful in profiling.

Metadata
Title
Feature Engineering for Profiling
Written by
Rita Singh
Copyright year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-13-8403-5_7