
2012 | OriginalPaper | Book chapter

1. Historical and Procedural Overview of Forensic Speaker Recognition as a Science

Written by: Kanae Amino, Ph.D., Takashi Osanai, Ph.D., Toshiaki Kamada, B.E., Hisanori Makinae, Ph.D., Takayuki Arai, Ph.D.

Published in: Forensic Speaker Recognition

Publisher: Springer New York


Abstract

Forensic phonetics and acoustics are now widely used in police and legal work involving acoustic samples. Among the many tasks in this area, forensic speaker recognition is considered one of the most complex. Forensic speaker recognition, sometimes called forensic speaker comparison, is the process of judging whether or not two speech samples come from the same speaker. This chapter introduces the historical background of forensic speaker recognition, including the “voiceprint” controversy, human-based visual and auditory forensic speaker recognition, and automatic forensic speaker recognition. Procedural considerations in forensic speaker recognition and the factors that affect recognition performance are also presented. Finally, we summarize the progress and developments made in forensic automatic speaker recognition.


Literatur
1.
Zurück zum Zitat Nolan F (1983) The phonetic basis of speaker recognition. Cambridge studies in speech science and communiation. Cambridge University Press, Cambridge Nolan F (1983) The phonetic basis of speaker recognition. Cambridge studies in speech science and communiation. Cambridge University Press, Cambridge
2.
Zurück zum Zitat Schmidt-Nielsen A, Stern KR (1985) Identification of known voices as a function of familiarity and narrow-band coding. J Acoust Soc Am 77:658–663CrossRef Schmidt-Nielsen A, Stern KR (1985) Identification of known voices as a function of familiarity and narrow-band coding. J Acoust Soc Am 77:658–663CrossRef
3.
Zurück zum Zitat Van Lacker D, Kreiman J, Emmorey K (1985) Familiar voice recognition: patterns and parameters part 1: recognition of backward voices. J Phonetics 13:19–38 Van Lacker D, Kreiman J, Emmorey K (1985) Familiar voice recognition: patterns and parameters part 1: recognition of backward voices. J Phonetics 13:19–38
4.
Zurück zum Zitat Van Lacker D, Kreiman J (1985) Familiar voice recognition: patterns and parameters part 2: recognition of rate-altered voices. J Phonetics 13:39–52 Van Lacker D, Kreiman J (1985) Familiar voice recognition: patterns and parameters part 2: recognition of rate-altered voices. J Phonetics 13:39–52
5.
Zurück zum Zitat Cheney D, Seyfarth R (1980) Vocal recognition in free-ranging vervet monkeys. Anim Behav 28:362–367CrossRef Cheney D, Seyfarth R (1980) Vocal recognition in free-ranging vervet monkeys. Anim Behav 28:362–367CrossRef
6.
Zurück zum Zitat Rendall D, Rodman PS, Emond RE (1996) Vocal recognition of individuals and kin in free-ranging rhesus monkeys. Anim Behav 51:1007–1015CrossRef Rendall D, Rodman PS, Emond RE (1996) Vocal recognition of individuals and kin in free-ranging rhesus monkeys. Anim Behav 51:1007–1015CrossRef
7.
Zurück zum Zitat Sugiura H (2001) Vocal exchange of coo calls in Japanese macaques. In: Matsuzawa T (ed) Primate origins of human cognition and behaviour. Springer, Tokyo, pp 135–154 Sugiura H (2001) Vocal exchange of coo calls in Japanese macaques. In: Matsuzawa T (ed) Primate origins of human cognition and behaviour. Springer, Tokyo, pp 135–154
8.
Zurück zum Zitat Bricker P, Pruzansky S (1976) Speaker recognition. In: Lass N (ed) Contemporary issues in experimental phonetics. Academic Press, New York, pp 295–326 Bricker P, Pruzansky S (1976) Speaker recognition. In: Lass N (ed) Contemporary issues in experimental phonetics. Academic Press, New York, pp 295–326
9.
Zurück zum Zitat Furui S (1992) Acoustic and speech engineering (onkyo, onsei kougaku). Kindai Kagakusha Publishing Company, Tokyo Furui S (1992) Acoustic and speech engineering (onkyo, onsei kougaku). Kindai Kagakusha Publishing Company, Tokyo
10.
Zurück zum Zitat National Research Council (1979) On the theory and practice of voice identification. National Academy of Science, Washington, pp 3–13 National Research Council (1979) On the theory and practice of voice identification. National Academy of Science, Washington, pp 3–13
11.
Zurück zum Zitat Steinberg JC (1934) Application of sound measuring instruments to the study of phonetic problems. J Acoust Soc Am 6:16–24CrossRef Steinberg JC (1934) Application of sound measuring instruments to the study of phonetic problems. J Acoust Soc Am 6:16–24CrossRef
12.
13.
Zurück zum Zitat Grey CHG, Kopp GA (1944) Voiceprint identification. Bell Telephone Laboratory Annual Report, New York, pp 1–14 Grey CHG, Kopp GA (1944) Voiceprint identification. Bell Telephone Laboratory Annual Report, New York, pp 1–14
14.
Zurück zum Zitat Tosi O, Oyer H, Lashbrook W, Pedrey C, Nicol J, Nash E (1972) Experiment on voice identification. J Acoust Soc Am 51:2030–2043CrossRef Tosi O, Oyer H, Lashbrook W, Pedrey C, Nicol J, Nash E (1972) Experiment on voice identification. J Acoust Soc Am 51:2030–2043CrossRef
15.
16.
Zurück zum Zitat Campbell JP, Shen W, Campbell WM, Schwartz R, Bonastre JF, Matrouf D (2009) Forensic speaker recognition. IEEE Signal Process Mag 26:95–103CrossRef Campbell JP, Shen W, Campbell WM, Schwartz R, Bonastre JF, Matrouf D (2009) Forensic speaker recognition. IEEE Signal Process Mag 26:95–103CrossRef
17.
Zurück zum Zitat Young MA, Campbell RA (1967) Effects of context on talker identification. J Acoust Soc Am 42:1250–1254CrossRef Young MA, Campbell RA (1967) Effects of context on talker identification. J Acoust Soc Am 42:1250–1254CrossRef
18.
Zurück zum Zitat Tosi O (1968) Speaker identification through acoustic spectrography. Proc Logoped Phoniatr, pp 138–145 Tosi O (1968) Speaker identification through acoustic spectrography. Proc Logoped Phoniatr, pp 138–145
19.
Zurück zum Zitat Stevens KN, Williams CE, Carbonell JR, Woods B (1968) Speaker authentication and identification: a comparison of spectrographic and auditory presentations of speech material. J Acoust Soc Am 44:1596–1607CrossRef Stevens KN, Williams CE, Carbonell JR, Woods B (1968) Speaker authentication and identification: a comparison of spectrographic and auditory presentations of speech material. J Acoust Soc Am 44:1596–1607CrossRef
20.
Zurück zum Zitat Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1970) Speaker identification by speech spectrograms: a scientists’ view of its reliability for legal purposes. J Acoust Soc Am 47:597–612CrossRef Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1970) Speaker identification by speech spectrograms: a scientists’ view of its reliability for legal purposes. J Acoust Soc Am 47:597–612CrossRef
21.
Zurück zum Zitat Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1973) Speaker identification by speech spectrograpms: some further observations. J Acoust Soc Am 54:531–534CrossRef Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1973) Speaker identification by speech spectrograpms: some further observations. J Acoust Soc Am 54:531–534CrossRef
22.
Zurück zum Zitat Koenig BE (1986) Spectrographic voice identification: a forensic survey. J Acoust Soc Am 79:2088–2090CrossRef Koenig BE (1986) Spectrographic voice identification: a forensic survey. J Acoust Soc Am 79:2088–2090CrossRef
23.
Zurück zum Zitat Shipp T, Doherty TE, Hollien H (1987) Some fundamental considerations regarding voice identification. J Acoust Soc Am 82:687–688CrossRef Shipp T, Doherty TE, Hollien H (1987) Some fundamental considerations regarding voice identification. J Acoust Soc Am 82:687–688CrossRef
24.
Zurück zum Zitat Koenig BE, Ritenour DV Jr, Kohus BA, Kelly S (1987) Reply to ‘Some fundamental considerations regarding voice identification’. J Acoust Soc Am 82:688–689CrossRef Koenig BE, Ritenour DV Jr, Kohus BA, Kelly S (1987) Reply to ‘Some fundamental considerations regarding voice identification’. J Acoust Soc Am 82:688–689CrossRef
25.
Zurück zum Zitat Lindh J (2004) Handling the voiceprint issue. Proc Fonetik, pp 72–75 Lindh J (2004) Handling the voiceprint issue. Proc Fonetik, pp 72–75
26.
Zurück zum Zitat Poza FT, Begault DR (2005) Voice identification and elimination using sural-spectrographic protocols. Proc AES Int’l Conf, pp 1–8 Poza FT, Begault DR (2005) Voice identification and elimination using sural-spectrographic protocols. Proc AES Int’l Conf, pp 1–8
27.
Zurück zum Zitat McGehee F (1937) The reliability of the identification of the human voice. J Gen Psychol 17:249–271CrossRef McGehee F (1937) The reliability of the identification of the human voice. J Gen Psychol 17:249–271CrossRef
28.
Zurück zum Zitat McGehee F (1944) An experimental study of voice recognition. J Gen Psychol 31:53–65CrossRef McGehee F (1944) An experimental study of voice recognition. J Gen Psychol 31:53–65CrossRef
29.
Zurück zum Zitat Pollack I, Pickett JM, Sumby WH (1954) On the identification of speaker by voice. J Acoust Soc Am 26:403–406CrossRef Pollack I, Pickett JM, Sumby WH (1954) On the identification of speaker by voice. J Acoust Soc Am 26:403–406CrossRef
30.
Zurück zum Zitat Bricker P, Pruzansky S (1966) Effects of stimulus content and duration on talker identification. J Acoust Soc Am 40:1441–1450CrossRef Bricker P, Pruzansky S (1966) Effects of stimulus content and duration on talker identification. J Acoust Soc Am 40:1441–1450CrossRef
31.
Zurück zum Zitat Clifford BR (1980) Voice identification by human listeners: on earwitness reliability. Law Human Behav 4:373–394CrossRef Clifford BR (1980) Voice identification by human listeners: on earwitness reliability. Law Human Behav 4:373–394CrossRef
32.
Zurück zum Zitat Papcun G, Kreiman J, Davis A (1989) Long-term memory for unfamiliar voices. J Acoust Soc Am 85:913–925CrossRef Papcun G, Kreiman J, Davis A (1989) Long-term memory for unfamiliar voices. J Acoust Soc Am 85:913–925CrossRef
33.
Zurück zum Zitat Yarmey AD, Matthys E (1992) Voice identification of an abductor. Appl Cogn Psychol 6:367–377CrossRef Yarmey AD, Matthys E (1992) Voice identification of an abductor. Appl Cogn Psychol 6:367–377CrossRef
34.
Zurück zum Zitat Yarmey AD, Yarmey AL, Yarmey M, Parliament L (2001) Commonsense beliefs and the identification of familiar voices. Appl Cogn Psychol 15:283–299CrossRef Yarmey AD, Yarmey AL, Yarmey M, Parliament L (2001) Commonsense beliefs and the identification of familiar voices. Appl Cogn Psychol 15:283–299CrossRef
35.
Zurück zum Zitat O’Shaughnessy D (2001) Speech communication—human and machine, 2nd edn. Addison-Wesley Publishing Company, New York O’Shaughnessy D (2001) Speech communication—human and machine, 2nd edn. Addison-Wesley Publishing Company, New York
36.
Zurück zum Zitat Hollien H (2002) Forensic voice identification. Academic Press, San Diego Hollien H (2002) Forensic voice identification. Academic Press, San Diego
37.
Zurück zum Zitat Bonastre JF, Bimbot F, Boe LJ, Campbell JP, Reynolds DA, Magrin-Chagnolleau I (2003) Person authentication by voice: a need for caution. Proc Eurospeech, pp 1–4 Bonastre JF, Bimbot F, Boe LJ, Campbell JP, Reynolds DA, Magrin-Chagnolleau I (2003) Person authentication by voice: a need for caution. Proc Eurospeech, pp 1–4
38.
Zurück zum Zitat Denes PB, Pinson EN (1993) The speech chain, 2nd edn. Worth Publishers, New York Denes PB, Pinson EN (1993) The speech chain, 2nd edn. Worth Publishers, New York
39.
Zurück zum Zitat Kuenzel H (2000) Effects of voice disguise on speaking fundamental frequency. Forensic Ling 7:149–179CrossRef Kuenzel H (2000) Effects of voice disguise on speaking fundamental frequency. Forensic Ling 7:149–179CrossRef
40.
Zurück zum Zitat Zhang C, Tan T (2007) Voice disguise and automatic speaker recognition. Forensic Sci Int 175:118–122CrossRef Zhang C, Tan T (2007) Voice disguise and automatic speaker recognition. Forensic Sci Int 175:118–122CrossRef
41.
Zurück zum Zitat Reich AR, Duke JE (1979) Effects of selected vocal disguises upon speaker identification by listening. J Acoust Soc Am 66:1023–1028CrossRef Reich AR, Duke JE (1979) Effects of selected vocal disguises upon speaker identification by listening. J Acoust Soc Am 66:1023–1028CrossRef
42.
Zurück zum Zitat Orchard TL, Yarmey AD (1995) The effects of whispers, voice-sample duration, and voice distinctiveness on criminal speaker identification. Appl Cogn Psychol 9:249–260CrossRef Orchard TL, Yarmey AD (1995) The effects of whispers, voice-sample duration, and voice distinctiveness on criminal speaker identification. Appl Cogn Psychol 9:249–260CrossRef
43.
Zurück zum Zitat Sjoestroem M, Eriksson E, Zetterholm E, Sullivan KP (2006) A switch of dialect as disguise. Lund Univ. Linguistics and Phonetics Woking Papers, vol 52, pp 113–116 Sjoestroem M, Eriksson E, Zetterholm E, Sullivan KP (2006) A switch of dialect as disguise. Lund Univ. Linguistics and Phonetics Woking Papers, vol 52, pp 113–116
44.
Zurück zum Zitat Markham D (1999) Listeners and disguised voices: the imitation and perception of dialect accent. J Speech Lang Law 6:289–299 Markham D (1999) Listeners and disguised voices: the imitation and perception of dialect accent. J Speech Lang Law 6:289–299
45.
Zurück zum Zitat Amino K, Arai T (2009) Dialectal characteristics of Osaka and Tokyo Japanese: analyses of phonologically identical words. Proc Interspeech, pp 2303–2306 Amino K, Arai T (2009) Dialectal characteristics of Osaka and Tokyo Japanese: analyses of phonologically identical words. Proc Interspeech, pp 2303–2306
46.
Zurück zum Zitat House AS, Stevens KN (1993) Speech production: thirty years after. J Acoust Soc Am 94:1763CrossRef House AS, Stevens KN (1993) Speech production: thirty years after. J Acoust Soc Am 94:1763CrossRef
47.
Zurück zum Zitat Hollien H, Schwartz R (2000) Aural-perceptual speaker identification: problems with noncontemporary samples. Forensic Linguist 7:199–211CrossRef Hollien H, Schwartz R (2000) Aural-perceptual speaker identification: problems with noncontemporary samples. Forensic Linguist 7:199–211CrossRef
48.
Zurück zum Zitat Hollien H, Schwartz R (2001) Speaker identification utilizing noncontemporary speech. J Forensic Sci 46:63–67 Hollien H, Schwartz R (2001) Speaker identification utilizing noncontemporary speech. J Forensic Sci 46:63–67
49.
Zurück zum Zitat Amino K, Osanai T, Kamada T, Makinae H, Arai T (2011) Effects of the phonological contents and transmission channels on forensic speaker recognition. In: Neustein A, Patil HA (eds) Advances in forensic speaker recognition. Springer Amino K, Osanai T, Kamada T, Makinae H, Arai T (2011) Effects of the phonological contents and transmission channels on forensic speaker recognition. In: Neustein A, Patil HA (eds) Advances in forensic speaker recognition. Springer
50.
Zurück zum Zitat Kuenzel HJ (2001) Beware of the ’telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Liguist 8:80–99CrossRef Kuenzel HJ (2001) Beware of the ’telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Liguist 8:80–99CrossRef
51.
Zurück zum Zitat Byne C, Foulkes P (2004) The ‘mobile phone effect’ on vowel formants. J Speech Lang Law 11:1350–1771 Byne C, Foulkes P (2004) The ‘mobile phone effect’ on vowel formants. J Speech Lang Law 11:1350–1771
52.
Zurück zum Zitat Lawrence S, Nolan F, McDougall K (2008) Acoustic and perceptual effects of telephone transmission on vowel quality. J Speech Lang Law 15:161–192 Lawrence S, Nolan F, McDougall K (2008) Acoustic and perceptual effects of telephone transmission on vowel quality. J Speech Lang Law 15:161–192
53.
Zurück zum Zitat Titze I (1989) Physiologic and acoustic differences between male and female voices. J Acoust Soc Am 85:1699–1707CrossRef Titze I (1989) Physiologic and acoustic differences between male and female voices. J Acoust Soc Am 85:1699–1707CrossRef
54.
Zurück zum Zitat Kent RD, Read C (2001) Acoustic analysis of speech, 2nd edn. Cengage Learning Kent RD, Read C (2001) Acoustic analysis of speech, 2nd edn. Cengage Learning
55.
Zurück zum Zitat Clarke FR, Becker RW (1969) Comparison of techniques for discriminating among talkers. J Speech Hear Res 12:747–761 Clarke FR, Becker RW (1969) Comparison of techniques for discriminating among talkers. J Speech Hear Res 12:747–761
56.
Zurück zum Zitat Thompson CP (1987) A language effect in voice identification. Appl Cogn Psychol 1:121–131CrossRef Thompson CP (1987) A language effect in voice identification. Appl Cogn Psychol 1:121–131CrossRef
57.
Zurück zum Zitat Goggin J, Thompson CP, Strube G, Simental LR (1991) The role of language familiarity in voice identification. Mem Cognit 19:448–458CrossRef Goggin J, Thompson CP, Strube G, Simental LR (1991) The role of language familiarity in voice identification. Mem Cognit 19:448–458CrossRef
58.
Zurück zum Zitat Koester O, Schiller NO (1997) Different influences of the native language of a listener on speaker recognition. Forensic Linguist 4:18–28 Koester O, Schiller NO (1997) Different influences of the native language of a listener on speaker recognition. Forensic Linguist 4:18–28
59.
Zurück zum Zitat Philippon AC, Cherryman J, Bull R, Vrij A (2007) Earwitness identification performances: the effect of language, target, deliberate strategies and indirect measures. Appl Cogn Psychol 21:539–550CrossRef Philippon AC, Cherryman J, Bull R, Vrij A (2007) Earwitness identification performances: the effect of language, target, deliberate strategies and indirect measures. Appl Cogn Psychol 21:539–550CrossRef
60.
Zurück zum Zitat Hashimoto M, Kitagawa S, Higuchi N (1998) Quantitative analysis of acoustic features affecting speaker identification. J Acoust Soc Jpn 54:169–178 Hashimoto M, Kitagawa S, Higuchi N (1998) Quantitative analysis of acoustic features affecting speaker identification. J Acoust Soc Jpn 54:169–178
61.
Zurück zum Zitat Hollien H, Majewski W, Doherty TE (1982) Perceptual identification of voices under normal, stress, and disguise speaking conditions. J Phonetics 10:139–148 Hollien H, Majewski W, Doherty TE (1982) Perceptual identification of voices under normal, stress, and disguise speaking conditions. J Phonetics 10:139–148
62.
Zurück zum Zitat Ladefoged P, Ladefoged J (1980) The ability of listeners to identify voices. UCLA Working Papers Phon 49:43–89 Ladefoged P, Ladefoged J (1980) The ability of listeners to identify voices. UCLA Working Papers Phon 49:43–89
63.
Zurück zum Zitat Nygaard L (2005) Perceptual integration of linguistic and nonlinguistic properties of speech. In: Pisoni DB, Remez RE (eds) The handbook of speech perception. Blackwell, Oxford, pp 390–413 Nygaard L (2005) Perceptual integration of linguistic and nonlinguistic properties of speech. In: Pisoni DB, Remez RE (eds) The handbook of speech perception. Blackwell, Oxford, pp 390–413
64.
Zurück zum Zitat Roebuck R, Wilding J (1993) Effects of vowel variety and sample length on identification of a speaker in a line-up. Appl Cogn Psychol 7:475–481CrossRef Roebuck R, Wilding J (1993) Effects of vowel variety and sample length on identification of a speaker in a line-up. Appl Cogn Psychol 7:475–481CrossRef
65.
Zurück zum Zitat Cook S, Wilding J (1997) Earwitness testimony: never mind the variety, hear the length. Appl Cogn Psychol 11:95–111CrossRef Cook S, Wilding J (1997) Earwitness testimony: never mind the variety, hear the length. Appl Cogn Psychol 11:95–111CrossRef
66.
Zurück zum Zitat Loftus EF, Loftus GR, Messo J (1987) Some facts about weapon focus. Law Human Behav 11:55–62CrossRef Loftus EF, Loftus GR, Messo J (1987) Some facts about weapon focus. Law Human Behav 11:55–62CrossRef
67.
Zurück zum Zitat Loftus EF, Miller DG, Burns HJ (1978) Semantic integration of verbal information into a visual memory. J Exp Psychol Human Learn Mem 4:19–31CrossRef Loftus EF, Miller DG, Burns HJ (1978) Semantic integration of verbal information into a visual memory. J Exp Psychol Human Learn Mem 4:19–31CrossRef
68.
Zurück zum Zitat Schooler JW, Engstler-Schooler TY (1990) Verbal overshadowing of visual memories: some things are better left unsaid. Cogn Psychol 22:36–71CrossRef Schooler JW, Engstler-Schooler TY (1990) Verbal overshadowing of visual memories: some things are better left unsaid. Cogn Psychol 22:36–71CrossRef
69.
Zurück zum Zitat Chin JM, Schooler JW (2008) Why do words hurt? Content, process, and criterion shift accounts of verbal overshadowing. Eur J Cogn Psychol 20:396–413CrossRef Chin JM, Schooler JW (2008) Why do words hurt? Content, process, and criterion shift accounts of verbal overshadowing. Eur J Cogn Psychol 20:396–413CrossRef
70.
Zurück zum Zitat Kitagami S (2001) Disruptive effect of verbal encoding on memory and cognition of nonverbal information. Kyoto Univ Dept Edu Bull Paper 47:403–413 Kitagami S (2001) Disruptive effect of verbal encoding on memory and cognition of nonverbal information. Kyoto Univ Dept Edu Bull Paper 47:403–413
71.
Zurück zum Zitat Kasahara H, Ochi K (2008) Verbal overshadowing effect in earwitness perception. Proc Ann Conv Jpn Psychol Assoc 72:889 Kasahara H, Ochi K (2008) Verbal overshadowing effect in earwitness perception. Proc Ann Conv Jpn Psychol Assoc 72:889
72.
Zurück zum Zitat Cook S, Wilding J (2001) Earwitness testimony: effects of exposure and attention on the face overshadowing effect. Br J Psychol 92:617–629CrossRef Cook S, Wilding J (2001) Earwitness testimony: effects of exposure and attention on the face overshadowing effect. Br J Psychol 92:617–629CrossRef
73.
Zurück zum Zitat Kasahara H, Ochi K (2006) Effect of face presence on memory for a voice. J Jpn Acad Facial Studies 6:71–76 Kasahara H, Ochi K (2006) Effect of face presence on memory for a voice. J Jpn Acad Facial Studies 6:71–76
74.
Zurück zum Zitat Yarmey AD, Yarmey AL, Yarmey MJ (1994) Face and voice identifications in showups and lineups. Appl Cogn Psychol 8:453–464CrossRef Yarmey AD, Yarmey AL, Yarmey MJ (1994) Face and voice identifications in showups and lineups. Appl Cogn Psychol 8:453–464CrossRef
75.
Zurück zum Zitat Bull R, Clifford BR (1984) Earwitness voice recognition accuracy. In: Wells GL, Loftus EF (eds) Eyewitness testimony: psychological perspectives. Cambridge University Press, Cambridge, pp 92–123 Bull R, Clifford BR (1984) Earwitness voice recognition accuracy. In: Wells GL, Loftus EF (eds) Eyewitness testimony: psychological perspectives. Cambridge University Press, Cambridge, pp 92–123
76.
Zurück zum Zitat Kerstholt JH, Jansen N, Van Amelsvoort AG, Broeders AP (2004) Earwitnesses: effects of speech duration, retention, internal and acoustic environment. Appl Cogn Psychol 18:327–336CrossRef Kerstholt JH, Jansen N, Van Amelsvoort AG, Broeders AP (2004) Earwitnesses: effects of speech duration, retention, internal and acoustic environment. Appl Cogn Psychol 18:327–336CrossRef
77.
Zurück zum Zitat Van Wallendael LR, Surace A, Parsons DH, Brown M (1994) Earwitness’ voice recognition: factors affecting accuracy and impact on jurors. Appl Cogn Psychol 8:661–677CrossRef Van Wallendael LR, Surace A, Parsons DH, Brown M (1994) Earwitness’ voice recognition: factors affecting accuracy and impact on jurors. Appl Cogn Psychol 8:661–677CrossRef
78.
Zurück zum Zitat Pruzansky S (1963) Pattern-matching procedure for automatic talker recognition. J Acoust Soc Am 35:354–358CrossRef Pruzansky S (1963) Pattern-matching procedure for automatic talker recognition. J Acoust Soc Am 35:354–358CrossRef
79.
Zurück zum Zitat Li KP, Dammann JE, Chapman WD (1966) Experimental studies in speaker verification, using and adaptive system. J Acoust Soc Am 40:966–978CrossRef Li KP, Dammann JE, Chapman WD (1966) Experimental studies in speaker verification, using and adaptive system. J Acoust Soc Am 40:966–978CrossRef
80.
Zurück zum Zitat Glenn JW, Kleiner N (1967) Speaker identification based on nasal phonation. J Acoust Soc Am 43:368–372CrossRef Glenn JW, Kleiner N (1967) Speaker identification based on nasal phonation. J Acoust Soc Am 43:368–372CrossRef
81.
Zurück zum Zitat Furui S, Itakura F, Saito S (1972) Talker recognition by the longtime averaged speech spectrum. IEICE Trans 55-A(1):549–556 Furui S, Itakura F, Saito S (1972) Talker recognition by the longtime averaged speech spectrum. IEICE Trans 55-A(1):549–556
82.
Zurück zum Zitat Wolf JJ (1971) Efficient acoustic parameters for speaker recognition. J Acoust Soc Am 51:2044–2056CrossRef Wolf JJ (1971) Efficient acoustic parameters for speaker recognition. J Acoust Soc Am 51:2044–2056CrossRef
83.
Zurück zum Zitat Atal BS (1972) Automatic speaker recognition based on pitch contours. J Acoust Soc Am 52:1687–1697CrossRef Atal BS (1972) Automatic speaker recognition based on pitch contours. J Acoust Soc Am 52:1687–1697CrossRef
84.
Zurück zum Zitat Furui S, Itakura F (1973) Talker recognition by statistical features of speech sounds. Electron Commun Jap 56-A:62–71 Furui S, Itakura F (1973) Talker recognition by statistical features of speech sounds. Electron Commun Jap 56-A:62–71
85.
Zurück zum Zitat Atal BS (1974) Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J Acoust Soc Am 55:1304–1312CrossRef Atal BS (1974) Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J Acoust Soc Am 55:1304–1312CrossRef
86.
Zurück zum Zitat Sambur MR (1975) Selection of acoustic features for speaker identification. IEEE Trans Acoust Speech Sig Process 23:176–182CrossRef Sambur MR (1975) Selection of acoustic features for speaker identification. IEEE Trans Acoust Speech Sig Process 23:176–182CrossRef
87.
Zurück zum Zitat Hollien H, Majewski W (1977) Speaker identification by long-term spectra under normal and distorted speech conditions. J Acoust Soc Am 62:975–980CrossRef Hollien H, Majewski W (1977) Speaker identification by long-term spectra under normal and distorted speech conditions. J Acoust Soc Am 62:975–980CrossRef
88.
Zurück zum Zitat Matsumoto H, Nimura T (1978) Text-independent speaker identification based on piecewise canonical discriminant analysis. Proc Int Conf Acoust Speech Sig Process, 3:291–294 Matsumoto H, Nimura T (1978) Text-independent speaker identification based on piecewise canonical discriminant analysis. Proc Int Conf Acoust Speech Sig Process, 3:291–294
89.
Zurück zum Zitat Markel JD, Davis SB (1979) Text-independent speaker recognition from a large linguistically unconstrained time spaced data base. IEEE Trans Acoust Speech Sig Process 27:74–82CrossRef Markel JD, Davis SB (1979) Text-independent speaker recognition from a large linguistically unconstrained time spaced data base. IEEE Trans Acoust Speech Sig Process 27:74–82CrossRef
90.
Zurück zum Zitat Furui S (1981) Cepstral analysis technique for automatic speaker verification. IEEE Trans Acoust Speech Sig Process 29:254–272CrossRef Furui S (1981) Cepstral analysis technique for automatic speaker verification. IEEE Trans Acoust Speech Sig Process 29:254–272CrossRef
91.
Zurück zum Zitat Li KP, Wrench EH (1983) Text-independent speaker recognition with short utterances. Proc Int Conf Acoust Speech Sig Process, 8:555–558 Li KP, Wrench EH (1983) Text-independent speaker recognition with short utterances. Proc Int Conf Acoust Speech Sig Process, 8:555–558
92.
Zurück zum Zitat Soong F, Rosenberg A, Rabiner L, Juang BH (1985) A vector quantization approach to speaker recognition. Proc Int Conf Acoust Speech Sig Process, 387–390 Soong F, Rosenberg A, Rabiner L, Juang BH (1985) A vector quantization approach to speaker recognition. Proc Int Conf Acoust Speech Sig Process, 387–390
93.
Zurück zum Zitat Rosenberg A, Soong F (1986) Evaluation of a vector quantisation talker recognition system in text independent and text dependent modes. Proc Int Conf Acoust Speech Sig Process, 11:873–876 Rosenberg A, Soong F (1986) Evaluation of a vector quantisation talker recognition system in text independent and text dependent modes. Proc Int Conf Acoust Speech Sig Process, 11:873–876
94.
Zurück zum Zitat Shirai K, Mano K, Ishige D (1987) Speaker identification based on frequency distribution of vector-quantised spectra. IEICE Trans 70-D:1181–1188 Shirai K, Mano K, Ishige D (1987) Speaker identification based on frequency distribution of vector-quantised spectra. IEICE Trans 70-D:1181–1188
95.
Zurück zum Zitat Rosenberg A, Lee CH, Soong F (1990) Sub-word unit talker verification using Hidden Markov Models. Proc Int Conf Acoust Speech Sig Process, 1:269–272 Rosenberg A, Lee CH, Soong F (1990) Sub-word unit talker verification using Hidden Markov Models. Proc Int Conf Acoust Speech Sig Process, 1:269–272
96.
Zurück zum Zitat Higgins A, Bahler L, Porter J (1991) Speaker verification using randomized phrase prompting. Digit Signal Process 1:89–106 Higgins A, Bahler L, Porter J (1991) Speaker verification using randomized phrase prompting. Digit Signal Process 1:89–106
97.
Zurück zum Zitat Tishby NZ (1991) On the application of mixture AR Hidden Markov Models to text-independent speaker recognition. IEEE Trans Acoust Speech Sig Process 39:563–570 Tishby NZ (1991) On the application of mixture AR Hidden Markov Models to text-independent speaker recognition. IEEE Trans Acoust Speech Sig Process 39:563–570
98.
Zurück zum Zitat Reynolds AD, Carlson B (1995) Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers. Proc Eurospeech, pp 647–650 Reynolds AD, Carlson B (1995) Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers. Proc Eurospeech, pp 647–650
99.
Zurück zum Zitat Reynolds AD, Rose R (1995) Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans Speech Audi Process 3:72–83CrossRef Reynolds AD, Rose R (1995) Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans Speech Audi Process 3:72–83CrossRef
100.
Zurück zum Zitat Che C, Lin Q (1995) Speaker recognition using HMM with experiments on the YOHO database. Proc Eurospeech, pp 625–628 Che C, Lin Q (1995) Speaker recognition using HMM with experiments on the YOHO database. Proc Eurospeech, pp 625–628
101.
Zurück zum Zitat NIST webpage. http://www.nist.gov/index.html NIST webpage. http://​www.​nist.​gov/​index.​html
102.
NIST-SRE. http://www.itl.nist.gov/iad/mig//tests/sre/
103.
Doddington GR, Przybocki MA, Martin AF, Reynolds DA (2000) The NIST speaker recognition evaluation: overview, methodology, systems, results, perspective. Speech Commun 31:225–254
104.
Nakasone H, Beck SD (2001) Forensic automatic speaker recognition. Proc A Speaker Odyssey: the speaker recognition workshop, pp 139–142
105.
Drygajlo A (2007) Forensic automatic speaker recognition. IEEE Signal Process Mag 24:132–135
106.
Martin A, Doddington G, Kamm T, Ordowski M, Przybocki M (1997) The DET curve in assessment of detection task performance. Proc Eurospeech, pp 1895–1898
107.
Bimbot F, Bonastre JF, Fredouille C, Gravier G, Magrin-Chagnolleau I, Meignier S, Merlin T, Ortega-Garcia J, Petrovska-Delacretaz D, Reynolds DA (2004) A tutorial on text-independent speaker verification. EURASIP J Appl Signal Process 4:430–451
108.
Noda H, Darada K, Kawaguchi E, Sawai H (1998) A context-dependent approach for speaker verification using sequential decision. Proc Int Conf Spoken Lang Process
109.
Ortega-Garcia J, Cruz-Llanas S, Gonzalez-Rodriguez J (1998) Quantitative influence of speech variability factors for automatic speaker verification in forensic tasks. Proc Int Conf Spoken Lang Process
110.
Gonzalez-Rodriguez J, Ortega-Garcia J, Lucena-Molina JJ (2001) On the application of the Bayesian approach to real forensic conditions with GMM-based systems. Proc A Speaker Odyssey: the speaker recognition workshop, pp 135–138
111.
Meuwly D, Drygajlo A (2001) Forensic speaker recognition based on a Bayesian framework and Gaussian Mixture Modelling (GMM). Proc A Speaker Odyssey: the speaker recognition workshop, pp 145–150
112.
Alexander A, Botti F, Drygajlo A (2004) Handling mismatch in corpus-based forensic speaker recognition. Proc Odyssey04 the speaker and language recognition workshop, pp 69–74
113.
Ramos D, Gonzalez-Rodriguez J, Gonzalez-Dominguez J, Lucena-Molina JJ (2008) Addressing database mismatch in forensic speaker recognition with Ahumada III: A public real-casework database in Spanish. Proc Interspeech, pp 1493–1496
114.
Thiruvaran T, Ambikairajah E, Epps J (2008) FM features for automatic forensic speaker recognition. Proc Interspeech, pp 1497–1500
115.
Becker T, Jessen M, Grigoras C (2008) Forensic speaker verification using formant features and Gaussian Mixture Models. Proc Interspeech, pp 1505–1508
116.
Becker T, Jessen M, Alsbach S, Bross F, Meier T (2010) SPES: The BKA forensic automatic voice comparison system. Proc Odyssey: the Speaker and Language Recognition Workshop, pp 58–62
117.
Hermansky H (1989) Perceptual linear predictive (PLP) analysis of speech. J Acoust Soc Am 87:1738–1752
118.
Paul JE, Rabinowitz AS, Riganati JP, Richardson JM (1975) Semi-automatic speaker identification system (SASIS): analytical studies. Final Report C74–11841501, Rockwell International
119.
Bunge E (1977) Speaker recognition by computer. Philips Tech. Review 37(8):207–219
120.
Nakasone H, Melvin C (1989) C.A.V.I.S.: (Computer assisted voice identification system). Final Report 85-IJ-CX-0024. National Institute of Justice
121.
Falcone M, De Sairo N (1994) A PC speaker identification system for forensic use: IDEM. Proc ESCA workshop on automatic speaker recognition, identification and verification, pp 169–172
122.
Gonzalez-Rodriguez J, Ortega-Garcia J, Lucena-Molina JJ (2001) IdentiVox: a PC-Windows tool for text-independent speaker recognition in forensic environments. Prob Forensic Sci 47:246–253
123.
Drygajlo A, Meuwly D, Alexander A (2003) Statistical methods and Bayesian interpretation of evidence in forensic automatic speaker recognition. Proc Eurospeech, pp 689–692
124.
Agnitio, Sociedad Limitada. http://www.agnitio.es/index.php
125.
Morrison GS (2009) Forensic voice comparison and the paradigm shift. Sci Justice 49:298–308
Metadata
Title
Historical and Procedural Overview of Forensic Speaker Recognition as a Science
Authors
Kanae Amino, Ph.D.
Takashi Osanai, Ph.D.
Toshiaki Kamada, B.E.
Hisanori Makinae, Ph.D.
Takayuki Arai, Ph.D.
Copyright year
2012
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-0263-3_1
