2012 | OriginalPaper | Chapter

1. Historical and Procedural Overview of Forensic Speaker Recognition as a Science

Authors: Kanae Amino, Ph.D., Takashi Osanai, Ph.D., Toshiaki Kamada, B.E., Hisanori Makinae, Ph.D., Takayuki Arai, Ph.D.

Published in: Forensic Speaker Recognition

Publisher: Springer New York

Abstract

Forensic phonetics and acoustics are now widely used in police and legal work involving acoustic samples. Among the many tasks in this area, forensic speaker recognition is considered one of the most complex. Forensic speaker recognition, sometimes called forensic speaker comparison, is the process of judging whether or not two speech samples come from the same speaker. This chapter introduces the historical background of forensic speaker recognition, including the “voiceprint” controversy, human-based visual and auditory forensic speaker recognition, and automatic forensic speaker recognition. It also presents procedural considerations in forensic speaker recognition and the factors that affect recognition performance. Finally, it summarizes the progress and developments made in forensic automatic speaker recognition.

Literature
1.
go back to reference Nolan F (1983) The phonetic basis of speaker recognition. Cambridge studies in speech science and communiation. Cambridge University Press, Cambridge Nolan F (1983) The phonetic basis of speaker recognition. Cambridge studies in speech science and communiation. Cambridge University Press, Cambridge
2.
go back to reference Schmidt-Nielsen A, Stern KR (1985) Identification of known voices as a function of familiarity and narrow-band coding. J Acoust Soc Am 77:658–663CrossRef Schmidt-Nielsen A, Stern KR (1985) Identification of known voices as a function of familiarity and narrow-band coding. J Acoust Soc Am 77:658–663CrossRef
3.
go back to reference Van Lacker D, Kreiman J, Emmorey K (1985) Familiar voice recognition: patterns and parameters part 1: recognition of backward voices. J Phonetics 13:19–38 Van Lacker D, Kreiman J, Emmorey K (1985) Familiar voice recognition: patterns and parameters part 1: recognition of backward voices. J Phonetics 13:19–38
4.
go back to reference Van Lacker D, Kreiman J (1985) Familiar voice recognition: patterns and parameters part 2: recognition of rate-altered voices. J Phonetics 13:39–52 Van Lacker D, Kreiman J (1985) Familiar voice recognition: patterns and parameters part 2: recognition of rate-altered voices. J Phonetics 13:39–52
5.
go back to reference Cheney D, Seyfarth R (1980) Vocal recognition in free-ranging vervet monkeys. Anim Behav 28:362–367CrossRef Cheney D, Seyfarth R (1980) Vocal recognition in free-ranging vervet monkeys. Anim Behav 28:362–367CrossRef
6.
go back to reference Rendall D, Rodman PS, Emond RE (1996) Vocal recognition of individuals and kin in free-ranging rhesus monkeys. Anim Behav 51:1007–1015CrossRef Rendall D, Rodman PS, Emond RE (1996) Vocal recognition of individuals and kin in free-ranging rhesus monkeys. Anim Behav 51:1007–1015CrossRef
7.
go back to reference Sugiura H (2001) Vocal exchange of coo calls in Japanese macaques. In: Matsuzawa T (ed) Primate origins of human cognition and behaviour. Springer, Tokyo, pp 135–154 Sugiura H (2001) Vocal exchange of coo calls in Japanese macaques. In: Matsuzawa T (ed) Primate origins of human cognition and behaviour. Springer, Tokyo, pp 135–154
8.
go back to reference Bricker P, Pruzansky S (1976) Speaker recognition. In: Lass N (ed) Contemporary issues in experimental phonetics. Academic Press, New York, pp 295–326 Bricker P, Pruzansky S (1976) Speaker recognition. In: Lass N (ed) Contemporary issues in experimental phonetics. Academic Press, New York, pp 295–326
9.
go back to reference Furui S (1992) Acoustic and speech engineering (onkyo, onsei kougaku). Kindai Kagakusha Publishing Company, Tokyo Furui S (1992) Acoustic and speech engineering (onkyo, onsei kougaku). Kindai Kagakusha Publishing Company, Tokyo
10.
go back to reference National Research Council (1979) On the theory and practice of voice identification. National Academy of Science, Washington, pp 3–13 National Research Council (1979) On the theory and practice of voice identification. National Academy of Science, Washington, pp 3–13
11.
go back to reference Steinberg JC (1934) Application of sound measuring instruments to the study of phonetic problems. J Acoust Soc Am 6:16–24CrossRef Steinberg JC (1934) Application of sound measuring instruments to the study of phonetic problems. J Acoust Soc Am 6:16–24CrossRef
12.
13.
go back to reference Grey CHG, Kopp GA (1944) Voiceprint identification. Bell Telephone Laboratory Annual Report, New York, pp 1–14 Grey CHG, Kopp GA (1944) Voiceprint identification. Bell Telephone Laboratory Annual Report, New York, pp 1–14
14.
go back to reference Tosi O, Oyer H, Lashbrook W, Pedrey C, Nicol J, Nash E (1972) Experiment on voice identification. J Acoust Soc Am 51:2030–2043CrossRef Tosi O, Oyer H, Lashbrook W, Pedrey C, Nicol J, Nash E (1972) Experiment on voice identification. J Acoust Soc Am 51:2030–2043CrossRef
15.
16.
go back to reference Campbell JP, Shen W, Campbell WM, Schwartz R, Bonastre JF, Matrouf D (2009) Forensic speaker recognition. IEEE Signal Process Mag 26:95–103CrossRef Campbell JP, Shen W, Campbell WM, Schwartz R, Bonastre JF, Matrouf D (2009) Forensic speaker recognition. IEEE Signal Process Mag 26:95–103CrossRef
17.
go back to reference Young MA, Campbell RA (1967) Effects of context on talker identification. J Acoust Soc Am 42:1250–1254CrossRef Young MA, Campbell RA (1967) Effects of context on talker identification. J Acoust Soc Am 42:1250–1254CrossRef
18.
go back to reference Tosi O (1968) Speaker identification through acoustic spectrography. Proc Logoped Phoniatr, pp 138–145 Tosi O (1968) Speaker identification through acoustic spectrography. Proc Logoped Phoniatr, pp 138–145
19.
go back to reference Stevens KN, Williams CE, Carbonell JR, Woods B (1968) Speaker authentication and identification: a comparison of spectrographic and auditory presentations of speech material. J Acoust Soc Am 44:1596–1607CrossRef Stevens KN, Williams CE, Carbonell JR, Woods B (1968) Speaker authentication and identification: a comparison of spectrographic and auditory presentations of speech material. J Acoust Soc Am 44:1596–1607CrossRef
20.
go back to reference Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1970) Speaker identification by speech spectrograms: a scientists’ view of its reliability for legal purposes. J Acoust Soc Am 47:597–612CrossRef Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1970) Speaker identification by speech spectrograms: a scientists’ view of its reliability for legal purposes. J Acoust Soc Am 47:597–612CrossRef
21.
go back to reference Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1973) Speaker identification by speech spectrograpms: some further observations. J Acoust Soc Am 54:531–534CrossRef Bolt RH, Cooper FS, David EE Jr, Denes PB, Pickett JM, Stevens KN (1973) Speaker identification by speech spectrograpms: some further observations. J Acoust Soc Am 54:531–534CrossRef
22.
go back to reference Koenig BE (1986) Spectrographic voice identification: a forensic survey. J Acoust Soc Am 79:2088–2090CrossRef Koenig BE (1986) Spectrographic voice identification: a forensic survey. J Acoust Soc Am 79:2088–2090CrossRef
23.
go back to reference Shipp T, Doherty TE, Hollien H (1987) Some fundamental considerations regarding voice identification. J Acoust Soc Am 82:687–688CrossRef Shipp T, Doherty TE, Hollien H (1987) Some fundamental considerations regarding voice identification. J Acoust Soc Am 82:687–688CrossRef
24.
go back to reference Koenig BE, Ritenour DV Jr, Kohus BA, Kelly S (1987) Reply to ‘Some fundamental considerations regarding voice identification’. J Acoust Soc Am 82:688–689CrossRef Koenig BE, Ritenour DV Jr, Kohus BA, Kelly S (1987) Reply to ‘Some fundamental considerations regarding voice identification’. J Acoust Soc Am 82:688–689CrossRef
25.
go back to reference Lindh J (2004) Handling the voiceprint issue. Proc Fonetik, pp 72–75 Lindh J (2004) Handling the voiceprint issue. Proc Fonetik, pp 72–75
26.
go back to reference Poza FT, Begault DR (2005) Voice identification and elimination using sural-spectrographic protocols. Proc AES Int’l Conf, pp 1–8 Poza FT, Begault DR (2005) Voice identification and elimination using sural-spectrographic protocols. Proc AES Int’l Conf, pp 1–8
27.
go back to reference McGehee F (1937) The reliability of the identification of the human voice. J Gen Psychol 17:249–271CrossRef McGehee F (1937) The reliability of the identification of the human voice. J Gen Psychol 17:249–271CrossRef
28.
go back to reference McGehee F (1944) An experimental study of voice recognition. J Gen Psychol 31:53–65CrossRef McGehee F (1944) An experimental study of voice recognition. J Gen Psychol 31:53–65CrossRef
29.
go back to reference Pollack I, Pickett JM, Sumby WH (1954) On the identification of speaker by voice. J Acoust Soc Am 26:403–406CrossRef Pollack I, Pickett JM, Sumby WH (1954) On the identification of speaker by voice. J Acoust Soc Am 26:403–406CrossRef
30.
go back to reference Bricker P, Pruzansky S (1966) Effects of stimulus content and duration on talker identification. J Acoust Soc Am 40:1441–1450CrossRef Bricker P, Pruzansky S (1966) Effects of stimulus content and duration on talker identification. J Acoust Soc Am 40:1441–1450CrossRef
31.
go back to reference Clifford BR (1980) Voice identification by human listeners: on earwitness reliability. Law Human Behav 4:373–394CrossRef Clifford BR (1980) Voice identification by human listeners: on earwitness reliability. Law Human Behav 4:373–394CrossRef
32.
go back to reference Papcun G, Kreiman J, Davis A (1989) Long-term memory for unfamiliar voices. J Acoust Soc Am 85:913–925CrossRef Papcun G, Kreiman J, Davis A (1989) Long-term memory for unfamiliar voices. J Acoust Soc Am 85:913–925CrossRef
33.
go back to reference Yarmey AD, Matthys E (1992) Voice identification of an abductor. Appl Cogn Psychol 6:367–377CrossRef Yarmey AD, Matthys E (1992) Voice identification of an abductor. Appl Cogn Psychol 6:367–377CrossRef
34.
go back to reference Yarmey AD, Yarmey AL, Yarmey M, Parliament L (2001) Commonsense beliefs and the identification of familiar voices. Appl Cogn Psychol 15:283–299CrossRef Yarmey AD, Yarmey AL, Yarmey M, Parliament L (2001) Commonsense beliefs and the identification of familiar voices. Appl Cogn Psychol 15:283–299CrossRef
35.
go back to reference O’Shaughnessy D (2001) Speech communication—human and machine, 2nd edn. Addison-Wesley Publishing Company, New York O’Shaughnessy D (2001) Speech communication—human and machine, 2nd edn. Addison-Wesley Publishing Company, New York
36.
go back to reference Hollien H (2002) Forensic voice identification. Academic Press, San Diego Hollien H (2002) Forensic voice identification. Academic Press, San Diego
37.
go back to reference Bonastre JF, Bimbot F, Boe LJ, Campbell JP, Reynolds DA, Magrin-Chagnolleau I (2003) Person authentication by voice: a need for caution. Proc Eurospeech, pp 1–4 Bonastre JF, Bimbot F, Boe LJ, Campbell JP, Reynolds DA, Magrin-Chagnolleau I (2003) Person authentication by voice: a need for caution. Proc Eurospeech, pp 1–4
38.
go back to reference Denes PB, Pinson EN (1993) The speech chain, 2nd edn. Worth Publishers, New York Denes PB, Pinson EN (1993) The speech chain, 2nd edn. Worth Publishers, New York
39.
go back to reference Kuenzel H (2000) Effects of voice disguise on speaking fundamental frequency. Forensic Ling 7:149–179CrossRef Kuenzel H (2000) Effects of voice disguise on speaking fundamental frequency. Forensic Ling 7:149–179CrossRef
40.
go back to reference Zhang C, Tan T (2007) Voice disguise and automatic speaker recognition. Forensic Sci Int 175:118–122CrossRef Zhang C, Tan T (2007) Voice disguise and automatic speaker recognition. Forensic Sci Int 175:118–122CrossRef
41.
go back to reference Reich AR, Duke JE (1979) Effects of selected vocal disguises upon speaker identification by listening. J Acoust Soc Am 66:1023–1028CrossRef Reich AR, Duke JE (1979) Effects of selected vocal disguises upon speaker identification by listening. J Acoust Soc Am 66:1023–1028CrossRef
42.
go back to reference Orchard TL, Yarmey AD (1995) The effects of whispers, voice-sample duration, and voice distinctiveness on criminal speaker identification. Appl Cogn Psychol 9:249–260CrossRef Orchard TL, Yarmey AD (1995) The effects of whispers, voice-sample duration, and voice distinctiveness on criminal speaker identification. Appl Cogn Psychol 9:249–260CrossRef
43.
go back to reference Sjoestroem M, Eriksson E, Zetterholm E, Sullivan KP (2006) A switch of dialect as disguise. Lund Univ. Linguistics and Phonetics Woking Papers, vol 52, pp 113–116 Sjoestroem M, Eriksson E, Zetterholm E, Sullivan KP (2006) A switch of dialect as disguise. Lund Univ. Linguistics and Phonetics Woking Papers, vol 52, pp 113–116
44.
go back to reference Markham D (1999) Listeners and disguised voices: the imitation and perception of dialect accent. J Speech Lang Law 6:289–299 Markham D (1999) Listeners and disguised voices: the imitation and perception of dialect accent. J Speech Lang Law 6:289–299
45.
go back to reference Amino K, Arai T (2009) Dialectal characteristics of Osaka and Tokyo Japanese: analyses of phonologically identical words. Proc Interspeech, pp 2303–2306 Amino K, Arai T (2009) Dialectal characteristics of Osaka and Tokyo Japanese: analyses of phonologically identical words. Proc Interspeech, pp 2303–2306
46.
go back to reference House AS, Stevens KN (1993) Speech production: thirty years after. J Acoust Soc Am 94:1763CrossRef House AS, Stevens KN (1993) Speech production: thirty years after. J Acoust Soc Am 94:1763CrossRef
47.
go back to reference Hollien H, Schwartz R (2000) Aural-perceptual speaker identification: problems with noncontemporary samples. Forensic Linguist 7:199–211CrossRef Hollien H, Schwartz R (2000) Aural-perceptual speaker identification: problems with noncontemporary samples. Forensic Linguist 7:199–211CrossRef
48.
go back to reference Hollien H, Schwartz R (2001) Speaker identification utilizing noncontemporary speech. J Forensic Sci 46:63–67 Hollien H, Schwartz R (2001) Speaker identification utilizing noncontemporary speech. J Forensic Sci 46:63–67
49.
go back to reference Amino K, Osanai T, Kamada T, Makinae H, Arai T (2011) Effects of the phonological contents and transmission channels on forensic speaker recognition. In: Neustein A, Patil HA (eds) Advances in forensic speaker recognition. Springer Amino K, Osanai T, Kamada T, Makinae H, Arai T (2011) Effects of the phonological contents and transmission channels on forensic speaker recognition. In: Neustein A, Patil HA (eds) Advances in forensic speaker recognition. Springer
50.
go back to reference Kuenzel HJ (2001) Beware of the ’telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Liguist 8:80–99CrossRef Kuenzel HJ (2001) Beware of the ’telephone effect’: the influence of telephone transmission on the measurement of formant frequencies. Forensic Liguist 8:80–99CrossRef
51.
go back to reference Byne C, Foulkes P (2004) The ‘mobile phone effect’ on vowel formants. J Speech Lang Law 11:1350–1771 Byne C, Foulkes P (2004) The ‘mobile phone effect’ on vowel formants. J Speech Lang Law 11:1350–1771
52.
go back to reference Lawrence S, Nolan F, McDougall K (2008) Acoustic and perceptual effects of telephone transmission on vowel quality. J Speech Lang Law 15:161–192 Lawrence S, Nolan F, McDougall K (2008) Acoustic and perceptual effects of telephone transmission on vowel quality. J Speech Lang Law 15:161–192
53.
go back to reference Titze I (1989) Physiologic and acoustic differences between male and female voices. J Acoust Soc Am 85:1699–1707CrossRef Titze I (1989) Physiologic and acoustic differences between male and female voices. J Acoust Soc Am 85:1699–1707CrossRef
54.
go back to reference Kent RD, Read C (2001) Acoustic analysis of speech, 2nd edn. Cengage Learning Kent RD, Read C (2001) Acoustic analysis of speech, 2nd edn. Cengage Learning
55.
go back to reference Clarke FR, Becker RW (1969) Comparison of techniques for discriminating among talkers. J Speech Hear Res 12:747–761 Clarke FR, Becker RW (1969) Comparison of techniques for discriminating among talkers. J Speech Hear Res 12:747–761
56.
go back to reference Thompson CP (1987) A language effect in voice identification. Appl Cogn Psychol 1:121–131CrossRef Thompson CP (1987) A language effect in voice identification. Appl Cogn Psychol 1:121–131CrossRef
57.
go back to reference Goggin J, Thompson CP, Strube G, Simental LR (1991) The role of language familiarity in voice identification. Mem Cognit 19:448–458CrossRef Goggin J, Thompson CP, Strube G, Simental LR (1991) The role of language familiarity in voice identification. Mem Cognit 19:448–458CrossRef
58.
go back to reference Koester O, Schiller NO (1997) Different influences of the native language of a listener on speaker recognition. Forensic Linguist 4:18–28 Koester O, Schiller NO (1997) Different influences of the native language of a listener on speaker recognition. Forensic Linguist 4:18–28
59.
go back to reference Philippon AC, Cherryman J, Bull R, Vrij A (2007) Earwitness identification performances: the effect of language, target, deliberate strategies and indirect measures. Appl Cogn Psychol 21:539–550CrossRef Philippon AC, Cherryman J, Bull R, Vrij A (2007) Earwitness identification performances: the effect of language, target, deliberate strategies and indirect measures. Appl Cogn Psychol 21:539–550CrossRef
60.
go back to reference Hashimoto M, Kitagawa S, Higuchi N (1998) Quantitative analysis of acoustic features affecting speaker identification. J Acoust Soc Jpn 54:169–178 Hashimoto M, Kitagawa S, Higuchi N (1998) Quantitative analysis of acoustic features affecting speaker identification. J Acoust Soc Jpn 54:169–178
61.
go back to reference Hollien H, Majewski W, Doherty TE (1982) Perceptual identification of voices under normal, stress, and disguise speaking conditions. J Phonetics 10:139–148 Hollien H, Majewski W, Doherty TE (1982) Perceptual identification of voices under normal, stress, and disguise speaking conditions. J Phonetics 10:139–148
62.
go back to reference Ladefoged P, Ladefoged J (1980) The ability of listeners to identify voices. UCLA Working Papers Phon 49:43–89 Ladefoged P, Ladefoged J (1980) The ability of listeners to identify voices. UCLA Working Papers Phon 49:43–89
63.
go back to reference Nygaard L (2005) Perceptual integration of linguistic and nonlinguistic properties of speech. In: Pisoni DB, Remez RE (eds) The handbook of speech perception. Blackwell, Oxford, pp 390–413 Nygaard L (2005) Perceptual integration of linguistic and nonlinguistic properties of speech. In: Pisoni DB, Remez RE (eds) The handbook of speech perception. Blackwell, Oxford, pp 390–413
64.
go back to reference Roebuck R, Wilding J (1993) Effects of vowel variety and sample length on identification of a speaker in a line-up. Appl Cogn Psychol 7:475–481CrossRef Roebuck R, Wilding J (1993) Effects of vowel variety and sample length on identification of a speaker in a line-up. Appl Cogn Psychol 7:475–481CrossRef
65.
go back to reference Cook S, Wilding J (1997) Earwitness testimony: never mind the variety, hear the length. Appl Cogn Psychol 11:95–111CrossRef Cook S, Wilding J (1997) Earwitness testimony: never mind the variety, hear the length. Appl Cogn Psychol 11:95–111CrossRef
66.
go back to reference Loftus EF, Loftus GR, Messo J (1987) Some facts about weapon focus. Law Human Behav 11:55–62CrossRef Loftus EF, Loftus GR, Messo J (1987) Some facts about weapon focus. Law Human Behav 11:55–62CrossRef
67.
go back to reference Loftus EF, Miller DG, Burns HJ (1978) Semantic integration of verbal information into a visual memory. J Exp Psychol Human Learn Mem 4:19–31CrossRef Loftus EF, Miller DG, Burns HJ (1978) Semantic integration of verbal information into a visual memory. J Exp Psychol Human Learn Mem 4:19–31CrossRef
68.
go back to reference Schooler JW, Engstler-Schooler TY (1990) Verbal overshadowing of visual memories: some things are better left unsaid. Cogn Psychol 22:36–71CrossRef Schooler JW, Engstler-Schooler TY (1990) Verbal overshadowing of visual memories: some things are better left unsaid. Cogn Psychol 22:36–71CrossRef
69.
go back to reference Chin JM, Schooler JW (2008) Why do words hurt? Content, process, and criterion shift accounts of verbal overshadowing. Eur J Cogn Psychol 20:396–413CrossRef Chin JM, Schooler JW (2008) Why do words hurt? Content, process, and criterion shift accounts of verbal overshadowing. Eur J Cogn Psychol 20:396–413CrossRef
70.
go back to reference Kitagami S (2001) Disruptive effect of verbal encoding on memory and cognition of nonverbal information. Kyoto Univ Dept Edu Bull Paper 47:403–413 Kitagami S (2001) Disruptive effect of verbal encoding on memory and cognition of nonverbal information. Kyoto Univ Dept Edu Bull Paper 47:403–413
71.
go back to reference Kasahara H, Ochi K (2008) Verbal overshadowing effect in earwitness perception. Proc Ann Conv Jpn Psychol Assoc 72:889 Kasahara H, Ochi K (2008) Verbal overshadowing effect in earwitness perception. Proc Ann Conv Jpn Psychol Assoc 72:889
72.
go back to reference Cook S, Wilding J (2001) Earwitness testimony: effects of exposure and attention on the face overshadowing effect. Br J Psychol 92:617–629CrossRef Cook S, Wilding J (2001) Earwitness testimony: effects of exposure and attention on the face overshadowing effect. Br J Psychol 92:617–629CrossRef
73.
go back to reference Kasahara H, Ochi K (2006) Effect of face presence on memory for a voice. J Jpn Acad Facial Studies 6:71–76 Kasahara H, Ochi K (2006) Effect of face presence on memory for a voice. J Jpn Acad Facial Studies 6:71–76
74.
go back to reference Yarmey AD, Yarmey AL, Yarmey MJ (1994) Face and voice identifications in showups and lineups. Appl Cogn Psychol 8:453–464CrossRef Yarmey AD, Yarmey AL, Yarmey MJ (1994) Face and voice identifications in showups and lineups. Appl Cogn Psychol 8:453–464CrossRef
75.
go back to reference Bull R, Clifford BR (1984) Earwitness voice recognition accuracy. In: Wells GL, Loftus EF (eds) Eyewitness testimony: psychological perspectives. Cambridge University Press, Cambridge, pp 92–123 Bull R, Clifford BR (1984) Earwitness voice recognition accuracy. In: Wells GL, Loftus EF (eds) Eyewitness testimony: psychological perspectives. Cambridge University Press, Cambridge, pp 92–123
76.
go back to reference Kerstholt JH, Jansen N, Van Amelsvoort AG, Broeders AP (2004) Earwitnesses: effects of speech duration, retention, internal and acoustic environment. Appl Cogn Psychol 18:327–336CrossRef Kerstholt JH, Jansen N, Van Amelsvoort AG, Broeders AP (2004) Earwitnesses: effects of speech duration, retention, internal and acoustic environment. Appl Cogn Psychol 18:327–336CrossRef
77.
go back to reference Van Wallendael LR, Surace A, Parsons DH, Brown M (1994) Earwitness’ voice recognition: factors affecting accuracy and impact on jurors. Appl Cogn Psychol 8:661–677CrossRef Van Wallendael LR, Surace A, Parsons DH, Brown M (1994) Earwitness’ voice recognition: factors affecting accuracy and impact on jurors. Appl Cogn Psychol 8:661–677CrossRef
78.
go back to reference Pruzansky S (1963) Pattern-matching procedure for automatic talker recognition. J Acoust Soc Am 35:354–358CrossRef Pruzansky S (1963) Pattern-matching procedure for automatic talker recognition. J Acoust Soc Am 35:354–358CrossRef
79.
go back to reference Li KP, Dammann JE, Chapman WD (1966) Experimental studies in speaker verification, using and adaptive system. J Acoust Soc Am 40:966–978CrossRef Li KP, Dammann JE, Chapman WD (1966) Experimental studies in speaker verification, using and adaptive system. J Acoust Soc Am 40:966–978CrossRef
80.
go back to reference Glenn JW, Kleiner N (1967) Speaker identification based on nasal phonation. J Acoust Soc Am 43:368–372CrossRef Glenn JW, Kleiner N (1967) Speaker identification based on nasal phonation. J Acoust Soc Am 43:368–372CrossRef
81.
go back to reference Furui S, Itakura F, Saito S (1972) Talker recognition by the longtime averaged speech spectrum. IEICE Trans 55-A(1):549–556 Furui S, Itakura F, Saito S (1972) Talker recognition by the longtime averaged speech spectrum. IEICE Trans 55-A(1):549–556
82.
go back to reference Wolf JJ (1971) Efficient acoustic parameters for speaker recognition. J Acoust Soc Am 51:2044–2056CrossRef Wolf JJ (1971) Efficient acoustic parameters for speaker recognition. J Acoust Soc Am 51:2044–2056CrossRef
83.
go back to reference Atal BS (1972) Automatic speaker recognition based on pitch contours. J Acoust Soc Am 52:1687–1697CrossRef Atal BS (1972) Automatic speaker recognition based on pitch contours. J Acoust Soc Am 52:1687–1697CrossRef
84.
go back to reference Furui S, Itakura F (1973) Talker recognition by statistical features of speech sounds. Electron Commun Jap 56-A:62–71 Furui S, Itakura F (1973) Talker recognition by statistical features of speech sounds. Electron Commun Jap 56-A:62–71
85.
go back to reference Atal BS (1974) Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J Acoust Soc Am 55:1304–1312CrossRef Atal BS (1974) Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. J Acoust Soc Am 55:1304–1312CrossRef
86.
go back to reference Sambur MR (1975) Selection of acoustic features for speaker identification. IEEE Trans Acoust Speech Sig Process 23:176–182CrossRef Sambur MR (1975) Selection of acoustic features for speaker identification. IEEE Trans Acoust Speech Sig Process 23:176–182CrossRef
87.
go back to reference Hollien H, Majewski W (1977) Speaker identification by long-term spectra under normal and distorted speech conditions. J Acoust Soc Am 62:975–980CrossRef Hollien H, Majewski W (1977) Speaker identification by long-term spectra under normal and distorted speech conditions. J Acoust Soc Am 62:975–980CrossRef
88.
go back to reference Matsumoto H, Nimura T (1978) Text-independent speaker identification based on piecewise canonical discriminant analysis. Proc Int Conf Acoust Speech Sig Process, 3:291–294 Matsumoto H, Nimura T (1978) Text-independent speaker identification based on piecewise canonical discriminant analysis. Proc Int Conf Acoust Speech Sig Process, 3:291–294
89.
go back to reference Markel JD, Davis SB (1979) Text-independent speaker recognition from a large linguistically unconstrained time spaced data base. IEEE Trans Acoust Speech Sig Process 27:74–82CrossRef Markel JD, Davis SB (1979) Text-independent speaker recognition from a large linguistically unconstrained time spaced data base. IEEE Trans Acoust Speech Sig Process 27:74–82CrossRef
90.
go back to reference Furui S (1981) Cepstral analysis technique for automatic speaker verification. IEEE Trans Acoust Speech Sig Process 29:254–272CrossRef Furui S (1981) Cepstral analysis technique for automatic speaker verification. IEEE Trans Acoust Speech Sig Process 29:254–272CrossRef
91.
go back to reference Li KP, Wrench EH (1983) Text-independent speaker recognition with short utterances. Proc Int Conf Acoust Speech Sig Process, 8:555–558 Li KP, Wrench EH (1983) Text-independent speaker recognition with short utterances. Proc Int Conf Acoust Speech Sig Process, 8:555–558
92.
go back to reference Soong F, Rosenberg A, Rabiner L, Juang BH (1985) A vector quantization approach to speaker recognition. Proc Int Conf Acoust Speech Sig Process, 387–390 Soong F, Rosenberg A, Rabiner L, Juang BH (1985) A vector quantization approach to speaker recognition. Proc Int Conf Acoust Speech Sig Process, 387–390
93.
go back to reference Rosenberg A, Soong F (1986) Evaluation of a vector quantisation talker recognition system in text independent and text dependent modes. Proc Int Conf Acoust Speech Sig Process, 11:873–876 Rosenberg A, Soong F (1986) Evaluation of a vector quantisation talker recognition system in text independent and text dependent modes. Proc Int Conf Acoust Speech Sig Process, 11:873–876
94.
go back to reference Shirai K, Mano K, Ishige D (1987) Speaker identification based on frequency distribution of vector-quantised spectra. IEICE Trans 70-D:1181–1188 Shirai K, Mano K, Ishige D (1987) Speaker identification based on frequency distribution of vector-quantised spectra. IEICE Trans 70-D:1181–1188
95.
go back to reference Rosenberg A, Lee CH, Soong F (1990) Sub-word unit talker verification using Hidden Markov Models. Proc Int Conf Acoust Speech Sig Process, 1:269–272 Rosenberg A, Lee CH, Soong F (1990) Sub-word unit talker verification using Hidden Markov Models. Proc Int Conf Acoust Speech Sig Process, 1:269–272
96.
go back to reference Higgins A, Bahler L, Porter J (1991) Speaker verification using randomized phrase prompting. Digit Signal Process 1:89–106 Higgins A, Bahler L, Porter J (1991) Speaker verification using randomized phrase prompting. Digit Signal Process 1:89–106
97.
go back to reference Tishby NZ (1991) On the application of mixture AR Hidden Markov Models to text-independent speaker recognition. IEEE Trans Acoust Speech Sig Process 39:563–570 Tishby NZ (1991) On the application of mixture AR Hidden Markov Models to text-independent speaker recognition. IEEE Trans Acoust Speech Sig Process 39:563–570
98.
go back to reference Reynolds AD, Carlson B (1995) Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers. Proc Eurospeech, pp 647–650 Reynolds AD, Carlson B (1995) Text-dependent speaker verification using decoupled and integrated speaker and speech recognizers. Proc Eurospeech, pp 647–650
99.
go back to reference Reynolds AD, Rose R (1995) Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans Speech Audi Process 3:72–83CrossRef Reynolds AD, Rose R (1995) Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Trans Speech Audi Process 3:72–83CrossRef
100.
Che C, Lin Q (1995) Speaker recognition using HMM with experiments on the YOHO database. Proc Eurospeech, pp 625–628
101.
NIST webpage. http://www.nist.gov/index.html
102.
NIST-SRE. http://www.itl.nist.gov/iad/mig//tests/sre/
103.
Doddington GR, Przybocki MA, Martin AF, Reynolds DA (2000) The NIST speaker recognition evaluation – overview, methodology, systems, results, perspective. Speech Commun 31:225–254
104.
Nakasone H, Beck SD (2001) Forensic automatic speaker recognition. Proc A Speaker Odyssey – the speaker recognition workshop, pp 139–142
105.
Drygajlo A (2007) Forensic automatic speaker recognition. IEEE Signal Process Mag 24:132–135
106.
Martin A, Doddington G, Kamm T, Ordowski M, Przybocki M (1997) The DET curve in assessment of detection task performance. Proc Eurospeech, pp 1895–1898
107.
Bimbot F, Bonastre JF, Fredouille C, Gravier G, Magrin-Chagnolleau I, Meignier S, Merlin T, Ortega-Garcia J, Petrovska-Delacretaz D, Reynolds DA (2004) A tutorial on text-independent speaker verification. EURASIP J Appl Signal Process 4:430–451
108.
Noda H, Darada K, Kawaguchi E, Sawai H (1998) A context-dependent approach for speaker verification using sequential decision. Proc Int Conf Spoken Lang Process
109.
Ortega-Garcia J, Cruz-Llanas S, Gonzalez-Rodriguez J (1998) Quantitative influence of speech variability factors for automatic speaker verification in forensic tasks. Proc Int Conf Spoken Lang Process
110.
Gonzalez-Rodriguez J, Ortega-Garcia J, Lucena-Molina JJ (2001) On the application of the Bayesian approach to real forensic conditions with GMM-based systems. Proc A Speaker Odyssey – the speaker recognition workshop, pp 135–138
111.
Meuwly D, Drygajlo A (2001) Forensic speaker recognition based on a Bayesian framework and Gaussian Mixture Modelling (GMM). Proc A Speaker Odyssey – the speaker recognition workshop, pp 145–150
112.
Alexander A, Botti F, Drygajlo A (2004) Handling mismatch in corpus-based forensic speaker recognition. Proc Odyssey04 the speaker and language recognition workshop, pp 69–74
113.
Ramos D, Gonzalez-Rodriguez J, Gonzalez-Dominguez J, Lucena-Molina JJ (2008) Addressing database mismatch in forensic speaker recognition with Ahumada III: A public real-casework database in Spanish. Proc Interspeech, pp 1493–1496
114.
Thiruvaran T, Ambikairajah E, Epps J (2008) FM features for automatic forensic speaker recognition. Proc Interspeech, pp 1497–1500
115.
Becker T, Jessen M, Grigoras C (2008) Forensic speaker verification using formant features and Gaussian Mixture Models. Proc Interspeech, pp 1505–1508
116.
Becker T, Jessen M, Alsbach S, Bross F, Meier T (2010) SPES: The BKA forensic automatic voice comparison system. Proc Odyssey – the Speaker and Language Recognition Workshop, pp 58–62
117.
Hermansky H (1990) Perceptual linear predictive (PLP) analysis of speech. J Acoust Soc Am 87:1738–1752
118.
Paul JE, Rabinowitz AS, Riganati JP, Richardson JM (1975) Semi-automatic speaker identification system (SASIS) – analytical studies. Final Report C74–11841501, Rockwell International
119.
Bunge E (1977) Speaker recognition by computer. Philips Tech. Review 37(8):207–219
120.
Nakasone H, Melvin C (1989) C.A.V.I.S.: (Computer assisted voice identification system). Final Report 85-IJ-CX-0024. National Institute of Justice
121.
Falcone M, De Sairo N (1994) A PC speaker identification system for forensic use: IDEM. Proc ESCA workshop on automatic speaker recognition, identification and verification, pp 169–172
122.
Gonzalez-Rodriguez J, Ortega-Garcia J, Lucena-Molina JJ (2001) IdentiVox: a PC-Windows tool for text-independent speaker recognition in forensic environments. Prob Forensic Sci 47:246–253
123.
Drygajlo A, Meuwly D, Alexander A (2003) Statistical methods and Bayesian interpretation of evidence in forensic automatic speaker recognition. Proc Eurospeech, pp 689–692
124.
Agnitio, Sociedad Limitada. http://www.agnitio.es/index.php
125.
Morrison GS (2009) Forensic voice comparison and the paradigm shift. Sci Justice 49:298–308
Metadata
Title
Historical and Procedural Overview of Forensic Speaker Recognition as a Science
Authors
Kanae Amino, Ph.D.
Takashi Osanai, Ph.D.
Toshiaki Kamada, B.E.
Hisanori Makinae, Ph.D.
Takayuki Arai, Ph.D.
Copyright Year
2012
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-0263-3_1