2008 | OriginalPaper | Chapter

37. Text-Dependent Speaker Recognition

Author: Matthieu Hébert, Dr.

Published in: Springer Handbook of Speech Processing

Publisher: Springer Berlin Heidelberg

Abstract

Text-dependent speaker recognition characterizes a speaker recognition task, such as verification or identification, in which the set of words (or lexicon) used during the testing phase is a subset of the ones present during the enrollment phase. The restricted lexicon allows very short enrollment (or registration) and testing sessions to deliver an accurate solution but, at the same time, poses scientific and technical challenges. Because of the short enrollment and testing sessions, text-dependent speaker recognition technology is particularly well suited for deployment in large-scale commercial applications. These are the bases for presenting an overview of the state of the art in text-dependent speaker recognition as well as emerging research avenues. In this chapter, we will demonstrate the intrinsic dependence of the accuracy on the lexical content of the password phrase. Several research results will be presented and analyzed to show key techniques used in text-dependent speaker recognition systems from different sites. Among these, we mention multichannel speaker model synthesis and continuous adaptation of speaker models with threshold tracking. Since text-dependent speaker recognition is the most widely used voice biometric in commercial deployments, several results drawn from realistic deployment scenarios are also included.
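The defining condition stated in the abstract, that the test lexicon is a subset of the enrollment lexicon, can be illustrated with a minimal sketch. The function name `is_text_dependent` and the example word lists are hypothetical and are not taken from the chapter; the sketch only checks the subset relation and says nothing about how a real system models or scores speech.

```python
def is_text_dependent(enrollment_words, test_words):
    """Return True when every test word was also present during enrollment.

    Hypothetical helper: it only checks the lexicon-subset condition
    from the abstract, after lowercasing the words.
    """
    enrolled = {w.lower() for w in enrollment_words}
    tested = {w.lower() for w in test_words}
    return tested <= enrolled  # set-subset test


# Illustrative (made-up) lexicons
enrollment = ["one", "two", "three", "four"]
print(is_text_dependent(enrollment, ["two", "four"]))  # test words all enrolled
print(is_text_dependent(enrollment, ["two", "nine"]))  # "nine" was never enrolled
```

In this toy example the first test session satisfies the text-dependent condition and the second does not, because it contains a word absent from enrollment.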

Metadata
Title
Text-Dependent Speaker Recognition
Author
Matthieu Hébert, Dr.
Copyright Year
2008
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-540-49127-9_37