
2016 | OriginalPaper | Chapter

Approaches for Out-of-Domain Adaptation to Improve Speaker Recognition Performance

Authors: Andrey Shulipa, Sergey Novoselov, Aleksandr Melnikov

Published in: Speech and Computer

Publisher: Springer International Publishing


Abstract

In recent years, speaker recognition (SR) systems have achieved satisfactory performance in evaluations provided by NIST. This was possible thanks to large datasets for training system parameters and accurate modeling of speaker variability. In such cases, test and training conditions are similar, which ensures good performance in the evaluations. In practical applications, however, training and testing conditions differ, and the SR system parameters are no longer optimal for the target conditions. This mismatch is the main problem in deploying real application systems, and it reduces the effectiveness of SR systems. This paper investigates discriminative and generative approaches for adapting the parameters of speaker recognition systems and proposes effective solutions to improve their performance.

Metadata

Title: Approaches for Out-of-Domain Adaptation to Improve Speaker Recognition Performance
Authors: Andrey Shulipa, Sergey Novoselov, Aleksandr Melnikov
Copyright Year: 2016
DOI: https://doi.org/10.1007/978-3-319-43958-7_14
