2017 | OriginalPaper | Book chapter

An Adaptive i-Vector Extraction for Speaker Verification with Short Utterance

Authors: Arnab Poddar, Md Sahidullah, Goutam Saha

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

A prime challenge in automatic speaker verification (ASV) is to improve performance with short speech segments. In state-of-the-art i-vector based ASV systems, the variability and uncertainty of the intermediate model parameters increase substantially when utterances are short. To compensate for this increased variability, we propose an adaptive approach to estimating the model parameters, in which the pre-estimated universal background model (UBM) parameters are used for adaptation. The speaker models, i.e., the i-vectors, are then generated from the adapted parameters. The proposed approach considerably outperforms the conventional i-vector based system on the publicly available NIST SRE 2010 speech corpus, especially for short durations, as required in real-world applications.
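
The abstract does not state the adaptation formula, so the following is only an illustrative sketch of the general idea of pulling utterance-level estimates toward pre-estimated UBM parameters. It assumes a relevance-factor interpolation of component means in the style of classic MAP adaptation; the function name `map_adapt_means`, the `relevance` parameter, and its default value are hypothetical choices and are not taken from the chapter.

```python
from typing import List


def map_adapt_means(
    counts: List[float],
    first_order: List[List[float]],
    ubm_means: List[List[float]],
    relevance: float = 16.0,  # assumed value, not from the chapter
) -> List[List[float]]:
    """Interpolate per-component data means with UBM means.

    counts[c]      : zeroth-order statistic (soft frame count) of component c
    first_order[c] : first-order statistic (posterior-weighted feature sum)
    ubm_means[c]   : pre-estimated UBM mean of component c
    """
    adapted = []
    for n_c, f_c, mu_c in zip(counts, first_order, ubm_means):
        # Few frames (short utterance) -> small alpha -> rely more on the UBM.
        alpha = n_c / (n_c + relevance)
        if n_c > 0:
            data_mean = [f / n_c for f in f_c]
        else:
            # No frames assigned: fall back to the UBM mean entirely.
            data_mean = list(mu_c)
        adapted.append(
            [alpha * d + (1.0 - alpha) * m for d, m in zip(data_mean, mu_c)]
        )
    return adapted


if __name__ == "__main__":
    # Two one-dimensional components: one unobserved, one with 16 soft frames.
    print(map_adapt_means([0.0, 16.0], [[0.0], [32.0]], [[1.0], [0.0]]))
```

With this weighting, a component that receives few frames from a short utterance stays close to its UBM mean, while a well-observed component moves toward the utterance data. Whether the chapter uses this exact weighting is not confirmed by the abstract.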

Metadata
Title
An Adaptive i-Vector Extraction for Speaker Verification with Short Utterance
Authors
Arnab Poddar
Md Sahidullah
Goutam Saha
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-69900-4_41
