Skip to main content
main-content

Tipp

Weitere Artikel dieser Ausgabe durch Wischen aufrufen

27.11.2015 | Ausgabe 1/2016

International Journal of Speech Technology 1/2016

Sub-vector based biometric speaker verification using MLLR super-vector

Zeitschrift:
International Journal of Speech Technology > Ausgabe 1/2016
Autoren:
A. K. Sarkar, J. F. Bonastre

Abstract

In this paper, we propose a sub-vector based speaker characterization method for biometric speaker verification, where speakers are represented by uniform segmentation of their maximum likelihood linear regression (MLLR) super-vectors called m-vectors. The MLLR transformation is estimated with respect to universal background model (UBM) without any speech/phonetic information. We introduce two strategies for segmentation of MLLR super-vector: one is called disjoint and other is an overlapped window technique. During test phase, m-vectors of the test utterance are scored against the claimant speaker. Before scoring, m-vectors are post-processed to compensate the session variability. In addition, we propose a clustering algorithm for multiple-class wise MLLR transformation, where Gaussian components of the UBM are clustered into different groups using the concept of expectation maximization (EM) and maximum likelihood (ML). In this case, MLLR transformations are estimated with respect to each class using the sufficient statistics accumulated from the Gaussian components belonging to the particular class, which are then used for m-vector system. The proposed method needs only once alignment of the data with respect to the UBM for multiple MLLR transformations. We first show that the proposed multi-class m-vector system shows promising speaker verification performance when compared to the conventional i-vector based speaker verification system. Secondly, the proposed EM based clustering technique is robust to the random initialization in-contrast to the conventional K-means algorithm and yields system performance better/equal which is best obtained by the K-means. Finally, we show that the fusion of the m-vector with the i-vector further improves the performance of the speaker verification in both score as well as feature domain. The experimental results are shown on various tasks of NIST 2008 speaker recognition evaluation (SRE) core condition.

Bitte loggen Sie sich ein, um Zugang zu diesem Inhalt zu erhalten

Sie möchten Zugang zu diesem Inhalt erhalten? Dann informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Umwelt
  • Maschinenbau + Werkstoffe




Testen Sie jetzt 30 Tage kostenlos.

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Testen Sie jetzt 30 Tage kostenlos.

Literatur
Über diesen Artikel

Weitere Artikel der Ausgabe 1/2016

International Journal of Speech Technology 1/2016 Zur Ausgabe