2019 | OriginalPaper | Chapter

Speaker Recognition in Orthogonal Complement of Time Session Variability Subspace

Authors: Satoru Tsuge, Shingo Kuroiwa

Published in: Intelligent Interactive Multimedia Systems and Services

Publisher: Springer International Publishing

Abstract

Time session variability between the enrollment data and the data to be recognized degrades speaker recognition performance, which makes it one of the most important issues in speaker recognition technology. In this paper, we propose a speaker recognition method that is robust to time session variability. The proposed method first estimates a time session variability subspace and then carries out speaker recognition in the orthogonal complement of that subspace. In addition, we incorporate linear discriminant analysis into the proposed method. To evaluate the proposed method, we conducted a speaker identification experiment. The experimental results show that the proposed method improves speaker identification performance over the baseline.
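The central idea of the abstract, recognition in the orthogonal complement of an estimated variability subspace, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the subspace is estimated, so the sketch assumes it is spanned by the leading principal directions of each vector's deviation from its speaker mean, and it assumes cosine scoring. The function names, the synthetic data, and the rank of 1 are all hypothetical, and the linear discriminant analysis step is not shown.

```python
import numpy as np


def estimate_session_subspace(vectors, speaker_ids, rank):
    """Return an orthonormal basis (dim x rank) for within-speaker variation.

    Assumption (not from the paper): session variability is approximated by
    the deviation of each vector from its speaker's mean over all sessions.
    """
    vectors = np.asarray(vectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    deviations = np.empty_like(vectors)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        deviations[mask] = vectors[mask] - vectors[mask].mean(axis=0)
    # Right singular vectors are the principal directions of the deviations.
    _, _, vt = np.linalg.svd(deviations, full_matrices=False)
    return vt[:rank].T


def complement_projector(basis):
    """Projection matrix onto the orthogonal complement of span(basis)."""
    dim = basis.shape[0]
    return np.eye(dim) - basis @ basis.T


def cosine_score(a, b):
    """Cosine similarity between two vectors (assumed scoring rule)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic data: every speaker drifts along one shared direction over sessions.
rng = np.random.default_rng(0)
dim, n_speakers, n_sessions = 8, 5, 6
drift_dir = rng.normal(size=dim)
drift_dir /= np.linalg.norm(drift_dir)
speaker_means = rng.normal(size=(n_speakers, dim))
speaker_ids = np.repeat(np.arange(n_speakers), n_sessions)
drift = rng.normal(scale=3.0, size=(speaker_ids.size, 1)) * drift_dir
noise = rng.normal(scale=0.1, size=(speaker_ids.size, dim))
vectors = speaker_means[speaker_ids] + drift + noise

basis = estimate_session_subspace(vectors, speaker_ids, rank=1)
P = complement_projector(basis)

# Compare two sessions of the same speaker after removing the drift subspace.
enrolled = P @ vectors[0]
probe = P @ vectors[1]
print(cosine_score(enrolled, probe))
```

Because the basis has orthonormal columns, `P` is idempotent and maps every vector in the estimated subspace to zero, so scores computed after projection ignore variation along those directions.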



Title: Speaker Recognition in Orthogonal Complement of Time Session Variability Subspace
Authors: Satoru Tsuge, Shingo Kuroiwa
Publisher: Springer International Publishing
Book: Intelligent Interactive Multimedia Systems and Services
Print ISBN: 978-3-319-92230-0

Electronic ISBN: 978-3-319-92231-7

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-319-92231-7_11
