2016 | OriginalPaper | Chapter

Prosodic Features Based Text-dependent Speaker Recognition with Short Utterance

Authors: Jianwu Zhang, Jianchao He, Zhendong Wu, Ping Li

Published in: Computational Intelligence and Intelligent Systems

Publisher: Springer Singapore

Abstract

Over the past several years, Gaussian mixture models (GMMs) have been the dominant modeling approach in text-independent speaker recognition. However, the recognition accuracy of these models declines as utterances become shorter. Mel-frequency cepstral coefficients (MFCCs) are generally used to characterize the properties of the vocal tract and are widely applied in speech recognition. Prosodic features, such as pitch and formants, are generally considered to describe glottal characteristics. Nevertheless, the performance of these approaches remains unsatisfactory. In text-dependent speaker verification with short utterances, prosodic features can in theory help improve recognition results. To optimize the performance of speaker verification under the adapted GMM-UBM (universal background model) framework, we adopt a variant speaker verification system based on prosodic features, in which a dual-judgment mechanism integrates vocal tract features with prosodic features. Experimental results show that the new system achieves better recognition results.
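The abstract does not say how the dual-judgment mechanism combines the two feature streams, so the following is only a minimal sketch under stated assumptions. It assumes the first judgment is a GMM-UBM log-likelihood ratio on spectral (for example MFCC) frames, the second is a distance between mean pitch values, and a claim is accepted only when both tests pass. The function names, the pitch-distance measure, and the thresholds are all hypothetical and are not taken from the chapter.

```python
import math

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def llr_score(frames, speaker_gmm, ubm):
    """Log-likelihood ratio: claimed-speaker model versus background model."""
    return gmm_avg_loglik(frames, *speaker_gmm) - gmm_avg_loglik(frames, *ubm)

def pitch_distance(test_pitch_hz, enrolled_pitch_hz):
    """Hypothetical prosodic measure: absolute difference of mean pitch (Hz)."""
    mean_test = sum(test_pitch_hz) / len(test_pitch_hz)
    mean_enrolled = sum(enrolled_pitch_hz) / len(enrolled_pitch_hz)
    return abs(mean_test - mean_enrolled)

def dual_judgment(llr, prosodic_dist, llr_threshold, prosodic_threshold):
    """Assumed rule: accept only if both the spectral and prosodic tests pass."""
    return llr >= llr_threshold and prosodic_dist <= prosodic_threshold

# Toy one-dimensional models: (weights, means, variances), one component each.
ubm = ([1.0], [[0.0]], [[1.0]])
speaker = ([1.0], [[2.0]], [[1.0]])
frames = [[2.0], [2.0]]

score = llr_score(frames, speaker, ubm)
dist = pitch_distance([118.0, 122.0], [121.0, 125.0])
accepted = dual_judgment(score, dist, llr_threshold=0.5, prosodic_threshold=10.0)
```

Requiring both tests to pass is one plausible reading of "dual judgment"; a weighted fusion of the two scores would be an equally reasonable alternative, and the chapter itself should be consulted for the actual rule.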



Title: Prosodic Features Based Text-dependent Speaker Recognition with Short Utterance
Authors: Jianwu Zhang, Jianchao He, Zhendong Wu, Ping Li
Publisher: Springer Singapore
Book: Computational Intelligence and Intelligent Systems
Print ISBN: 978-981-10-0355-4
Electronic ISBN: 978-981-10-0356-1
Copyright Year: 2016
DOI: https://doi.org/10.1007/978-981-10-0356-1_57
