International Journal of Speech Technology

18 November 2020

Mitigate the reverberation effect on the speaker verification performance using different methods

Author: Khamis A. Al-Karawi

Published in: International Journal of Speech Technology | Issue 1/2021

Abstract

Speech signals recorded in the far field, or with a distant receiver, typically contain additive noise and reverberation. These degrade the reliability and intelligibility of the speech signal and the performance of speaker recognition systems, with severe consequences in a wide range of real applications. Channel equalization, i.e. the removal or reduction of channel effects, or other cleaning methods, mitigates the mismatch problem to some extent, but at the cost of added distortion to the already vulnerable speech signal, so its effectiveness is limited. Recent research indicates that a newer speaker feature, gammatone frequency cepstral coefficients (GFCC), is more robust to noise and reverberation than other features. This paper proposes two methods to combat the effect of reverberation on speaker verification performance. The first uses GFCC as a robust feature to alleviate the effect of reverberation on system performance. The second uses multi-training to combat the reverberation effect. Speaker verification experiments in artificial and real reverberant conditions show the effectiveness of the proposed methods in terms of a lower equal error rate (EER) and the detection error trade-off (DET) curve.
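
For readers unfamiliar with the EER metric named in the abstract, the following is a minimal sketch of how an equal error rate can be computed from verification scores. It is not the paper's implementation: the function name `equal_error_rate`, the score values, and the convention that a trial is accepted when its score is at or above the threshold are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) at the point where the false-acceptance
    and false-rejection rates are closest.

    Assumption: higher scores mean "more likely the claimed speaker",
    and a trial is accepted when score >= threshold.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([target, nontarget]))

    # False acceptance rate: impostor trials that pass the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # False rejection rate: genuine trials that fail the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])

    # The EER is taken where the two error rates cross (or come closest).
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]


if __name__ == "__main__":
    # Hypothetical scores, not data from the paper.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.6, 0.3, 0.2, 0.1]
    eer, thr = equal_error_rate(genuine, impostor)
    print(f"EER = {eer:.2%} at threshold {thr}")
```

Sweeping the same thresholds and plotting the false-rejection rate against the false-acceptance rate gives the points of a DET curve; the EER is the single point on that curve where the two rates are equal.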

Title: Mitigate the reverberation effect on the speaker verification performance using different methods
Author: Khamis A. Al-Karawi
Publication date: 18 November 2020
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 1/2021
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-020-09780-1
