Published in: International Journal of Speech Technology 1/2021

18.11.2020

Mitigate the reverberation effect on the speaker verification performance using different methods

Written by: Khamis A. Al-Karawi

Abstract

Speech recorded in the far field, with the receiver distant from the talker, typically contains additive noise and reverberation. Both degrade the reliability and intelligibility of the speech signal and the accuracy of speaker recognition systems, with severe consequences in a wide range of real applications. Channel equalization, i.e. removing or reducing the channel effects, mitigates the mismatch problem to some extent, but at the cost of added distortion to the already vulnerable speech signal, so its effectiveness is limited. Recent research indicates that a newer speaker feature, gammatone frequency cepstral coefficients (GFCC), is more robust to noise and reverberation than other features. This paper proposes two methods to combat the effect of reverberation on speaker verification performance. The first uses GFCC as a robust feature to alleviate the effect of reverberation on system performance. The second uses multi-condition training to combat the reverberation effect. Speaker verification experiments in artificial and real reverberant conditions show the effectiveness of the proposed methods in terms of a lower equal error rate (EER) and improved detection error trade-off (DET) curves.
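
The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one common way to estimate it from verification scores. It is not taken from the paper: the function name `equal_error_rate`, the toy scores, and the assumption that higher scores favour the claimed speaker are all illustrative.

```python
import numpy as np


def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER and its threshold from two sets of trial scores.

    Assumption: a higher score means "more likely the claimed speaker".
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)

    # Candidate thresholds: every observed score, in ascending order.
    thresholds = np.sort(np.concatenate([target, impostor]))

    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: impostor trials scored at or above it.
    far = np.array([(impostor >= t).mean() for t in thresholds])

    # The EER lies where the two curves cross; take the closest sampled point.
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]


# Hypothetical scores, for illustration only.
genuine = [2.0, 3.0, 4.0, 5.0]
impostor = [0.0, 1.0, 2.5, 1.5]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

Plotting the same `far` and `frr` values against each other on normal-deviate axes gives the DET curve; the EER is the point where that curve meets the diagonal.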

Metadata
Title
Mitigate the reverberation effect on the speaker verification performance using different methods
Written by
Khamis A. Al-Karawi
Publication date
18.11.2020
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 1/2021
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-020-09780-1
