2017 | OriginalPaper | Chapter

Effectiveness of Mel Scale-Based ESA-IFCC Features for Classification of Natural vs. Spoofed Speech

Authors: Madhu R. Kamble, Hemant A. Patil

Published in: Pattern Recognition and Machine Intelligence

Publisher: Springer International Publishing

Abstract

The performance of biometric systems based on Automatic Speaker Verification (ASV) degrades under spoofing attacks generated with speech synthesis (SS) and voice conversion (VC) techniques. Results of the recent ASVspoof 2015 challenge indicate that designing spoof-aware features, rather than focusing on a powerful classifier, is a possible solution. In this paper, we investigate the effect of various frequency scales (ERB, Mel, and linear) applied to a Gabor filterbank. The filterbank output is used to exploit the contribution of instantaneous frequency (IF) in each subband energy via the Teager Energy Operator-based Energy Separation Algorithm (TEO-ESA), in order to capture possible changes in the spectral envelope of spoofed speech. The IF is computed from narrowband components of the speech signal, and the Discrete Cosine Transform (DCT) is applied to the deviations in IF; the resulting features are referred to as Instantaneous Frequency Cosine Coefficients (IFCC). Classification with static features gives an equal error rate (EER) of 1.32% with the Mel frequency scale and 1.87% with the linear scale. With delta features on the development set of the ASVspoof 2015 challenge database, the EER for the linear scale falls further to 1.39%, whereas for the Mel scale it increases by 0.64%.
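
The energy separation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the DESA-2 variant of the discrete energy separation algorithm (the abstract does not say which variant the authors use) and applies it to a single synthetic tone instead of Gabor filterbank outputs. The function names `teager_energy` and `desa2_instantaneous_frequency`, the sampling rate, and the test tone are hypothetical choices made for illustration only; the filterbank, the IF deviations, and the DCT stage are not reproduced here.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output is two samples shorter than the input (no value at the edges).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_instantaneous_frequency(x, fs):
    """Estimate the instantaneous frequency (Hz) of a narrowband signal.

    Assumed variant: DESA-2,
        omega(n) = 0.5 * arccos(1 - psi[y](n) / (2 * psi[x](n))),
    with y(n) = x(n+1) - x(n-1). The estimate is only valid for
    frequencies below fs / 4.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)      # samples n = 1 .. N-2
    y = x[2:] - x[:-2]            # symmetric difference, n = 1 .. N-2
    psi_y = teager_energy(y)      # samples n = 2 .. N-3
    psi_x = psi_x[1:-1]           # trim so both energies cover n = 2 .. N-3

    eps = np.finfo(float).eps     # guard against division by zero
    ratio = 1.0 - psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(ratio, -1.0, 1.0))  # radians per sample
    return omega * fs / (2.0 * np.pi)


# Usage on a synthetic 1 kHz tone sampled at 16 kHz (illustrative values).
fs = 16000.0
n = np.arange(400)
tone = np.cos(2.0 * np.pi * 1000.0 * n / fs)
if_hz = desa2_instantaneous_frequency(tone, fs)
print(if_hz[:3])  # each value is close to 1000 Hz
```

For a pure tone the estimate is constant. In a subband setting, the same function would be applied to each narrowband filter output, which is why the narrowband assumption in the abstract matters: the operator tracks one dominant component at a time.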

Metadata
Title: Effectiveness of Mel Scale-Based ESA-IFCC Features for Classification of Natural vs. Spoofed Speech
Authors: Madhu R. Kamble, Hemant A. Patil
Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-69900-4_39
