2017 | OriginalPaper | Book chapter

Significance of Frequency Band Selection of MFCC for Text-Independent Speaker Identification

Authors: S. B. Dhonde, S. M. Jagade

Published in: Proceedings of the International Conference on Data Engineering and Communication Technology

Publisher: Springer Singapore


Abstract

This paper presents the significance of frequency band selection for Mel-frequency cepstral coefficients (MFCC) in text-independent speaker identification. Recent studies have focused on speaker-specific information that may extend beyond the telephone passband. The choice of frequency band is an important factor in effectively capturing the speaker-specific information present in the speech signal for speaker recognition. This paper focuses on the development of a speaker identification system based on MFCC features, which are modeled using vector quantization. Here, the frequency band is varied up to 7.75 kHz. Speaker identification experiments on the TIMIT database, which consists of 630 speakers, show that an average recognition rate of 97.37 % is achieved in the frequency band 0–4.85 kHz with 20 MFCC filters.
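
The two ideas in the abstract, restricting the mel filter bank to a chosen frequency band and matching MFCC vectors against per-speaker vector quantization codebooks, can be illustrated with a minimal sketch. It is not the authors' implementation. The mel formula (2595 log10(1 + f/700)), the FFT size of 512, the triangular filter shape, the Euclidean distortion measure, and all function names are assumptions made for illustration; only the 20 filters, the 0–4.85 kHz band, and TIMIT's 16 kHz sampling rate are taken as given. Codebook training (for example with k-means or LBG) is assumed to have happened elsewhere.

```python
import numpy as np


def hz_to_mel(f_hz):
    # Common mel-scale convention (assumption; the paper may use another variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters=20, f_low=0.0, f_high=4850.0,
                   sample_rate=16000, n_fft=512):
    """Triangular mel filters confined to the band [f_low, f_high] in Hz."""
    if not 0.0 <= f_low < f_high <= sample_rate / 2:
        raise ValueError("band must satisfy 0 <= f_low < f_high <= Nyquist")
    # n_filters + 2 edge points, equally spaced on the mel scale.
    edges_hz = mel_to_hz(
        np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), n_filters + 2))
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, bin_freqs.size))
    for i in range(n_filters):
        left, center, right = edges_hz[i:i + 3]
        rising = (bin_freqs - left) / (center - left)
        falling = (right - bin_freqs) / (right - center)
        # Negative values lie outside the triangle and are clipped to zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank, edges_hz


def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its nearest codeword.

    features has shape (T, D); codebook has shape (K, D).
    """
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def identify(features, codebooks):
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks,
               key=lambda spk: average_distortion(features, codebooks[spk]))
```

Changing `f_high` is the only step needed to repeat an experiment over a different band; spectral bins above `f_high` receive zero weight in every filter, so they do not contribute to the resulting cepstral coefficients.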

Metadata
Title
Significance of Frequency Band Selection of MFCC for Text-Independent Speaker Identification
Authors
S. B. Dhonde
S. M. Jagade
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-1678-3_21