
2015 | OriginalPaper | Book chapter

Speaker Verification Performance Evaluation Based on Open Source Speech Processing Software and TIMIT Speech Corpus

Authors: Piotr Kłosowski, Adam Dustor, Jacek Izydorczyk

Published in: Computer Networks

Publisher: Springer International Publishing

Abstract

Building a speaker recognition application requires advanced speech processing techniques implemented in specialized speech processing software. Speaker recognition research can benefit from a speech processing platform based on open source software. This article presents an example of using open source speech processing software to perform speaker verification experiments designed to test various speaker recognition models under different scenarios. Speaker verification efficiency was evaluated for each scenario using the TIMIT speech corpus distributed by the Linguistic Data Consortium. The experimental results made it possible to compare the scenarios and select the best one for building a speaker model for a speaker verification application.

Metadata
Title
Speaker Verification Performance Evaluation Based on Open Source Speech Processing Software and TIMIT Speech Corpus
Authors
Piotr Kłosowski
Adam Dustor
Jacek Izydorczyk
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-19419-6_38