Published in: International Journal of Speech Technology 3/2018

20.11.2017

Multi-style speaker recognition database in practical conditions

Authors: Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna


Abstract

This work describes the collection and organization of a multi-style database for speaker recognition. The database is organized around three categories of speaker recognition: voice-password, text-dependent, and text-independent. Three Indian institutes collaborated to collect the data at their respective sites. The data was collected over an online telephone network deployed for a speech-based student attendance system. This setup allows data to be collected over a long period from different speakers with session variability, which is useful for speaker verification (SV) studies in practical scenarios. The database contains data from 923 speakers across the three SV modes and is therefore termed a multi-style speaker recognition database. It is useful for studies of session variability, multi-style speaker recognition, and short-utterance SV. Initial results on the database are reported for the three SV modes. A copy of the database can be obtained by contacting the authors.

Metadata
Title
Multi-style speaker recognition database in practical conditions
Authors
Rohan Kumar Das
Sarfaraz Jelil
S. R. Mahadeva Prasanna
Publication date
20.11.2017
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 3/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-017-9475-4
