
International Journal of Speech Technology

Published: 20 November 2017

Multi-style speaker recognition database in practical conditions

Authors: Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna

Published in: International Journal of Speech Technology | Issue 3/2018


Abstract

This work describes the collection and organization of a multi-style database for speaker recognition. The database is organized around three categories of speaker recognition: voice-password, text-dependent and text-independent frameworks. Three Indian institutes collaborated to collect the data at their respective sites. The data were collected over an online telephone network deployed for a speech-based student attendance system. This setup allows data to be collected over a longer period from different speakers with session variability, which is useful for speaker verification (SV) studies in practical scenarios. The database contains data from 923 speakers across the three modes of SV and is hence termed a multi-style speaker recognition database. It is useful for studies of session variability, multi-style speaker recognition and short-utterance-based SV. Initial results are reported on the database for the three modes of SV. A copy of the database can be obtained by contacting the authors.



Title: Multi-style speaker recognition database in practical conditions
Authors: Rohan Kumar Das, Sarfaraz Jelil, S. R. Mahadeva Prasanna
Publication date: 20 November 2017
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 3/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-017-9475-4
