2018 | Original Paper | Book Chapter

Continuous Speech Recognition and Identification of the Speaker System

Authors: Diego Guffanti, Danilo Martínez, José Paladines, Andrea Sarmiento

Published in: Proceedings of the International Conference on Information Technology & Systems (ICITS 2018)

Publisher: Springer International Publishing

Abstract

Speech recognition and speaker identification based on a biometric parameter such as voice are currently treated as two separate fields, and no integrated applications of these systems are available on the market. Such a system could contribute substantially to the development of personalized commands in home automation and robotics, because both the message and the identity of the speaker would be available. We therefore propose an integrated biometric voice system that identifies both the speaker and the message from a single voice sample. We use the Google Speech API as a voice-to-text conversion tool, and Mel Frequency Cepstral Coefficients (MFCCs) extracted from the voice signal to identify the speaker's voice. Functional tests were carried out with 50 randomly selected users. The results show 96.4% efficiency in identification, demonstrating the effectiveness of MFCCs for automatic speaker recognition and confirming the Google Speech API as a fast, accurate, and robust transcription tool.
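To make the MFCC side of the abstract concrete, the following is a minimal sketch of MFCC extraction and speaker matching. It is not the authors' implementation: the abstract does not state the frame length, the number of filters or coefficients, or the classifier, so the values below, the mean-vector "voiceprint", the nearest-neighbour matching, and all function names are assumptions chosen for illustration. Synthetic tones stand in for recorded voices, and the transcription step with the Google Speech API is not shown. Only the Python standard library is used, with a naive DFT where a real system would use an optimized FFT.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of a Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    w = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(w))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(w))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    hi = hz_to_mel(sample_rate / 2.0)
    mels = [hi * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Log mel-filterbank energies followed by a DCT-II (coefficient 0 dropped)."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(a * p for a, p in zip(f, spec)) + 1e-10) for f in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(1, n_coeffs + 1)]

def voiceprint(signal, sample_rate, frame_len=256, hop=128):
    """Assumed speaker model: the mean MFCC vector over all frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    coeffs = [mfcc(f, sample_rate) for f in frames]
    return [sum(col) / len(coeffs) for col in zip(*coeffs)]

def identify(probe, enrolled):
    """Return the enrolled name whose voiceprint is closest (Euclidean)."""
    return min(enrolled, key=lambda name: math.dist(probe, enrolled[name]))

def synth(freqs, sample_rate=8000, n=1024, phase=0.0):
    """Hypothetical stand-in for a voice recording: a sum of sine tones."""
    return [sum(math.sin(2 * math.pi * f * t / sample_rate + phase) for f in freqs)
            for t in range(n)]

enrolled = {
    "speaker_a": voiceprint(synth([200, 400, 600]), 8000),
    "speaker_b": voiceprint(synth([1200, 2400, 3000]), 8000),
}
probe = voiceprint(synth([200, 400, 600], phase=0.7), 8000)
print(identify(probe, enrolled))
```

In an integrated system of the kind described, the same recorded sample would be passed both to a routine like `voiceprint` and to the speech-to-text service, so that the command and the identity of the speaker are obtained together.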

Metadata
Title
Continuous Speech Recognition and Identification of the Speaker System
Authors
Diego Guffanti
Danilo Martínez
José Paladines
Andrea Sarmiento
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-73450-7_72