2020 | OriginalPaper | Book chapter

A Benchmarking of IBM, Google and Wit Automatic Speech Recognition Systems

Authors: Foteini Filippidou, Lefteris Moussiades

Published in: Artificial Intelligence Applications and Innovations

Publisher: Springer International Publishing

Abstract

As the requirements for automatic speech recognition (ASR) continue to grow, so does the demand for accuracy and efficiency. In this paper, we present most of the well-known ASR systems and benchmark three of them, namely IBM Watson, Google, and Wit, using the WER, Hper, and Rper error metrics. The experimental results show that Google's automatic speech recognition performs best among the three systems. We intend to extend the benchmark both by including most of the available ASR systems and by enlarging our test data.
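The abstract names WER, Hper, and Rper without defining them. As a rough illustration only, the Python sketch below assumes the commonly used definitions: WER as the word-level edit distance (substitutions, deletions, insertions) divided by the reference length, and Hper and Rper as position-independent error rates, that is, the share of hypothesis words with no counterpart in the reference and the share of reference words with no counterpart in the hypothesis. The function names `wer` and `position_independent_errors` are hypothetical, and the chapter's own formulas may differ in detail.

```python
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


def position_independent_errors(reference: str, hypothesis: str) -> tuple[float, float]:
    """Return (Hper, Rper) under the assumed position-independent definitions."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref or not hyp:
        raise ValueError("reference and hypothesis must both be non-empty")
    ref_counts, hyp_counts = Counter(ref), Counter(hyp)
    # Counter subtraction keeps only positive counts, i.e. unmatched words.
    extra_in_hyp = sum((hyp_counts - ref_counts).values())
    missing_from_hyp = sum((ref_counts - hyp_counts).values())
    return extra_in_hyp / len(hyp), missing_from_hyp / len(ref)
```

For the reference "the cat sat on the mat" and the hypothesis "the cat sat on mat", this sketch gives a WER of 1/6 (one deleted word), an Hper of 0, and an Rper of 1/6. Word order affects WER only; Hper and Rper compare the two transcripts as bags of words.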

Title: A Benchmarking of IBM, Google and Wit Automatic Speech Recognition Systems
Authors: Foteini Filippidou, Lefteris Moussiades
Publisher: Springer International Publishing
Book: Artificial Intelligence Applications and Innovations
Print ISBN: 978-3-030-49160-4
Electronic ISBN: 978-3-030-49161-1
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-49161-1_7
