
2021 | OriginalPaper | Chapter

Continuous Speech Recognition Technologies—A Review

Authors: Shobha Bhatt, Anurag Jain, Amita Dev

Published in: Recent Developments in Acoustics

Publisher: Springer Singapore


Abstract

Speech recognition is a rapidly emerging field of research, as speech is the natural mode of human communication. This paper reviews the technologies used for continuous speech recognition. It describes the structure of a speech recognition system and its stages. Feature extraction techniques for developing speech recognition systems are examined along with their merits and demerits. Because language modeling plays a vital role in speech recognition, its various aspects are presented. Widely used classification techniques are discussed, and the importance of the speech corpus in the recognition process is described. Speech recognition tools for analysis and development are explored, and the parameters used to test speech recognition systems are discussed. Finally, a comparative study of the different technological aspects of speech recognition is presented.


Metadata

Title: Continuous Speech Recognition Technologies—A Review
Authors: Shobha Bhatt, Anurag Jain, Amita Dev
Copyright Year: 2021
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-15-5776-7_8
