2022 | Original Paper | Book Chapter

Comparison of Modelling ASR System with Different Features Extraction Methods Using Sequential Model

Written by: Aishwarya Suresh, Anushka Jain, Kriti Mathur, Pooja Gambhir

Published in: Artificial Intelligence and Speech Technology

Publisher: Springer International Publishing


Abstract

Speech recognition refers to a device’s ability to respond to spoken instructions. It enables hands-free use of various gadgets and appliances (a godsend for many incapacitated persons) and supplies input for automatic translation and ready-to-print dictation. Many industries, including healthcare, military telecommunications, and personal computing, use speech recognition programmes. In this paper, we compare different feature extraction methods (BFCC, GFCC, MFCC, MFCC Delta, MFCC Double Delta, LFCC and NGCC) using neural networks.
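The abstract lists MFCC Delta and MFCC Double Delta among the compared features but does not show how they are computed. As a minimal sketch, assuming the commonly used regression formula for delta coefficients over a window of N frames on each side, the derivation from a matrix of base coefficients could look like the following. The function names `delta` and `stack_with_deltas`, the window size `N=2`, and the edge padding are illustrative assumptions, not details taken from the chapter.

```python
import numpy as np


def delta(features, N=2):
    """Delta coefficients via the common regression formula (assumed here):

        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)

    features: array of shape (num_frames, num_coeffs).
    Frames beyond the edges are filled by repeating the first/last frame.
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    denom = 2.0 * sum(n * n for n in range(1, N + 1))
    # Pad N frames at both ends so every frame has N neighbours on each side.
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(features)
    for n in range(1, N + 1):
        ahead = padded[N + n : N + n + num_frames]    # c_{t+n}
        behind = padded[N - n : N - n + num_frames]   # c_{t-n}
        out += n * (ahead - behind)
    return out / denom


def stack_with_deltas(features, N=2):
    """Concatenate base, delta and double-delta coefficients per frame."""
    d1 = delta(features, N)
    d2 = delta(d1, N)  # double delta = delta of the delta coefficients
    return np.hstack([np.asarray(features, dtype=float), d1, d2])


# Stand-in input: 10 frames x 3 coefficients that grow linearly over time.
# Real input would be a matrix of cepstral coefficients (e.g. MFCCs).
ramp = np.arange(10, dtype=float)[:, None] * np.ones((1, 3))
stacked = stack_with_deltas(ramp)
```

For a linearly increasing input such as `ramp`, the delta coefficients away from the edges equal the slope and the double-delta coefficients there are zero, which is a quick sanity check for the formula. The stacked matrix has three times as many columns as the input and could then be fed to a classifier.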


Metadata
Title
Comparison of Modelling ASR System with Different Features Extraction Methods Using Sequential Model
Written by
Aishwarya Suresh
Anushka Jain
Kriti Mathur
Pooja Gambhir
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-030-95711-7_5
