
2022 | Original Paper | Book Chapter

Speaker Independent Accent Based Speech Recognition for Malayalam Isolated Words: An LSTM-RNN Approach

Written by: Rizwana Kallooravi Thandil, K. P. Mohamed Basheer

Published in: Artificial Intelligence and Speech Technology

Publisher: Springer International Publishing


Abstract

Automatic speech recognition (ASR) has been a very active area of research for the past few decades. Although ASR has advanced considerably for many languages, accent-based speech recognition remains largely unexplored in many of them. Humans recognize speech intuitively, which makes it difficult to get computers to recognize human speech automatically. While speech recognition has achieved promising results for many languages, speech recognition for Malayalam is still in its infancy. The scarcity of datasets makes it difficult for researchers to run experiments. In this paper, we experiment with Long Short-Term Memory (LSTM), a type of Recurrent Neural Network (RNN), for recognizing accent-based isolated words in Malayalam. The datasets used here were constructed manually in a natural recording environment. We used Mel Frequency Cepstral Coefficients (MFCC) to extract features from the audio signals. An LSTM-RNN is used to train and build the model because it significantly outperforms feed-forward deep neural networks and other statistical methods.
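To make the pipeline in the abstract concrete, the following is a minimal sketch of how an LSTM can consume a sequence of MFCC frames and emit a distribution over isolated-word labels. It is not the authors' implementation: the dimensions (13 coefficients per frame, 8 hidden units, 5 word classes), the random untrained weights, and the random stand-in frames are all assumptions chosen for illustration, and the helper names (`LSTMCell`, `classify`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """One LSTM layer with random, untrained weights (illustration only)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifoc"}
        self.bias = {g: [0.0] * n_hidden for g in "ifoc"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of the current frame and previous state
        pre = {
            g: [a + b for a, b in zip(matvec(self.W[g], z), self.bias[g])]
            for g in "ifoc"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        cand = [math.tanh(v) for v in pre["c"]]
        # The cell state keeps part of the old memory and adds new content.
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def classify(frames, cell, W_out):
    """Run the LSTM over all frames; classify from the final hidden state."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    for x in frames:
        h, c = cell.step(x, h, c)
    return softmax(matvec(W_out, h))


rng = random.Random(0)
N_MFCC, N_HIDDEN, N_WORDS = 13, 8, 5  # assumed sizes, not from the paper
cell = LSTMCell(N_MFCC, N_HIDDEN, rng)
W_out = rand_matrix(N_WORDS, N_HIDDEN, rng)
# Stand-in for the MFCC frames of one spoken word (20 frames of random values).
frames = [[rng.gauss(0, 1) for _ in range(N_MFCC)] for _ in range(20)]
probs = classify(frames, cell, W_out)
predicted = max(range(N_WORDS), key=lambda k: probs[k])
```

Because the weights are untrained, the predicted label is meaningless here; the sketch only shows the data flow from per-frame features through the recurrent state to a word-level decision. In practice the frames would come from an MFCC extractor and the weights from training on labelled recordings.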


Metadata
Title
Speaker Independent Accent Based Speech Recognition for Malayalam Isolated Words: An LSTM-RNN Approach
Authors
Rizwana Kallooravi Thandil
K. P. Mohamed Basheer
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-95711-7_2
