2018 | Original Paper | Book Chapter

Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition

Written by: Lekshmi Krishna Ramachandran, Sherly Elizabeth

Published in: Intelligent Human Computer Interaction

Publisher: Springer International Publishing

Abstract

Automatic Speech Recognition (ASR) is the computer-driven transcription of spoken language into human-readable text. This paper focuses on the development of an acoustic model for a medium-vocabulary, context-independent, isolated Malayalam speech recognizer using Hidden Markov Models (HMMs). The emission probabilities of the syllable HMMs are estimated with Gaussian Mixture Models (GMMs). Mel Frequency Cepstral Coefficients (MFCCs) are used for feature extraction from the input speech. The GMM mixture weights are generated using the Dirichlet distribution. The quality of the resulting GMMs is verified with several information criteria: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), corrected AIC (AICc), the Kullback Information Criterion (KIC), corrected KIC (KICc) and approximated KIC (AKICc). On the medium-vocabulary, speaker-dependent, isolated Malayalam speech corpus, a single Gaussian yields an accuracy of 90.91% and a Word Error Rate (WER) of 11.9%. Word accuracy and WER are also calculated from experiments conducted with multivariate Gaussians. With five Gaussian mixtures, a better word accuracy of 95.24% and a WER of 4.76% are attained, and this result is verified using the information criteria.
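
The two ideas named in the abstract, drawing GMM mixture weights from a Dirichlet distribution and comparing models with information criteria, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the symmetric concentration parameter `alpha`, the diagonal-covariance parameter count, the 39-dimensional feature size and the log-likelihood value are all assumptions chosen for illustration. Only AIC, BIC, AICc and KIC are shown, in their standard forms; KICc and AKICc are omitted.

```python
import math
import random


def dirichlet_weights(num_components, alpha=1.0, rng=None):
    """Draw one mixture-weight vector from a symmetric Dirichlet(alpha)."""
    rng = rng or random.Random()
    # Normalising independent Gamma(alpha, 1) draws yields a Dirichlet sample,
    # so the weights are positive and sum to one.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_components)]
    total = sum(draws)
    return [g / total for g in draws]


def gmm_free_parameters(num_components, dim):
    """Free parameters of a diagonal-covariance GMM (assumed model form)."""
    # (M - 1) independent weights, plus M*dim means and M*dim variances.
    return (num_components - 1) + 2 * num_components * dim


def information_criteria(log_likelihood, k, n):
    """Return AIC, BIC, AICc and KIC for k parameters and n observations.

    Lower values indicate the preferred model.
    """
    aic = -2.0 * log_likelihood + 2.0 * k
    return {
        "AIC": aic,
        "BIC": -2.0 * log_likelihood + k * math.log(n),
        "AICc": aic + 2.0 * k * (k + 1) / (n - k - 1),
        "KIC": -2.0 * log_likelihood + 3.0 * k,
    }


# Hypothetical usage: 5 components, 39-dimensional MFCC vectors, and a
# placeholder log-likelihood that is NOT a result from the paper.
weights = dirichlet_weights(5, alpha=1.0, rng=random.Random(0))
k = gmm_free_parameters(num_components=5, dim=39)
scores = information_criteria(log_likelihood=-12000.0, k=k, n=5000)
```

In practice the Dirichlet draw would serve as the initial weight vector before EM training, and the criteria would be evaluated once per candidate number of mixtures, keeping the mixture count with the lowest score.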

Metadata
Title
Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition
Written by
Lekshmi Krishna Ramachandran
Sherly Elizabeth
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-04021-5_11
