
2018 | OriginalPaper | Book chapter

Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition

Authors: Lekshmi Krishna Ramachandran, Sherly Elizabeth

Published in: Intelligent Human Computer Interaction

Publisher: Springer International Publishing


Abstract

Automatic Speech Recognition (ASR) is the computer-driven transcription of spoken language into human-readable text. This paper focuses on the development of an acoustic model for a medium-vocabulary, context-independent, isolated-word Malayalam speech recognizer using Hidden Markov Models (HMMs). In this work, the emission probabilities of syllables modeled by HMMs are estimated using Gaussian Mixture Models (GMMs). Mel Frequency Cepstral Coefficients (MFCCs) are used to extract features from the input speech. The mixture weights of the GMMs are generated using a Dirichlet distribution. The quality of the resulting GMMs is verified with several information criteria: the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the corrected AIC (AICc), the Kullback Information Criterion (KIC), the corrected KIC (KICc), and the approximated KIC (AKICc). On a medium-vocabulary, speaker-dependent, isolated-word Malayalam speech corpus, a single Gaussian yields an accuracy of 90.91% and a Word Error Rate (WER) of 11.9%. The word accuracy and WER of the system are also calculated from experiments conducted with multivariate Gaussian mixtures. With five Gaussian mixture components, a better word accuracy of 95.24% and a WER of 4.76% are attained, and this result is verified using the information criteria.
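The two ideas named in the abstract, drawing mixture weights from a Dirichlet distribution and scoring a GMM with information criteria, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their procedure, so the function names, the symmetric concentration parameter `alpha`, the diagonal covariances, the 13-dimensional feature size, and the synthetic data are all assumptions made for illustration. Only AIC, BIC, and AICc are shown, in their standard forms; the KIC variants are omitted.

```python
import numpy as np


def dirichlet_weights(n_components, alpha=1.0, seed=0):
    """Draw GMM mixture weights from a symmetric Dirichlet (alpha is an assumption)."""
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.full(n_components, alpha))


def gmm_log_likelihood(x, weights, means, variances):
    """Total log-likelihood of x (n, d) under a diagonal-covariance GMM.

    means and variances have shape (k, d); weights has shape (k,).
    """
    x = np.asarray(x, dtype=float)[:, None, :]            # (n, 1, d)
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)
                      + (x - means) ** 2 / variances).sum(axis=2)  # (n, k)
    log_weighted = np.log(weights) + log_pdf
    # Log-sum-exp over components for numerical stability.
    m = log_weighted.max(axis=1, keepdims=True)
    per_frame = m[:, 0] + np.log(np.exp(log_weighted - m).sum(axis=1))
    return float(per_frame.sum())


def n_free_parameters(k, d):
    """Free parameters of a diagonal GMM: (k - 1) weights, k*d means, k*d variances."""
    return (k - 1) + 2 * k * d


def aic(log_lik, p):
    return -2.0 * log_lik + 2.0 * p


def bic(log_lik, p, n):
    return -2.0 * log_lik + p * np.log(n)


def aicc(log_lik, p, n):
    # Small-sample correction; requires n > p + 1.
    return aic(log_lik, p) + 2.0 * p * (p + 1) / (n - p - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 13                                   # assumed feature dimension
    x = rng.normal(size=(200, d))            # synthetic stand-in for MFCC frames
    for k in (1, 3, 5):
        w = dirichlet_weights(k, seed=k)
        means = rng.normal(size=(k, d))      # untrained, illustrative parameters
        variances = np.ones((k, d))
        ll = gmm_log_likelihood(x, w, means, variances)
        p = n_free_parameters(k, d)
        print(k, aic(ll, p), bic(ll, p, len(x)), aicc(ll, p, len(x)))
```

For every criterion, a lower value indicates a better trade-off between fit and model size, so in a setup like this one the number of mixture components would be chosen by comparing the scores of trained models across candidate values of `k`.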



Title: Generation of GMM Weights by Dirichlet Distribution and Model Selection Using Information Criterion for Malayalam Speech Recognition
Authors: Lekshmi Krishna Ramachandran, Sherly Elizabeth
Publisher: Springer International Publishing
Book: Intelligent Human Computer Interaction
Print ISBN: 978-3-030-04020-8
Electronic ISBN: 978-3-030-04021-5
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-030-04021-5_11
