A comparative study of different features for isolated spoken word recognition using HMM with reference to Assamese language

Authors: Sruti Sruba Bharali, Sanjib Kr. Kalita

Published in: International Journal of Speech Technology | Issue 4/2015

Published: 19 October 2015

Abstract

This paper describes the implementation of a speaker-independent, isolated-word recognizer for the Assamese language. Acoustic features are extracted using linear predictive coding (LPC) analysis, LPC cepstral coefficients (LPCEPSTRA), linear mel-filter bank channel outputs, and mel-frequency cepstral coefficients (MFCC). The recognition models are hidden Markov models (HMMs) built with the Hidden Markov Model Toolkit (HTK). The recognizer is trained on 10 Assamese words representing the digits 0 (shounya) to 9 (no), spoken by fifteen speakers. For each word, several models were created that differ in the number of input feature values and the number of hidden states. The system achieved a maximum accuracy of 80 % on clean data with 39 MFCC features, and a maximum accuracy of 95 % on noisy data with 26 LPCEPSTRA features; in both cases the best model was a 7-state HMM with 5 hidden states.
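The phrase "7 state HMM with 5 hidden states" fits the HTK convention in which the first and last states of a model are non-emitting entry and exit states, leaving the states in between to emit observations. The following is a minimal sketch of such a topology, not the authors' configuration: it assumes a strictly left-to-right model without skip transitions and an illustrative self-loop probability of 0.6, neither of which is stated in the abstract, and the function names are hypothetical.

```python
def left_to_right_transitions(num_states=7, self_loop=0.6):
    """Build a num_states x num_states transition matrix as a list of lists.

    State 0 is a non-emitting entry state and state num_states - 1 is a
    non-emitting exit state (HTK-style); the states in between emit.
    The self_loop value is an illustrative assumption, not from the paper.
    """
    if num_states < 3:
        raise ValueError("need entry, exit and at least one emitting state")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must lie strictly between 0 and 1")

    n = num_states
    trans = [[0.0] * n for _ in range(n)]

    # The entry state always moves to the first emitting state.
    trans[0][1] = 1.0

    # Each emitting state either stays where it is or advances by one.
    for i in range(1, n - 1):
        trans[i][i] = self_loop
        trans[i][i + 1] = 1.0 - self_loop

    # The exit state has no outgoing transitions, so its row stays zero.
    return trans


def emitting_states(num_states):
    """Number of emitting states once entry and exit states are excluded."""
    return num_states - 2


if __name__ == "__main__":
    for row in left_to_right_transitions(7):
        print(" ".join(f"{p:.1f}" for p in row))
    print("emitting states:", emitting_states(7))
```

Every row except the last sums to 1, and the all-zero last row marks the exit state. In a real HTK setup these initial values would only seed the prototype; training re-estimates them from the data.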

Title: A comparative study of different features for isolated spoken word recognition using HMM with reference to Assamese language
Authors: Sruti Sruba Bharali, Sanjib Kr. Kalita
Publication date: 19 October 2015
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 4/2015
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-015-9311-7
