
International Journal of Speech Technology

Published: 23-03-2018

Speech recognition with reference to Assamese language using novel fusion technique

Authors: Sruti Sruba Bharali, Sanjib Kr. Kalita

Published in: International Journal of Speech Technology | Issue 2/2018


Abstract

This paper describes the implementation of a speech recognition system for the Assamese language. The database for this work consists of a vocabulary of ten Assamese words. The speech recognition models were trained using three techniques: the Hidden Markov Model (HMM), Vector Quantization (VQ), and the I-vector technique. Two new fusion methods are proposed that combine these three techniques.
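The abstract does not say how the two fusion methods work, so the following is only a minimal sketch of one generic way to combine several recognizers: score-level fusion by a weighted sum of normalized scores. It is not the authors' method. The VQ part uses plain k-means on made-up 2-D features, the words `word_a` and `word_b` are placeholders, and the HMM and I-vector scores are hypothetical numbers standing in for the outputs of real models.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames, size, iters=10, seed=0):
    """Build a VQ codebook with plain k-means (an illustrative choice)."""
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, size)]
    for _ in range(iters):
        clusters = [[] for _ in range(size)]
        for f in frames:
            nearest = min(range(size), key=lambda k: sq_dist(f, codebook[k]))
            clusters[nearest].append(f)
        for k, members in enumerate(clusters):
            if members:  # keep the old codeword if a cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def distortion(frames, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def normalize(scores):
    """Min-max scale a {word: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {w: (s - lo) / span for w, s in scores.items()}


def fuse(score_sets, weights):
    """Weighted sum of normalized per-word scores (higher means better)."""
    normed = [normalize(s) for s in score_sets]
    words = score_sets[0].keys()
    return {w: sum(wt * n[w] for wt, n in zip(weights, normed)) for w in words}


# Toy 2-D "features" for two placeholder words (not real speech data).
rng = random.Random(1)


def toy_frames(cx, cy, n=40):
    return [[cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)] for _ in range(n)]


train = {"word_a": toy_frames(0, 0), "word_b": toy_frames(3, 3)}
codebooks = {w: train_codebook(f, size=4) for w, f in train.items()}

# A test utterance drawn from the same region as word_b.
test_utt = toy_frames(3, 3, n=20)

# Negate distortion so that a higher score is better for every system.
vq_scores = {w: -distortion(test_utt, cb) for w, cb in codebooks.items()}
# Hypothetical scores standing in for the HMM and I-vector systems.
hmm_scores = {"word_a": -120.0, "word_b": -95.0}
ivec_scores = {"word_a": 0.2, "word_b": 0.7}

fused = fuse([vq_scores, hmm_scores, ivec_scores], weights=[1 / 3, 1 / 3, 1 / 3])
best = max(fused, key=fused.get)
print(best)
```

Normalizing before summing matters here because the three kinds of score live on different scales (distortions, log-likelihoods, similarity values); the equal weights are an arbitrary choice for the sketch and would normally be tuned on held-out data.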



Title: Speech recognition with reference to Assamese language using novel fusion technique
Authors: Sruti Sruba Bharali, Sanjib Kr. Kalita
Publication date: 23-03-2018
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 2/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-018-9501-1

