2022 | OriginalPaper | Book chapter

Hindi Phoneme Recognition - A Review

Authors: Shobha Bhatt, Amita Dev, Anurag Jain

Published in: Artificial Intelligence and Speech Technology

Publisher: Springer International Publishing

Abstract

A review of Hindi phoneme recognition is presented to address Hindi speech recognition. Issues related to Hindi phonemes are highlighted, including Hindi speech characteristics, the features used in phoneme recognition, and classification methods. Related work is also presented to highlight issues concerning feature extraction, classification, and distinct features. Earlier reviews mostly addressed speech recognition technologies; this work is an early research study devoted to Hindi phoneme recognition. A phoneme-based system overcomes the need for large training samples that constrains word-based models, and phoneme-based systems are widely used for large-vocabulary speech recognition. Issues related to consonants and vowels are also covered. A comparative analysis of different feature extraction and classification techniques is presented together with their recognition scores. The research helps by presenting issues related to phoneme recognition.

Title: Hindi Phoneme Recognition - A Review
Authors: Shobha Bhatt, Amita Dev, Anurag Jain
Publisher: Springer International Publishing
Book: Artificial Intelligence and Speech Technology
Print ISBN: 978-3-030-95710-0
Electronic ISBN: 978-3-030-95711-7
Copyright year: 2022
DOI: https://doi.org/10.1007/978-3-030-95711-7_4
