2022 | OriginalPaper | Book chapter

Hindi Phoneme Recognition - A Review

Written by: Shobha Bhatt, Amita Dev, Anurag Jain

Published in: Artificial Intelligence and Speech Technology

Publisher: Springer International Publishing


Abstract

This chapter reviews Hindi phoneme recognition as a step toward Hindi speech recognition. It highlights issues specific to Hindi phonemes, including Hindi speech characteristics, the features used in phoneme recognition, and classification methods. Related work is surveyed to bring out issues in feature extraction, classification, and distinct features. Earlier reviews mostly addressed speech recognition technologies in general; this work is an early study focused on Hindi phoneme recognition. Phoneme-based systems avoid the large number of training samples that word-based models require, and they are widely used for large-vocabulary speech recognition. Issues related to consonants and vowels are also covered. A comparative analysis of feature extraction and classification techniques is presented, along with their recognition scores. The review is intended to help researchers by setting out the issues involved in phoneme recognition.


Metadata
Title
Hindi Phoneme Recognition - A Review
Written by
Shobha Bhatt
Amita Dev
Anurag Jain
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-95711-7_4
