
16.10.2020

Hindi speech recognition using time delay neural network acoustic modeling with i-vector adaptation

Authors: Ankit Kumar, Rajesh Kumar Aggarwal

Published in: International Journal of Speech Technology | Issue 1/2022


Abstract

Building Automatic Speech Recognition (ASR) systems for low- and limited-resource languages is a pressing need. For the last two decades, Indian-language ASR systems have typically been built with statistical techniques such as Hidden Markov Models (HMMs). In this work, we select Time-Delay Neural Network (TDNN) based acoustic modeling with i-vector adaptation for limited-resource Hindi ASR. A TDNN can capture the extended temporal context of acoustic events; to reduce training time, we use a sub-sampled TDNN architecture. Further, data augmentation techniques are applied to extend the size of the training data developed by TIFR, Mumbai. The results show that data augmentation significantly improves the performance of the Hindi ASR system, and i-vector adaptation contributes an average improvement of approximately 4%. The best system achieves 89.9% accuracy with TDNN-based acoustic modeling and i-vector adaptation.
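
To make the modeling pipeline concrete, the sketch below shows how a sub-sampled TDNN can be realized as a stack of time-dilated 1-D convolutions, with a per-utterance i-vector appended to every input frame for speaker adaptation. This is a minimal PyTorch sketch for illustration only; the paper's models are built with the Kaldi toolkit, and all dimensions, splicing contexts, and layer counts here are assumptions rather than the authors' exact configuration.

```python
# Minimal PyTorch sketch of a sub-sampled TDNN acoustic model with
# i-vector adaptation. The paper uses Kaldi (nnet3); the feature
# dimensions, hidden sizes, and dilations below are illustrative.
import torch
import torch.nn as nn

class SubsampledTDNN(nn.Module):
    def __init__(self, feat_dim=40, ivector_dim=100, hidden_dim=512,
                 num_targets=3000):
        super().__init__()
        in_dim = feat_dim + ivector_dim  # i-vector appended per frame
        # Each layer is a 1-D convolution over time. Increasing dilation
        # realizes sub-sampled splicing contexts in the spirit of
        # Peddinti et al. (2015), e.g. {-1,0,1}, then sparser offsets.
        self.layers = nn.Sequential(
            nn.Conv1d(in_dim, hidden_dim, kernel_size=3, dilation=1),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=2, dilation=3),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, hidden_dim, kernel_size=2, dilation=6),
            nn.ReLU(),
        )
        self.output = nn.Conv1d(hidden_dim, num_targets, kernel_size=1)

    def forward(self, feats, ivector):
        # feats:   (batch, time, feat_dim) acoustic features (e.g. MFCCs)
        # ivector: (batch, ivector_dim), one i-vector per utterance
        ivec = ivector.unsqueeze(2).expand(-1, -1, feats.size(1))
        x = torch.cat([feats.transpose(1, 2), ivec], dim=1)
        return self.output(self.layers(x))  # (batch, targets, time')

# Usage: one utterance of 200 frames of 40-dim features + a 100-dim i-vector.
model = SubsampledTDNN()
logits = model(torch.randn(1, 200, 40), torch.randn(1, 100))
```

Because the higher layers look at widely spaced frames rather than every frame, the network covers a long temporal context with far fewer activations to compute, which is the source of the training-time savings the abstract mentions.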

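Data augmentation can likewise be sketched briefly. The abstract does not name the specific techniques used, so the example below assumes the common three-way speed-perturbation recipe (factors 0.9, 1.0, 1.1) popularized in Kaldi setups; warping each waveform in time triples the effective amount of training audio.

```python
# Sketch of 3-way speed perturbation (an assumed recipe, not confirmed
# by the abstract). Speed factor a warps the waveform x(t) -> x(a*t);
# resampling by 1/a while keeping the original sample rate changes
# both tempo and pitch.
import numpy as np
from scipy.signal import resample_poly

def speed_perturb(waveform: np.ndarray, factor: float) -> np.ndarray:
    """Return the waveform warped by the given speed factor."""
    # resample_poly needs an integer up/down ratio: 1/0.9 = 10/9, etc.
    up, down = {0.9: (10, 9), 1.0: (1, 1), 1.1: (10, 11)}[factor]
    return resample_poly(waveform, up, down)

# Usage: one second of audio at 16 kHz becomes three training copies.
x = np.random.randn(16000).astype(np.float32)
augmented = [speed_perturb(x, f) for f in (0.9, 1.0, 1.1)]
print([len(a) for a in augmented])  # approximately [17778, 16000, 14546]
```
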

Metadata
Title
Hindi speech recognition using time delay neural network acoustic modeling with i-vector adaptation
Authors
Ankit Kumar
Rajesh Kumar Aggarwal
Publication date
16.10.2020
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 1/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-020-09757-0
