This paper describes the speech recognition systems built by the BUT ‘Jilebi’ team for the 2018 low resource speech recognition challenge for Indian languages. We investigate modifications of multilingual time-delay neural network (TDNN) architectures with transfer learning and compare them to bi-directional residual memory networks (BRMN) and bi-directional LSTMs. Our best submission, based on system combination, achieved word error rates of 13.92% (Tamil), 14.71% (Telugu) and 14.06% (Gujarati). We present the details of the submitted systems, together with a post-evaluation analysis of lexicon discovery using unsupervised word segmentation.
Cite as: Pulugundla, B., Baskar, M.K., Kesiraju, S., Egorova, E., Karafiát, M., Burget, L., Černocký, J. (2018) BUT System for Low Resource Indian Language ASR. Proc. Interspeech 2018, 3182-3186, doi: 10.21437/Interspeech.2018-1302
@inproceedings{pulugundla18_interspeech,
  author={Bhargav Pulugundla and Murali Karthick Baskar and Santosh Kesiraju and Ekaterina Egorova and Martin Karafiát and Lukáš Burget and Jan Černocký},
  title={{BUT System for Low Resource Indian Language ASR}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3182--3186},
  doi={10.21437/Interspeech.2018-1302}
}