2017 | OriginalPaper | Book chapter

Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition

Written by: Josef Chaloupka

Published in: Speech and Computer

Publisher: Springer International Publishing

Abstract

In this paper, a system for digits-to-words conversion for almost all Slavic languages is proposed. The system was developed to improve the text corpora that we use for building a lexicon and for training language models and acoustic models in the task of Large Vocabulary Continuous Speech Recognition (LVCSR). Strings of digits, other special characters (%, €, $, ...), and abbreviations of physical units (km, m, cm, kg, l, °C, etc.) occur very often in our text corpora, in about 5% of cases. Strings of digits or special characters are usually omitted when a lexicon is built or a language model is trained. In non-inflected languages (e.g. English), digits-to-words conversion is solved by a relatively simple conversion or lookup table. The problem is more complex in inflected Slavic languages: a string of digits can be converted into several different word combinations, depending on the context, and the resulting words are inflected by gender and case. The main goal of this research was to find rules (patterns) for converting strings of digits into words in Slavic languages. The second goal was to unify these patterns across Slavic languages and to integrate them into a universal system for digits-to-words conversion.
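The contrast between a plain lookup table and context-dependent conversion can be illustrated with a minimal sketch. It is not the system described in the paper: the function names, the tiny tables, and the restriction to the Czech nominative case for the digits 1 and 2 are all assumptions made for illustration only.

```python
# Hypothetical illustration only; these tables are NOT the paper's rule set.

# English: a single word per digit, independent of context.
EN_WORDS = {1: "one", 2: "two"}

# Czech (nominative case only): the word depends on the grammatical
# gender of the counted noun: "m" masculine, "f" feminine, "n" neuter.
CS_NOMINATIVE = {
    1: {"m": "jeden", "f": "jedna", "n": "jedno"},
    2: {"m": "dva", "f": "dvě", "n": "dvě"},
}


def digit_to_word_en(digit: int) -> str:
    """Plain lookup: the context plays no role."""
    return EN_WORDS[digit]


def digit_to_word_cs(digit: int, gender: str) -> str:
    """Lookup keyed by the digit and the gender of the following noun."""
    return CS_NOMINATIVE[digit][gender]


if __name__ == "__main__":
    # "kilometr" is masculine, "koruna" is feminine.
    print(digit_to_word_en(2), "kilometres")      # same word in any context
    print(digit_to_word_cs(2, "m"), "kilometry")  # dva kilometry
    print(digit_to_word_cs(2, "f"), "koruny")     # dvě koruny
```

A complete converter would also have to select the grammatical case from the surrounding words and handle multi-digit numbers, which is where the rules (patterns) mentioned in the abstract come in.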

Metadata
Title
Digits to Words Converter for Slavic Languages in Systems of Automatic Speech Recognition
Written by
Josef Chaloupka
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-66429-3_30