2021 | OriginalPaper | Book chapter

A Modular Approach for Romanian-English Speech Translation

Written by: Andrei-Marius Avram, Vasile Păiş, Dan Tufiş

Published in: Natural Language Processing and Information Systems

Publisher: Springer International Publishing

Abstract

Automatic speech-to-speech translation helps people communicate directly when they do not share a common language. This work presents a modular system for Romanian-to-English and English-to-Romanian speech translation, built by integrating four families of components in a cascade: (1) automatic speech recognition, (2) transcription correction, (3) machine translation, and (4) text-to-speech. We experimented with several models for each component and report several indicators of the system’s performance. Because the system is modular, additional modules can be added for each of the four components. The resulting system is currently deployed on RELATE and is publicly available through the platform’s web interface.
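
The cascade described in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation and not the RELATE API: the class name `CascadedSpeechTranslator`, the four stage signatures, and the toy stand-in functions are all hypothetical. Each stage is an ordinary callable, so any one of them can be replaced without touching the others.

```python
from dataclasses import dataclass, replace
from typing import Callable

# Hypothetical stage signatures (illustrative names, not taken from the paper).
Recognizer = Callable[[bytes], str]   # audio -> raw transcript
Corrector = Callable[[str], str]      # raw transcript -> corrected transcript
Translator = Callable[[str], str]     # source-language text -> target-language text
Synthesizer = Callable[[str], bytes]  # target-language text -> audio


@dataclass(frozen=True)
class CascadedSpeechTranslator:
    """Chains the four component families in the order given in the abstract."""
    recognize: Recognizer
    correct: Corrector
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes) -> bytes:
        transcript = self.recognize(audio)       # (1) automatic speech recognition
        corrected = self.correct(transcript)     # (2) transcription correction
        translated = self.translate(corrected)   # (3) machine translation
        return self.synthesize(translated)       # (4) text-to-speech


# Toy stand-ins so the sketch runs without any models.
# "Audio" here is just UTF-8 text in bytes; real components would wrap models.
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8").lower()


def toy_corrector(text: str) -> str:
    # Restore sentence-initial capitalisation and final punctuation.
    text = text.strip()
    if not text:
        return text
    text = text[0].upper() + text[1:]
    return text if text.endswith(".") else text + "."


TOY_LEXICON = {"Bună ziua.": "Good day."}  # made-up one-entry "translation model"


def toy_mt(text: str) -> str:
    return TOY_LEXICON.get(text, text)


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CascadedSpeechTranslator(toy_asr, toy_corrector, toy_mt, toy_tts)
result = pipeline.run("bună ziua".encode("utf-8"))

# Swapping a single module leaves the other three stages unchanged.
no_correction = replace(pipeline, correct=lambda text: text)
uncorrected_result = no_correction.run("bună ziua".encode("utf-8"))
```

In this toy setup, skipping the correction stage means the lexicon lookup no longer matches, which hints at why a cascade may place a correction step between recognition and translation. Keeping each stage behind a plain callable is one simple way to get the kind of extensibility the abstract describes; the actual system may organise its modules differently.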

Metadata
Title
A Modular Approach for Romanian-English Speech Translation
Written by
Andrei-Marius Avram
Vasile Păiş
Dan Tufiş
Copyright year
2021
DOI
https://doi.org/10.1007/978-3-030-80599-9_6