2018 | OriginalPaper | Chapter

Towards a High-Quality Lemma-Based Text to Speech System for the Arabic Language

Authors: Oumaima Zine, Abdelouafi Meziane, Mohamed Boudchiche

Published in: Arabic Language Processing: From Theory to Practice

Publisher: Springer International Publishing

Abstract

Recent estimates put Arabic at around 250 million native speakers, making it the fifth most spoken language by number of speakers. It has therefore attracted the interest of researchers in speech technologies, in particular speech recognition and speech synthesis. Many researchers are still working on Arabic Text To Speech (TTS) with the aim of delivering intelligible, close-to-natural systems. Nevertheless, most of the available free and semi-free Arabic TTS systems are still far from sounding as natural as a human voice, and generating smooth speech remains a challenge. The primary aim of this work is to improve the quality of the speech produced by the sub-segment-based approach proposed in our previous work. To this end, a lemma-based approach to concatenative TTS synthesis is adopted and presented in this paper. In this context, a study of Arabic lemma frequency was conducted to identify the most frequent lemmas in written and spoken Classical Arabic and Modern Standard Arabic (MSA). The study analyzes roughly 65 million fully vocalized words obtained from the Tashkila corpus, Nemlar, and Al Jazeera, which together cover modern and classical Arabic. As a result, a lemmatized Arabic frequency list was generated. The top 1,000 most frequent lemmas were found to provide approximately 79% coverage of Arabic words, and they were therefore used as the basic acoustic units of our Text To Speech system. Finally, we demonstrate that this approach improves the intelligibility and naturalness of a Text To Speech system, with an overall rating of 4.5 out of 5.
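
The coverage figure quoted in the abstract can be read as the share of corpus tokens whose lemma belongs to the N most frequent lemmas. The sketch below illustrates that arithmetic only; it is not the authors' pipeline. The function name `top_n_coverage` and the toy transliterated lemma sequence are assumptions made for illustration, and the input is assumed to be already lemmatized, one lemma per token.

```python
from collections import Counter
from typing import Iterable


def top_n_coverage(lemmas: Iterable[str], n: int) -> float:
    """Fraction of tokens whose lemma is among the n most frequent lemmas."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(n) returns the n highest-frequency (lemma, count) pairs.
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / total


# Toy, made-up lemma sequence (hypothetical data, not from the paper's corpora).
toy_lemmas = ["min", "fi", "kitab", "min", "fi", "min", "qalam", "kataba"]

coverage = top_n_coverage(toy_lemmas, 2)
print(f"top-2 coverage: {coverage:.3f}")  # 5 of 8 tokens -> 0.625
```

On a real lemmatized corpus, the same function called with `n=1000` would give the kind of coverage ratio the abstract reports, assuming coverage is measured over tokens rather than over distinct word forms.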

Title: Towards a High-Quality Lemma-Based Text to Speech System for the Arabic Language
Authors: Oumaima Zine, Abdelouafi Meziane, Mohamed Boudchiche
Publisher: Springer International Publishing
Book: Arabic Language Processing: From Theory to Practice
Print ISBN: 978-3-319-73499-6
Electronic ISBN: 978-3-319-73500-9
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-73500-9_4
