2018 | OriginalPaper | Chapter

Towards a Speech Recognizer for Multiple Languages Using Arabic Acoustic Model Application to Amazigh Language

Authors: Ali Sadiqui, Ahmed Zinedine

Published in: Arabic Language Processing: From Theory to Practice

Publisher: Springer International Publishing

Abstract

Building acoustic models for automatic speech recognition (ASR) systems is a mature technology that poses little difficulty when large speech and text corpora are available. However, such resources do not exist for many languages, known as “less-resourced languages”. An alternative is to exploit the phonetic structures shared between languages to build an acoustic model for the target language.
In this paper, we report on an experiment in this direction: we used an acoustic model of the Arabic language to create one for the Amazigh language. The originality of our work lies in addressing Amazigh, which has become an official language in Morocco but lacks sufficient resources for automatic speech recognition. In addition, the two languages share several phonemes and certain other characteristics. The resulting system reached a word recognition rate of about 73%. Experiments and comparison with other approaches demonstrate the potential and effectiveness of the proposed approach.
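
To make the word-level figure concrete, here is a minimal sketch of how a word recognition rate is commonly computed. The abstract does not say which scoring formula the authors used, so this assumes the standard word accuracy, 1 - (substitutions + deletions + insertions) / N, obtained from a word-level edit distance. The function name `word_accuracy` and the placeholder tokens are hypothetical and are not taken from the chapter.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - (word-level edit distance / reference length).

    Assumption: this is the conventional ASR scoring formula; the chapter
    itself may have scored its results differently.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    errors = prev[-1]
    return 1.0 - errors / len(ref_words)


if __name__ == "__main__":
    # Placeholder tokens only; one substitution in four reference words.
    print(word_accuracy("w1 w2 w3 w4", "w1 wX w3 w4"))
```

Under this definition, a rate of about 73% would mean roughly 27 word errors per 100 reference words. Because insertions count as errors, the value can fall below zero for very poor hypotheses.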


Metadata
Title
Towards a Speech Recognizer for Multiple Languages Using Arabic Acoustic Model Application to Amazigh Language
Authors
Ali Sadiqui
Ahmed Zinedine
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-73500-9_5
