Published in: International Journal of Speech Technology 2/2016

27 October 2015 | Special Issue Article

Towards an open platform based on HPSG formalism for the standard Arabic language

Authors: Mourad Loukam, Amar Balla, Mohamed Tayeb Laskri

Abstract

This paper presents an open software platform for analysing texts in Standard Arabic. Its originality lies in being an integrated software environment that offers all the resources and tools needed to parse Arabic texts. To formalise the elements of the language, the HPSG (Head-driven Phrase Structure Grammar) formalism was adopted because of its effectiveness and its adaptability to any natural language. The platform is currently operational and covers an appreciable range of Arabic syntactic structures. In the medium term, our objective is to use the platform to develop applications for the Arabic language, such as interfaces, learning, and information retrieval.

Metadata
Title
Towards an open platform based on HPSG formalism for the standard Arabic language
Authors
Mourad Loukam
Amar Balla
Mohamed Tayeb Laskri
Publication date
27 October 2015
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 2/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-015-9314-4
