Published in: International Journal of Speech Technology 2/2017

March 23, 2017

Extraction of terms and semantic relationships from Arabic texts for automatic construction of an ontology

Authors: Ali Benabdallah, Mohammed AlaEddine Abderrahim, Mohammed El-Amine Abderrahim

Abstract

Building an ontology from a textual corpus starts with the conceptualization phase, which extracts the ontology's concepts; these concepts are then linked by semantic relationships. In this paper, we describe an approach to constructing an ontology from an Arabic textual corpus. We first collect and prepare the corpus through normalization, stop-word removal, and stemming. To extract the terms of the ontology, we then apply the "repeated segments method", a statistical method for extracting simple and complex terms. To select segments with sufficient weight, we apply term frequency-inverse document frequency (TF-IDF) weighting. To link the selected terms by semantic relationships, we apply an automatic method for learning linguistic markers from text. This method requires a dataset of relationship pairs, which we extract from two external resources: an Arabic dictionary of synonyms and antonyms, and the lexical database Arabic WordNet. Finally, we present the results of our experiments on our textual corpus. The evaluation of our approach shows encouraging results in terms of recall and precision.
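The term-extraction steps named in the abstract (normalization, stop-word removal, repeated segments, TF-IDF weighting) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the normalization scheme, the function names, and the thresholds `max_len` and `min_count` are all assumptions made for illustration. Stemming is omitted, and the TF-IDF variant shown (raw count times the logarithm of the inverse document frequency) is only one common formulation.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real system would use a much larger one.
STOP_WORDS = {"في", "من", "على", "إلى", "عن"}

# Arabic short-vowel marks (U+064B to U+0652) and tatweel (U+0640).
DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Alef with hamza above, hamza below, and madda (assumed normalization targets).
ALEF_VARIANTS = re.compile(r"[\u0623\u0625\u0622]")


def normalize(text):
    """Strip diacritics and tatweel, and map alef variants to bare alef."""
    return ALEF_VARIANTS.sub("\u0627", DIACRITICS.sub("", text))


def tokenize(text):
    """Normalize, split on whitespace, and drop (normalized) stop words."""
    stops = {normalize(word) for word in STOP_WORDS}
    return [tok for tok in normalize(text).split() if tok not in stops]


def count_segment(segment, tokens):
    """Count occurrences of a token tuple as a contiguous run in tokens."""
    n = len(segment)
    return sum(
        1 for i in range(len(tokens) - n + 1) if tuple(tokens[i:i + n]) == segment
    )


def repeated_segments(docs_tokens, max_len=3, min_count=2):
    """Return every n-gram (1..max_len tokens) seen at least min_count times."""
    counts = Counter()
    for tokens in docs_tokens:
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return {seg for seg, count in counts.items() if count >= min_count}


def tf_idf(segment, tokens, docs_tokens):
    """Raw term frequency in one document times log(N / document frequency)."""
    tf = count_segment(segment, tokens)
    df = sum(1 for doc in docs_tokens if count_segment(segment, doc) > 0)
    return tf * math.log(len(docs_tokens) / df) if df else 0.0
```

In this sketch, candidate terms are the segments returned by `repeated_segments`, and a segment would be kept only if its `tf_idf` score exceeds a chosen threshold. Because stop words are removed before segments are formed, a segment may join words that were not adjacent in the original text; whether the paper handles this the same way is not stated in the abstract.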

Metadata
Title
Extraction of terms and semantic relationships from Arabic texts for automatic construction of an ontology
Authors
Ali Benabdallah
Mohammed AlaEddine Abderrahim
Mohammed El-Amine Abderrahim
Publication date
March 23, 2017
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 2/2017
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-017-9405-5
