
2016 | OriginalPaper | Chapter

Improving Open Information Extraction for Semantic Web Tasks

Authors: Cheikh Kacfah Emani, Catarina Ferreira Da Silva, Bruno Fiès, Parisa Ghodous

Published in: Transactions on Computational Collective Intelligence XXI

Publisher: Springer Berlin Heidelberg


Abstract

Open Information Extraction (OIE) aims to automatically identify all the possible assertions within a sentence. The result of this task is usually a set of triples (subject, predicate, object). In this paper, we first present what OIE is and how it can be improved when working within a given domain of knowledge. Using a corpus of sentences from the building engineering construction domain, we obtain an improvement of more than 18 %. Next, we show how OIE can serve as the basis of a high-level semantic web task. Here we apply OIE to the formalisation of natural language definitions. We test this formalisation task on a corpus of sentences defining concepts found in the pizza ontology. At this stage, 70.27 % of our 37-sentence corpus is fully rewritten in OWL DL.
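To make the triple format concrete, the following minimal Python sketch shows one way to hold OIE output as (subject, predicate, object) triples. The sentence and its two triples are hand-written illustrations, not output of the authors' system, and the names `Triple` and `share_percent` are hypothetical. The count of 26 fully rewritten sentences is an assumption inferred from the reported 70.27 % of 37 sentences; the abstract does not state it.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One OIE assertion: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: str


# Hand-written example sentence and triples (illustrative only).
sentence = "A vegetarian pizza is a pizza that has no meat topping."
triples = [
    Triple("a vegetarian pizza", "is", "a pizza"),
    Triple("a vegetarian pizza", "has", "no meat topping"),
]


def share_percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


# Assumption: 26 of the 37 sentences, the count consistent with 70.27 %.
fully_rewritten = 26
corpus_size = 37

for t in triples:
    print(t)
print(share_percent(fully_rewritten, corpus_size))
```

A named tuple keeps each assertion immutable and unpackable, which suits output that is only read by later steps such as the mapping to OWL DL.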


Footnotes
2. An exhaustive list of labels for phrases is available in the Penn Treebank [6].
4. With a large ontology, such a comparison must take advantage of an index for the sake of scalability.
6. \(r_i\) is the subsumption or the set of elements of a more complex restriction (URI of the restriction property, OWL keywords for the type of the restriction, etc.) as explained in the introduction of Sect. 4.3.
7. Only for better understanding. The choice of or would not have changed anything.
10. Concepts’ tokens are usually surrounded by adjectives, adverbs, prepositions, etc.
 
Literature
3. Bast, H., Haussmann, E.: Open information extraction via contextual sentence decomposition. In: 2013 IEEE Seventh International Conference on Semantic Computing (ICSC), pp. 154–159. IEEE Computer Society (2013)
4. Bast, H., Haussmann, E.: More informative open information extraction via simple inference. In: de Rijke, M., Kenter, T., de Vries, A.P., Zhai, C.X., de Jong, F., Radinsky, K., Hofmann, K. (eds.) ECIR 2014. LNCS, vol. 8416, pp. 585–590. Springer, Heidelberg (2014)
5. Berg, J.: Aristotle’s theory of definition. In: ATTI del Convegno Internazionale di Storia della Logica, pp. 19–30 (1982)
6. Bies, A., Ferguson, M., Katz, K., MacIntyre, R., Tredinnick, V., Kim, G., Marcinkiewicz, M.A., Schasberger, B.: Bracketing guidelines for treebank II Style Penn Treebank project. University of Pennsylvania 97 (1995)
7. Bühmann, L., Fleischhacker, D., Lehmann, J., Melo, A., Völker, J.: Inductive lexical learning of class expressions. In: Janowicz, K., Schlobach, S., Lambrix, P., Hyvönen, E. (eds.) EKAW 2014. LNCS, vol. 8876, pp. 42–53. Springer, Heidelberg (2014)
10. Del Corro, L., Gemulla, R.: Clausie: clause-based open information extraction. In: Proceedings of the 22nd International Conference on World Wide Web, WWW 2013, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, pp. 355–366 (2013)
11. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 1535–1545 (2011)
12. Hadjieleftheriou, M., Srivastava, D.: Weighted set-based string similarity. IEEE Data Eng. Bull. 33(1), 25–36 (2010)
13. Horridge, M., Jupp, S., Moulton, G., Rector, A., Stevens, R., Wroe, C.: A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools Edition 1.2. The University of Manchester, Manchester (2009)
14. Kacfah Emani, C.H., Ferreira Da Silva, C., Fiès, B., Ghodous, P.: Improving open information extraction using domain knowledge. In: Surfacing the Deep and the Social Web (SDSW), Co-Located with The 13th ISWC, October 2014
15. Kacfah Emani, C.H., Ferreira Da Silva, C., Fiès, B., Ghodous, P., Khosrowshahi, F.: Structural sentence decomposition via open information extraction. In: 18th International Conference Information Visualisation (IV2014), July 2014
16. Lehmann, J., Auer, S., Bühmann, L., Tramp, S.: Class expression learning for ontology engineering. Web Semant. Sci. Serv. Agents World Wide Web 9(1), 71–81 (2011)
17. Mausam, Schmitz, M., Bart, R., Soderland, S., Etzioni, O.: Open language learning for information extraction. In: EMNLP-CoNLL, pp. 523–534. Association for Computational Linguistics (2012)
18. Nguyen, V.T., Sallaberry, C., Gaio, M.: Mesure de la similarité entre termes et labels de concepts ontologiques. In: Conférence en Recherche D’information et Applications, pp. 415–430 (2013)
19. Sayah, K.: Automated Norm Extraction from Legal Texts. Master’s thesis, Utrecht University, August 2004
20. Tsatsaronis, G., Petrova, A., Kissa, M., Ma, Y., Distel, F., Baader, F., Schroeder, M.: Learning formal definitions for biomedical concepts. In: OWLED (2013)
21. Unger, C., Bühmann, L., Lehmann, J., Ngonga Ngomo, A.C., Gerber, D., Cimiano, P.: Template-based question answering over RDF data. In: Proceedings of the 21st International Conference on World Wide Web, WWW 2012, pp. 639–648. ACM, New York (2012)
22. Unger, C., Cimiano, P.: Pythia: compositional meaning construction for ontology-based question answering on the semantic web. In: Muñoz, R., Montoyo, A., Métais, E. (eds.) NLDB 2011. LNCS, vol. 6716, pp. 153–160. Springer, Heidelberg (2011)
23. Völker, J., Hitzler, P., Cimiano, P.: Acquisition of OWL DL axioms from lexical resources. In: Franconi, E., Kifer, M., May, W. (eds.) ESWC 2007. LNCS, vol. 4519, pp. 670–685. Springer, Heidelberg (2007)
24. Völker, J., Rudolph, S.: Lexico-logical acquisition of OWL DL axioms. In: Medina, R., Obiedkov, S. (eds.) ICFCA 2008. LNCS (LNAI), vol. 4933, pp. 62–77. Springer, Heidelberg (2008)
25. Wächter, T., Schroeder, M.: Semi-automated ontology generation within obo-edit. Bioinformatics 26(12), i88–i96 (2010)
26. Winkler, W.E.: The state of record linkage and current research problems. Technical report, Statistical Research Division, U.S. Census Bureau (1999)
Metadata
DOI
https://doi.org/10.1007/978-3-662-49521-6_6
