2019 | OriginalPaper | Book chapter

Extraction of RDF Statements from Text

Written by: Jose L. Martinez-Rodriguez, Ivan Lopez-Arevalo, Ana B. Rios-Alvarado, Julio Hernandez, Edwin Aldana-Bobadilla

Published in: Knowledge Graphs and Semantic Web

Publisher: Springer International Publishing

Abstract

The vision of the Semantic Web is to provide information with a well-defined meaning so that computers and people can work collaboratively. The RDF model supports this vision by representing and linking resources and their descriptions through defined schemas and vocabularies. However, much of the information that could be represented this way is contained in plain text, and manually annotating large-scale data sources such as the Web is infeasible for humans. This paper therefore presents a strategy for extracting RDF statements from text and representing them. The idea is to provide an architecture that receives sentences and returns triples whose elements are linked to resources and vocabularies of the Semantic Web. Results from an implementation that follows the proposed strategy demonstrate the feasibility of representing RDF statements from text.
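
To make the input and output of such an architecture concrete, the following is a minimal sketch, not the authors' implementation. It assumes DBpedia namespaces as the target vocabulary and uses two small hypothetical lookup tables (`ENTITY_LINKS`, `RELATION_LINKS`) in place of real entity-linking and relation-mapping components; the function name `extract_triples` is likewise an illustration only.

```python
from dataclasses import dataclass
from typing import List

# Assumption: DBpedia is used as the target knowledge base and vocabulary.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

# Hypothetical lookup tables standing in for entity linking and
# relation-to-property mapping; a real system would query services instead.
ENTITY_LINKS = {
    "Tim Berners-Lee": DBR + "Tim_Berners-Lee",
    "London": DBR + "London",
}
RELATION_LINKS = {
    "was born in": DBO + "birthPlace",
}


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self) -> str:
        # Serialize as one N-Triples line with all three terms as IRIs.
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


def extract_triples(sentence: str) -> List[Triple]:
    """Return linked triples for a sentence of the form 'X <relation> Y.'."""
    text = sentence.strip().rstrip(".")
    triples = []
    for phrase, prop in RELATION_LINKS.items():
        left, sep, right = text.partition(f" {phrase} ")
        if not sep:
            continue  # relation phrase not present in this sentence
        subj = ENTITY_LINKS.get(left.strip())
        obj = ENTITY_LINKS.get(right.strip())
        if subj and obj:  # keep only statements whose entities were linked
            triples.append(Triple(subj, prop, obj))
    return triples


if __name__ == "__main__":
    for t in extract_triples("Tim Berners-Lee was born in London."):
        print(t.to_ntriples())
```

The sketch only handles sentences whose relation phrase appears in the table; it is meant to show the shape of the sentence-in, linked-triple-out contract rather than any particular extraction technique.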

Footnotes
1
An ontology defines the concepts, terms, classes, taxonomies, and rules of a domain [11].
 
2
In this context, knowledge elements refer to Conceptual Knowledge [22] in terms of things or concepts and the way they are related to each other with the support of an ontology.
 
3
https://www.w3.org/TR/sparql11-overview/. All URLs in this paper were last accessed on 2019/04/15.
 
4
Unlike formatted text, plain text contains no style information or graphical objects; it consists only of readable characters.
 
5
Semantic roles identify the participants in an event guided by a verb and its underlying relationship [13].
 
9
WordNet is a lexical database for English: http://wordnet.princeton.edu.
 
10
From a First Order Logic perspective, the predicate of a sentence corresponds to the main verb and any auxiliaries surrounding it.
 
11
Although the architecture accepts only plain text as input, other types of data could also be considered, such as structured data (e.g., databases, tables), images, or raw data (e.g., data from sensors).
 
12
In this work, we refer to named entities simply as entities.
 
14
This process is often supported by the Semantic Role Labeling task, which helps to determine the role or action performed by an entity within a statement.
 
23
The LonelyPlanet dataset was originally downloaded by Martin Kavalec from the site http://www.lonelyplanet.com/destinations.
 
References
1.
Antoniou, G., Groth, P.T., van Harmelen, F., Hoekstra, R.: A Semantic Web Primer, 3rd edn. MIT Press, Cambridge (2012)
9.
Dutta, A., Meilicke, C., Stuckenschmidt, H.: Enriching structured knowledge with open information. In: Gangemi, A., Leonardi, S., Panconesi, A. (eds.) World Wide Web Conference (WWW), pp. 267–277. ACM (2015)
10.
Exner, P., Nugues, P.: Entity extraction: from unstructured text to DBpedia RDF triples. In: The Web of Linked Entities Workshop (WoLE 2012), pp. 58–69. CEUR-WS (2012)
15.
Hernández, D., Hogan, A., Krötzsch, M.: Reifying RDF: what works well with Wikidata? In: Liebig, T., Fokoue, A. (eds.) International Workshop on Scalable Semantic Web Knowledge Base Systems Co-located with ISWC, pp. 32–47. CEUR-WS.org (2015)
16.
Waitelonis, J., Exeler, C., Sack, H.: Linked data enabled generalized vector space model to improve document retrieval. In: NLP & DBpedia Workshop in Conjunction with ISWC 2015. CEUR (2015)
18.
Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977)
19.
Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J.R., Bethard, S., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Annual Meeting of the Association for Computational Linguistics (ACL), pp. 55–60 (2014)
20.
Martinez-Rodriguez, J.L., Hernandez, J., Lopez-Arevalo, I., Rios-Alvarado, A.B.: A strategy for the integration of named entity extraction and linking results. In: Proceedings of the 3rd International Workshop on Semantic Web 2018 Co-located with 15th International Congress on Information (INFO 2018), 7 March 2018, Havana, Cuba, pp. 13–20. CEUR-WS.org (2018)
21.
Martinez-Rodriguez, J.L., Hogan, A., Lopez-Arevalo, I.: Information extraction meets the semantic web: a survey. Semant. Web J. (2018, to appear)
23.
Pinto, A.M., Oliveira, H.G., Alves, A.O.: Comparing the performance of different NLP toolkits in formal and social media text. In: 5th Symposium on Languages, Applications and Technologies, SLATE, pp. 3:1–3:16 (2016)
24.
Randolph, J.J.: Free-marginal multirater kappa (multirater k [free]): An alternative to Fleiss' fixed-marginal multirater kappa. In: Joensuu Learning and Instruction Symposium (2005)
25.
Rusu, D., Fortuna, B., Mladenic, D.: Automatically annotating text with linked open data. In: Bizer, C., Heath, T., Berners-Lee, T., Hausenblas, M. (eds.) WWW2011 Workshop on Linked Data on the Web. CEUR-WS.org (2011)
Metadata
Title
Extraction of RDF Statements from Text
Written by
Jose L. Martinez-Rodriguez
Ivan Lopez-Arevalo
Ana B. Rios-Alvarado
Julio Hernandez
Edwin Aldana-Bobadilla
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-21395-4_7