2016 | OriginalPaper | Book chapter

Enabling Fine-Grained RDF Data Completeness Assessment

Written by: Fariz Darari, Simon Razniewski, Radityo Eko Prasojo, Werner Nutt

Published in: Web Engineering

Publisher: Springer International Publishing

Abstract

Nowadays, more and more RDF data is becoming available on the Semantic Web. While the Semantic Web is generally incomplete by nature, on certain topics it already contains complete information, and thus queries may return all answers that exist in reality. In this paper we develop a technique to check query completeness based on RDF data annotated with completeness information, taking into account data-specific inferences that lead to an inference problem which is \(\Pi^P_2\)-complete. We then identify a practically relevant fragment of completeness information, suitable for crowdsourced, entity-centric RDF data sources such as Wikidata, for which we develop an indexing technique that allows completeness reasoning to scale to Wikidata-scale data sources. We verify the applicability of our framework using Wikidata and develop COOL-WD, a completeness tool for Wikidata, used to annotate Wikidata with completeness statements and to reason about the completeness of query answers over Wikidata. The tool is available at http://cool-wd.inf.unibz.it/.

Footnotes
4
Since in this work we focus on conjunctive queries which are monotonic, the direction \({\llbracket Q \rrbracket _{G'}} \supseteq {\llbracket Q \rrbracket _{G}}\) comes for free.
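
The monotonicity argument in this footnote can be illustrated with a minimal sketch. It assumes a conjunctive query is a list of triple patterns whose variables start with `?`, and a graph is a set of string triples; the `evaluate` helper and the example data are hypothetical and are not taken from the paper or from COOL-WD.

```python
def is_var(term):
    # Assumption: variables are strings that start with "?".
    return term.startswith("?")


def evaluate(query, graph):
    """Evaluate a conjunctive query (a list of triple patterns) over a graph
    (a set of triples). Returns the set of answers, each answer being a
    frozenset of (variable, value) pairs."""
    solutions = [{}]
    for pattern in query:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                matches = True
                for p_term, t_term in zip(pattern, triple):
                    if is_var(p_term):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(p_term, t_term) != t_term:
                            matches = False
                            break
                    elif p_term != t_term:
                        matches = False
                        break
                if matches:
                    extended.append(candidate)
        solutions = extended
    return {frozenset(b.items()) for b in solutions}


# Hypothetical data: g_prime extends g with one more triple.
g = {("alice", "child", "bob")}
g_prime = g | {("alice", "child", "carol")}
query = [("alice", "child", "?c")]

# Monotonicity: every answer over g is also an answer over g_prime.
assert evaluate(query, g) <= evaluate(query, g_prime)
```

Because adding triples can only add matches for each pattern, only the reverse inclusion has to be established to show that a query is complete.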
 
6
We do not allow the subject to be a variable, since such statements are not practically reasonable (e.g., a statement that the data is complete for all entities and all values of the predicate child).
 
9
We do not measure query evaluation time for the failure case, since query evaluation is independent of the completeness of the query.
 
Metadata
Title
Enabling Fine-Grained RDF Data Completeness Assessment
Written by
Fariz Darari
Simon Razniewski
Radityo Eko Prasojo
Werner Nutt
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-38791-8_10