2016 | OriginalPaper | Chapter

Enabling Fine-Grained RDF Data Completeness Assessment

Authors: Fariz Darari, Simon Razniewski, Radityo Eko Prasojo, Werner Nutt

Published in: Web Engineering

Publisher: Springer International Publishing

Abstract

Nowadays, more and more RDF data is becoming available on the Semantic Web. While the Semantic Web is generally incomplete by nature, on certain topics it already contains complete information, and thus queries may return all answers that exist in reality. In this paper, we develop a technique to check query completeness based on RDF data annotated with completeness information, taking into account data-specific inferences that lead to an inference problem which is \(\varPi ^P_2\)-complete. We then identify a practically relevant fragment of completeness information, suitable for crowdsourced, entity-centric RDF data sources such as Wikidata, for which we develop an indexing technique that allows completeness reasoning to scale to Wikidata-scale data sources. We verify the applicability of our framework using Wikidata and develop COOL-WD, a completeness tool for Wikidata, used to annotate Wikidata with completeness statements and to reason about the completeness of query answers over Wikidata. The tool is available at http://cool-wd.inf.unibz.it/.
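As a rough illustration of the idea in the abstract, the following Python sketch checks whether a query's answers are guaranteed to be complete. It rests on simplifying assumptions that are not taken from the chapter's formal definitions: a completeness statement is modeled as a (subject, predicate) pair asserting that the graph holds all values of that predicate for that subject, predicates in the query are constants, and each pattern's subject is either a constant or a variable bound by an earlier pattern. The names `is_query_complete`, `graph`, `statements`, and `query`, as well as the toy data, are hypothetical and serve only the example.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str, str]  # terms starting with "?" are variables


def is_var(term: str) -> bool:
    return term.startswith("?")


def is_query_complete(
    patterns: List[Pattern],
    graph: Set[Triple],
    statements: Set[Tuple[str, str]],
) -> bool:
    """Return True if every (subject, predicate) pair the query touches is covered.

    Assumption: patterns are processed in order, and each subject is a
    constant or a variable bound by an earlier pattern.
    """
    bindings: List[Dict[str, str]] = [{}]
    for s, p, o in patterns:
        next_bindings: List[Dict[str, str]] = []
        for b in bindings:
            subj = b.get(s, s) if is_var(s) else s
            if is_var(subj):
                raise ValueError("subject must be bound before it is used")
            # Without a statement for (subj, p), further values may exist in reality.
            if (subj, p) not in statements:
                return False
            # Data-specific step: extend the binding with the values found in the graph.
            for gs, gp, go in graph:
                if gs != subj or gp != p:
                    continue
                if is_var(o):
                    if o in b and b[o] != go:
                        continue
                    next_bindings.append({**b, o: go})
                elif o == go:
                    next_bindings.append(b)
        bindings = next_bindings
    return True


# Toy data (invented for illustration only).
graph = {
    ("alice", "child", "bob"),
    ("bob", "child", "carol"),
}
statements = {("alice", "child"), ("bob", "child")}
query = [("alice", "child", "?c"), ("?c", "child", "?g")]

complete_with_all = is_query_complete(query, graph, statements)
complete_without_bob = is_query_complete(query, graph, {("alice", "child")})
```

Storing the statements in a hash set keyed by (subject, predicate) makes each lookup constant time on average. This is only a stand-in for an index and is not meant to reproduce the indexing technique developed in the chapter.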

Footnotes
4
Since in this work we focus on conjunctive queries which are monotonic, the direction \({\llbracket Q \rrbracket _{G'}} \supseteq {\llbracket Q \rrbracket _{G}}\) comes for free.
 
6
We do not allow the subject to be a variable, since such statements are not practically reasonable (e.g., a statement claiming completeness for all entities and values of the predicate child).
 
9
We do not measure query evaluation time for the failure case, since query evaluation is independent of the completeness of the query.
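The monotonicity remark in footnote 4 can be illustrated with a small Python sketch, reading \({\llbracket Q \rrbracket _{G}}\) as the set of answers of query Q over graph G. The evaluator below is a naive, hypothetical implementation of conjunctive query evaluation over triples, written only for this example. It shows that adding triples to a graph can add answers but never removes any.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Triple = Tuple[str, str, str]
Answer = FrozenSet[Tuple[str, str]]


def evaluate(patterns: List[Triple], graph: Set[Triple]) -> Set[Answer]:
    """Naively evaluate a conjunctive query (a list of triple patterns)."""
    solutions: List[Dict[str, str]] = [{}]
    for pattern in patterns:
        extended: List[Dict[str, str]] = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                ok = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it against its earlier binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        solutions = extended
    return {frozenset(b.items()) for b in solutions}


# Toy graphs (invented): g_larger extends g_smaller with one more triple.
g_smaller = {("a", "child", "b")}
g_larger = g_smaller | {("a", "child", "c")}
q = [("a", "child", "?x")]

answers_smaller = evaluate(q, g_smaller)
answers_larger = evaluate(q, g_larger)
# Every answer over the smaller graph is also an answer over the larger one.
superset_holds = answers_smaller <= answers_larger
```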
 
Metadata
Title: Enabling Fine-Grained RDF Data Completeness Assessment
Authors: Fariz Darari, Simon Razniewski, Radityo Eko Prasojo, Werner Nutt
Copyright Year: 2016
DOI: https://doi.org/10.1007/978-3-319-38791-8_10
