2019 | OriginalPaper | Chapter

Quality-Driven Query Processing over Federated RDF Data Sources

Abstract

Integrating data from heterogeneous sources is a common task in many domains and a prerequisite for data-driven applications. The sources range from publicly available datasets to sources within company data lakes. The value gained by integrating and analyzing the data depends greatly on the quality of the underlying data. Consequently, when querying heterogeneous data sources is the means of integration, query processing needs to take quality aspects into account. Quality-driven query processing over RDF data sources studies approaches that use data quality descriptions of the sources to determine optimal query plans. In contrast to most federated query approaches, the quality of an optimal plan, and thus of the retrieved data, depends not only on efficiency, typically measured as execution time, but also on other quality criteria. In this work, we present the challenges of considering multiple quality criteria in federated query processing and derive our problem statement accordingly. We then present our research questions and the associated hypotheses. Finally, we outline our approach, including an evaluation plan, and provide preliminary results.
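
To make the idea of choosing a plan by several quality criteria more concrete, the following is a minimal sketch, not the chapter's method. It assumes that each candidate query plan comes with an estimated execution time, completeness and accuracy, and that the criteria are combined by a weighted sum over min-max normalised scores. The `Plan` class, the criterion names, the weights and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical candidate query plan with estimated quality values."""
    name: str
    execution_time_s: float  # lower is better
    completeness: float      # in [0, 1], higher is better
    accuracy: float          # in [0, 1], higher is better


def normalise(values, higher_is_better):
    """Min-max normalise values to [0, 1] so that 1.0 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All plans are equal on this criterion; it cannot discriminate.
        return [1.0 for _ in values]
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def rank_plans(plans, weights):
    """Return (score, plan) pairs sorted from best to worst score.

    `weights` maps the assumed criterion names "time", "completeness"
    and "accuracy" to non-negative importance weights.
    """
    time = normalise([p.execution_time_s for p in plans], higher_is_better=False)
    comp = normalise([p.completeness for p in plans], higher_is_better=True)
    acc = normalise([p.accuracy for p in plans], higher_is_better=True)
    scored = [
        (weights["time"] * t + weights["completeness"] * c + weights["accuracy"] * a, p)
        for t, c, a, p in zip(time, comp, acc, plans)
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative values only; they are not taken from the chapter.
plans = [
    Plan("fast", execution_time_s=1.0, completeness=0.60, accuracy=0.70),
    Plan("balanced", execution_time_s=2.0, completeness=0.80, accuracy=0.80),
    Plan("thorough", execution_time_s=4.0, completeness=0.95, accuracy=0.90),
]

efficiency_only = {"time": 1.0, "completeness": 0.0, "accuracy": 0.0}
quality_aware = {"time": 0.2, "completeness": 0.4, "accuracy": 0.4}

best_by_time = rank_plans(plans, efficiency_only)[0][1]
best_by_quality = rank_plans(plans, quality_aware)[0][1]
print(best_by_time.name, best_by_quality.name)
```

With these made-up numbers, an efficiency-only weighting and a quality-aware weighting select different plans, which is the situation the abstract describes: the plan that is optimal by execution time need not be optimal once other quality criteria count. A weighted sum is only one possible aggregation; other multi-criteria methods could be substituted.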

Footnotes
4
The mediator uses query correspondence assertions (QCAs) in order to determine contents, i.e. available relations, of the sources.
 
Metadata
Title
Quality-Driven Query Processing over Federated RDF Data Sources
Author
Lars Heling
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-32327-1_40