ABSTRACT
We present statistics on real world SPARQL queries that may be of interest for building SPARQL query processing engines and benchmarks. In particular, we analyze the syntactical structure of queries in a log of about 3 million queries, harvested from the DBPedia SPARQL endpoint. Although a sizable portion of the log is shown to consist of so-called conjunctive SPARQL queries, non-conjunctive queries that use SPARQL's union or optional operators are more than substantial. It is known, however, that query evaluation quickly becomes hard for queries including the non-conjunctive operators union or optional. We therefore drill deeper into the syntactical structure of the queries that are not conjunctive and show that in 50% of the cases, these queries satisfy certain structural restrictions that imply tractable evaluation in theory. We hope that the identification of these restrictions can aid in the future development of practical heuristics for processing non-conjunctive SPARQL queries.
- D. J. Abadi, A. Marcus, S. Madden, and K. J. Hollenbach. Scalable semantic web data management using vertical partitioning. In Proceedings of VLDB 2007, pages 411--422. ACM, 2007. Google ScholarDigital Library
- S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, 1995. Google ScholarDigital Library
- M. Arias, J. D. Fernández, M. A. Martínez-Prieto, and P. de la Fuente. An empirical study of real-world sparql queries. CoRR, abs/1103.5043, 2011.Google Scholar
- E. I. Chong, S. Das, G. Eadon, and J. Srinivasan. An efficient SQL-based RDF querying scheme. In Proceedings of VLDB 2005, pages 1216--1227, 2005. Google ScholarDigital Library
- G. H. L. Fletcher and P. W. Beck. Scalable indexing of RDF graphs for efficient join processing. In D. W.-L. Cheung, I.-Y. Song, W. W. Chu, X. Hu, and J. J. Lin, editors, CIKM, pages 1513--1516. ACM, 2009. Google ScholarDigital Library
- G. Gottlob, N. Leone, and F. Scarcello. The complexity of acyclic conjunctive queries. J. ACM, 48(3):431--498, 2001. Google ScholarDigital Library
- K. Möller, M. Hausenblas, R. Cyganiak, S. Handschuh, and G. Grimnes. Learning from linked open data usage: Patterns & metrics. In Proceedings of the Web Science Conference 2010, 2010.Google Scholar
- T. Neumann and G. Weikum. Scalable join processing on very large RDF graphs. In SIGMOD 2009 Conference Proceedings, pages 627--640, 2009. Google ScholarDigital Library
- T. Neumann and G. Weikum. The RDF-3X engine for scalable management of RDF data. VLDB J., 19(1):91--113, 2010. Google ScholarDigital Library
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and complexity of SPARQL. ACM Trans. Database Syst., 34(3), 2009. Google ScholarDigital Library
- A. Polleres. From SPARQL to rules (and back). In WWW 2007 Conference Proceedings, pages 787--796. Google ScholarDigital Library
- E. Prud'hommeaux and A. Seaborne. SPARQL query language for RDF. Technical report, W3C Recommendation, 2008.Google Scholar
- Resource description framework( (RDF). Technical report, W3C. http://www.w3.org/RDF/.Google Scholar
- M. Schmidt, M. Meier, and G. Lausen. Foundations of SPARQL query optimization. In ICDT 2010 Proceedings, pages 4--33, 2010. Google ScholarDigital Library
- L. Sidirourgos, R. Goncalves, M. L. Kersten, N. Nes, and S. Manegold. Column-store support for RDF data management: not all swans are white. PVLDB, 1(2):1553--1563, 2008. Google ScholarDigital Library
- M. Y. Vardi. The complexity of relational query languages (extended abstract). In STOC, pages 137--146. ACM, 1982. Google ScholarDigital Library
- M.-E. Vidal, E. Ruckhaus, T. Lampo, A. Martínez, J. Sierra, and A. Polleres. Efficiently joining group patterns in SPARQL queries. In 7th Extended Semantic Web Conference, volume 6088 of Lecture Notes in Computer Science, pages 228--242. Springer, 2010. Google ScholarDigital Library
- W3C SWEO Community Project. Linking open data. http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.Google Scholar
- C. Weiss, P. Karras, and A. Bernstein. Hexastore: sextuple indexing for semantic web data management. PVLDB, 1(1):1008--1019, 2008. Google ScholarDigital Library
Index Terms
- What are real SPARQL queries like?
Recommendations
Canonicalisation of Monotone SPARQL Queries
The Semantic Web – ISWC 2018AbstractCaching in the context of expressive query languages such as SPARQL is complicated by the difficulty of detecting equivalent queries: deciding if two conjunctive queries are equivalent is NP-complete, where adding further query features makes the ...
Equivalence and minimization of conjunctive queries under combined semantics
ICDT '12: Proceedings of the 15th International Conference on Database TheoryThe problems of query containment, equivalence, and minimization are fundamental problems in the context of query processing and optimization. In their classic work [2] published in 1977, Chandra and Merlin solved the three problems for the language of ...
Semantic Acyclicity Under Constraints
PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database SystemsA conjunctive query (CQ) is semantically acyclic if it is equivalent to an acyclic one. Semantic acyclicity has been studied in the constraint-free case, and deciding whether a query enjoys this property is NP-complete. However, in case the database is ...
Comments