ABSTRACT
In this paper, we formalize the problem of Basic Graph Pattern (BGP) optimization for SPARQL queries and main memory graph implementations of RDF data. We define and analyze the characteristics of heuristics for selectivity-based static BGP optimization. The heuristics range from simple triple pattern variable counting to more sophisticated selectivity estimation techniques. Customized summary statistics for RDF data enable the selectivity estimation of joined triple patterns and the development of efficient heuristics. Using the Lehigh University Benchmark (LUBM), we evaluate the performance of the heuristics for the queries provided by the LUBM and discuss some of them in more details.
- S. Bechhofer, F. van Harmelen, J. Hendler, I. Horrocks, D. L. McGuinness, P. F. Patel-Schneider, and L. A. Stein. OWL Web Ontology Language Reference. Technical report, W3C, 2004.Google Scholar
- D. Beckett and B. McBride. RDF/XML Syntax Specification. Technical report, W3C, 2004.Google Scholar
- A. Bernstein, C. Kiefer, and M. Stocker. OptARQ: A SPARQL Optimization Approach based on Triple Pattern Selectivity Estimation. Technical Report ifi-2007.03, University of Zurich, Department of Informatics, Winterthurerstrasse 190, 8057 Zurich, Switzerland, March 2007.Google Scholar
- Bernstein, M. Stocker, and C. Kiefer. SPARQL Query Optimization Using Selectivity Estimation. In Poster Proceedings of the 6th International Semantic Web Conference (ISWC), Lecture Notes in Computer Science. Springer, 2007.Google Scholar
- J. Carroll, I. Dickinson, C. Dollin, D. Reynolds, A. Seaborne, and K. Wilkinson. Jena: Implementing the Semantic Web Recommendations. Technical report, HP Laboratories, 2003.Google Scholar
- Y. Guo, Z. Pan, and J. Heflin. LUBM: A Benchmark for OWL Knowledge Base Systems. Web Semantics: Science, Services and Agents on the World Wide Web, 3(2-3):158--182, 2005. Google ScholarDigital Library
- A. Harth and S. Decker. Optimized Index Structures for Querying RDF from the Web. In Proc. of the 3rd Latin American Web Congress, page 71, 2005. Google ScholarDigital Library
- O. Hartig and R. Heese. The SPARQL Query Graph Model for Query Optimization. In Proc. of the 4th European Semantic Web Conf., 2007. Google ScholarDigital Library
- G. Klyne and J. J. Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. Technical report, W3C, 2004.Google Scholar
- B. J. Oommen and L. Rueda. The Efficiency of Modern-Day Histogram-Like Techniques for Query Optimization. The Computer Journal, 45(2):494--510, 2002.Google ScholarCross Ref
- J. Pérez, M. Arenas, and C. Gutierrez. Semantics and Complexity of SPARQL. In Proc. of the 5th Int. Semantic Web Conf., pages 30--43, 2006. Google ScholarDigital Library
- G. Piatetsky-Shapiro and C. Connell. Accurate Estimation of the Number of Tuples Satisfying a Condition. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 256--276, 1984. Google ScholarDigital Library
- V. Poosala and Y. E. Ioannidis. Selectivity Estimation Without the Attribute Value Independence Assumption. In The VLDB Journal, pages 486--495, 1997. Google ScholarDigital Library
- E. Prud'hommeaux and A. Seaborne. SPARQL Query Language for RDF. Technical report, W3C, 2008.Google Scholar
- P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access Path Selection in a Relational Database Management System. In Proc. of the ACM SIGMOD Int. Conf. on Management of Data, pages 23--34, 1979. Google ScholarDigital Library
- M. Steinbrunn, G. Moerkotte, and A. Kemper. Heuristic and Randomized Optimization for the Join Ordering Problem. VLDB Journal: Very Large Data Bases, 6(3):191--208, 1997. Google ScholarDigital Library
- M. Stonebraker and L. A. Rowe. The Design of POSTGRES. In SIGMOD Conference, pages 340--355, 1986. Google ScholarDigital Library
- E. Wong and K. Youssefi. Decomposition: A Strategy for Query Processing. ACM Trans. Database Syst., 1(3):223--241, 1976. Google ScholarDigital Library
Index Terms
- SPARQL basic graph pattern optimization using selectivity estimation
Recommendations
Foundations of SPARQL query optimization
ICDT '10: Proceedings of the 13th International Conference on Database TheoryWe study fundamental aspects related to the efficient processing of the SPARQL query language for RDF, proposed by the W3C to encode machine-readable information in the Semantic Web. Our key contributions are (i) a complete complexity analysis for all ...
Selectivity estimation for SPARQL graph pattern
WWW '10: Proceedings of the 19th international conference on World wide webThis paper focuses on selectivity estimation for SPARQL graph patterns, which is crucial to RDF query optimization. The previous work takes the join uniformity assumption, which would lead to high inaccurate estimation in the cases where properties in ...
Selectivity Estimation of Correlated Properties in RDF Data for SPARQL Query Optimization
SKG '09: Proceedings of the 2009 Fifth International Conference on Semantics, Knowledge and GridNowadays mainstream RDF Repository Systems are based on RDBMS. The SPARQL query engine translates a SPARQL query into a SQL one, and then the RDBMS executes the SQL query. However the RDBMS optimizers, which usually assume that columns are statistically ...
Comments